
Ad Blockers: Global Prevalence and Impact

Matthew Malloy+, Mark McNamara+, Aaron Cahn+, Paul Barford+†
+comScore, Inc.    †University of Wisconsin - Madison
{mmalloy,mmcnamara,acahn,pbarford}@comscore.com

ABSTRACT

Ad blockers are a formidable threat to the vitality of the online advertising ecosystem. Understanding their prevalence and impact is challenging due to the massive scale and diversity of the ecosystem. In this paper, we utilize unique data gathering assets to assess the prevalence and impact of ad blockers from an Internet-wide perspective. Our study is based on (i) a 2 million person world-wide user panel that provides ground truth for ad blocker installations and (ii) telemetry from a large number of publisher web pages and ads served to publishers. We describe a novel method for assessing the prevalence of ad blocker installations that is based on Mixture Proportion Estimation. We apply this method to nearly 2 trillion web transactions collected over the period of 1 month (February 2016), to derive ad blocker prevalence estimates for desktop systems in diverse geographic areas and for diverse demographic groups. Next, using deployment estimates we consider the impact of ad blockers on users and on publisher sites. Specifically, we report on the reduction of ads shown to users with ad blockers installed and show that even though a user may have an ad blocker installed, they are still exposed to a significant number of ads. We also characterize the impact of ad blockers across different categories of publisher sites including those that may be participating in whitelisting [9].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
IMC 2016, November 14-16, 2016, Santa Monica, CA, USA
© 2016 ACM. ISBN 978-1-4503-4526-2/16/11...$15.00
DOI: http://dx.doi.org/10.1145/2987443.2987460

1. INTRODUCTION

An ad blocker, as the name suggests, offers the capability to prevent ads from being delivered to a user's browser. The stated intent of entities that have developed ad blockers is to enable users to surf the web without annoying ads. While the definition of "annoying" is somewhat unclear, what is clear is that these capabilities pose a significant threat to the digital media ecosystem, which has delivered a wide range of transformative services that have been funded by online ads.

Ad blockers are typically implemented as plugins or browser extensions that, when installed, attempt to intercept and eliminate outgoing ad requests from a base web page¹. They use a variety of mechanisms to identify ad requests. One of the most common and effective mechanisms is to compare the URLs in the embedded requests to one or more blacklists of URLs of ad servers and advertising platforms. If there is a match, the blocker will prevent the request from being transmitted. While a number of the most popular ad blockers are open source and free to users, authors of these systems are now monetizing their efforts by offering to whitelist certain advertisers and publishers [9].

¹ While the focus of this paper is the web, VPN-based ad blockers are also available for mobile apps.

While ad blockers have been available for over a decade, they have been receiving significant attention in the popular media over the past year. Given this attention, we seek answers to several simple questions: What is the prevalence of ad blocker installs in the internet? What is the behavior of ad blockers when installed? What is the impact on publishers? Answers to these questions will help to clarify the broader conversation about ad blockers and inform the digital media ecosystem about how it can evolve.

There are two significant challenges in addressing questions that focus on internet-wide population estimation. First, it is difficult to assemble data sets on browser configurations that enable such estimates. Second, since it is impossible to have ground truth for all browser configurations deployed on all user systems, estimates based on a smaller population of users are required. Care must be taken in any such estimate to remove bias from the sample population.

In this paper we report results of our study of the prevalence and impact of ad blockers in the internet. Our analysis is based on unique data assets that enable us to assess ad blocker deployment and impact in a comprehensive fashion that goes well beyond standard reports on blocker downloads (e.g., Ad Block Plus [5]). The data assets include census information from nearly 2 trillion web transactions collected over a period of a month. This is complemented by data collected over the same period of time from a 2M person user panel (distributed across the internet) that exposes all details of browser configurations. This panel data, with ground truth information about ad blocker installs, is the starting point for our study.

We posit that users who opt in to participate in a world-wide user panel that is specifically designed to track web browsing and ad consumption are less likely to install and enable an ad blocker. We develop a novel technique based on Mixture Proportion Estimation (MPE) to quantify this effect. In particular, MPE enables the proportions of subpopulations in a mixture to be estimated from samples. Our technique is based on the assumption that users with ad blockers will see fewer ads than those without ad blockers. We begin by separating panelists into two groups (those with and those without ad blockers) and then quantify the ad display rates for those groups. The ad display rates allow us to infer the prevalence of ad blocking in the census data. Theoretical guarantees in the form of confidence ranges and statistical significance levels are provided. The MPE technique employed in this paper enables, to the best of our knowledge, the largest scale assessment of ad blocker deployment to date.

2. DATA

In this section we provide an overview of the two sources of data used in this study.

Panel. The comScore panel consists of 2 million users worldwide who voluntarily install monitoring software in exchange for various benefits, including cloud storage, anti-virus software, tree planting, and cash prizes². The panel provides measurements on panelists' web browsing behavior and internet use. When a panelist registers, they voluntarily provide their geographic location and demographic information including age, sex, household income, etc.

² The panel is 100% opt-in with thorough disclosure on data privacy. The privacy policy can be found at http://www.comscore.com/About-comScore/Privacy-Policy.

The panel monitoring software is also able to observe installations of software on a panelist's computer. This enables enumeration of web browser configurations, including whether or not ad block software packages are installed. We do this by using search queries to build a list of popular ad blockers (e.g., Ad Block Plus, Ad Block, etc.) for three major browsers (Internet Explorer, Google Chrome, and Mozilla Firefox) and then search for these names in the configuration data. The current lists include 10 ad blockers for Internet Explorer, 20 ad blockers for Google Chrome, and 15 ad blockers for Mozilla Firefox. While we make no claims that each list is complete, we argue that each includes most if not all of the widely reported blockers (which we quantify in Section 4). We refer to the percentage of panel users that have an ad blocker installed as the panel percent ad block.

As with any widely recruited panel, bias can be introduced when some populations find the incentivized recruitment more attractive than others. This is true of the comScore panel and the populations of internet users that do and do not use ad block software. Our intuition is that an individual who voluntarily installs panel software on their machine is likely to be less concerned with privacy, a trait that could correlate with ad block installation. As we show, the panel is indeed biased away from ad block usage. One of the key aspects of our work is correcting this bias using the broader census via MPE. For the purposes of estimating ad block usage in the general internet population, the panel provides a data set consisting of tuples of (i) a comScore browser cookie and (ii) a binary label indicating the presence or absence of ad blocking software. Details of the MPE approach are provided in Section 3.

Census. The comScore census network is one of the most widely deployed internet census networks in the world. The census network collects information daily on over 20 billion page views across half a million top level domains. In addition, the census network collects data on over 2 billion ad deliveries daily. This data is collected via JavaScript tags deployed on publisher pages and JavaScript tags deployed with advertisements. In both settings, a client machine executes the tag locally and reports information directly to our data warehouse. The information includes a cookie identifier, a timestamp, and the type of tag (e.g., page or ad).

Estimation of the prevalence of ad block usage across the census network relies on counts of page and ad tags associated with individual cookies over the course of a month. In particular, for each cookie present on the census network, the count of page views and the count of ad deliveries are tallied over the reporting period. The data gathered from the census network are tuples consisting of (i) a comScore browser cookie, (ii) the count of tagged ads delivered to that browser cookie, and (iii) the count of tagged page views delivered to that browser cookie. The data set under consideration is reduced to ensure a longitudinal view of each cookie; specifically, a cookie is excluded from the study if (i) it has fewer than 200 pageviews or (ii) it has not existed on the census network for a minimum of 30 days.

Of the remaining cookies, a subset corresponds to comScore panelists. This subset of cookies can be labeled as associated or not associated with ad block software by comparing against the panel data set described above. Ultimately this defines three disjoint populations of cookies: (i) the set of labeled cookies associated with ad block software, denoted S_block, (ii) the set of labeled cookies known to not be associated with ad block software, denoted S_ads, and (iii) the set of unlabeled cookies, denoted S.

Ad Ratio Statistic. The ad ratio statistic, defined as the number of ads delivered divided by the number of page views, is computed on a cookie by cookie basis:

\[ \text{ad ratio} = \frac{\text{count of ad deliveries}}{\text{count of pageviews}} \qquad (1) \]

This statistic, in aggregate, is closely tied to ad block usage since it acts as an estimate of the number of ads delivered to a user per unit of internet browsing. However, this statistic alone is not sufficient to infer if a user has or does not have an ad blocker installed. First, in some cases users may browse pages that do not deliver ads, and will have an ad ratio statistic equal to zero regardless of whether or not they have ad block installed. Second, it is often the case that users with an ad
blocker installed, either through disabling of the ad blocker or whitelisting of ads [9], are shown some number of ads and possibly many ads.

Ultimately the inherent restrictions of the data on hand give rise to two challenges: (i) the panel is biased away from ad block usage and (ii) census data is insufficient to reliably classify individual cookies as associated with ad blocking software. We address both of these challenges via the MPE approach described below.

Cookies and Users. There are a number of important details and nuances pertaining to the precise definition of an ad block user, and the relationship between users and cookies. While the panel allows association of a user with ad blocking software, it does not imply that the user actually employed ad blocking software for the entire reporting period. Instead, it simply indicates the presence of ad blocking software at some point during the reporting period. A user may install ad block software and immediately disable it entirely or, more commonly, install ad block software and disable it on a subset of sites; in both cases the user would still be considered an ad block user.

Definition - Ad block user: a person with one or more ad blocking programs installed on their primary computer at some point during the reporting period.

Second, when studying ad block prevalence in the census network, ad blocking is associated directly with browser cookies, not users. In this study, there is nearly a one to one correspondence between a user and a cookie. Using the subset of cookies associated with the panel, the estimated number of cookies per person was 1.03, sufficiently close to 1 that we do not distinguish between cookies and users.

3. MIXTURE PROPORTION ESTIMATION

Mixture proportion estimation (MPE) [8] is a technique for finding the proportions of classes in unlabeled data sets. While there are many variations, the basic problem setup can be captured as follows. Let 1, ..., k denote the classes, and let P_1, ..., P_k be known, estimated, or otherwise restricted class conditional probability distributions over a feature space. Given unlabeled data with a probability distribution P, and the estimates of the class conditional distributions, find the relative proportion of each class in the unlabeled data. In short, given P_1, ..., P_k, and P, find proportions π_1, ..., π_k, such that

\[ P = \sum_{i=1}^{k} \pi_i P_i. \qquad (2) \]

MPE is a powerful approach as it circumvents the need for data classification. The proportions of the classes are inferred from the data in aggregate, permitting success even when classification error rates are prohibitively high. In contrast, the obvious and prevailing approach to this problem is to first classify each data point, and then find the proportions of each class directly from the inferred classification. This approach is inherently limited by the error rates of the classifier; in the noisy, feature limited setting studied here, where classification would be based on the ad ratio statistic alone, this approach can fail entirely.

In the context of estimation of ad blocking deployment, the MPE approach relies on the distribution of the ad ratio statistic associated with the three populations of cookies: S_ads, S_block and S, corresponding to labeled cookies without ad blocking software, labeled cookies with ad blocking software, and unlabeled cookies. These populations define three histograms over the ad ratio statistic:
1. P̂_ads ∈ ℝ^m: normalized histogram over the ad ratio statistic for cookies not associated with one or more ad blockers,
2. P̂_block ∈ ℝ^m: normalized histogram over the ad ratio statistic for cookies known to have one or more ad blockers installed,
3. P̂ ∈ ℝ^m: normalized histogram of unlabeled comScore cookies from the comScore census network.

Histograms are generated so that each of the m bins contains approximately the same number of cookies for P̂, i.e., the boundaries of the bins are such that

\[ \hat{P} \approx \left( \tfrac{1}{m}, \ldots, \tfrac{1}{m} \right). \]

With the three histograms as input, a single variable optimization is run to find the mixture proportion such that the labeled histograms best align with the unlabeled histogram according to (2). The procedure is detailed in Algorithm 1.

Algorithm 1: MPE for Ad Block Prevalence
1: Input: ad ratio statistics for cookies in S_ads, S_block, S.
2: Compute bin edges a_1, ..., a_{m-1} such that the histogram of the ad ratio for S satisfies
   \[ \hat{P} \approx \left( \tfrac{1}{m}, \ldots, \tfrac{1}{m} \right) \]
3: Generate histograms P̂_block, P̂_ads, and P̂ by binning data according to [0, a_1), [a_1, a_2), ..., [a_{m-1}, ∞)
4: Solve optimization:
   \[ \pi^{*} = \arg\min_{\pi \in [0,1]} f\left( \hat{P},\; \pi \hat{P}_{\mathrm{block}} + (1 - \pi) \hat{P}_{\mathrm{ads}} \right) \]
5: Output: estimate of proportion of users with ad block software π* and significance level (p-value) derived from χ² test

The algorithm returns the estimate of the proportion of ad block users, denoted π*. We refer to this value as the MPE percent ad block. As equality in (2) is rarely if ever satisfied for any mixture proportion, the mixture proportion that results in the minimum objective function, f(·), between the histograms is returned by the optimization.

We employ three objective functions: the canonical L1 and L2 norms and the χ² statistic, given by:

\[ f(P, Q) = n \sum_{i=1}^{m} \frac{(Q(i) - P(i))^{2}}{P(i)} \qquad (3) \]

where n is the number of samples in histogram Q.

Validation via Significance Testing. The underlying hypothesis of MPE is that unlabeled data are well represented by a combination of the labeled datasets according to (2). After the optimization is run, this hypothesis can be confirmed or rejected by asking, in the context of goodness of fit (GoF) testing, how well does the best mixture distribution, π* P̂_block + (1 − π*) P̂_ads, match the data, P̂? If the results of the GoF test indicate that the histograms are not well matched, the original hypothesis is rejected, and the estimated proportions are invalid.

The canonical GoF test for categorical data is Pearson's Chi-squared test [1]. In general, the Chi-squared test takes the χ² statistic and the number of samples associated with the histogram under test as input, and outputs a p-value. Note that in our setting, a high p-value is good, as it indicates the data is well represented by the mixture distribution. As both histograms are empirical, we consider the larger dataset, P̂, to represent the theoretical frequencies, which is a standard approach in GoF testing. When the χ² statistic is used as the objective, the GoF test is built into the optimization, and the MPE approach is equivalent to finding the mixture proportions that result in the maximum p-value (i.e., the least statistically significant outcome, the outcome that best matches the data). We use Chi-squared tests to confirm the validity of the MPE approach on the various datasets.

4. RESULTS

This section presents estimates of the prevalence and impact of ad blockers in the internet. More precisely, our MPE approach was used to generate ad block percentages for key geographies as shown in Table 1. From these percentages, the MPE projection factor for each of the key geographies was determined by dividing the MPE percent ad block, with the χ² objective, by the panel percent ad block. The MPE approach can be applied to any arbitrary "breakout" (subset of the overall population) by multiplying the projection factors by panel results for a target breakout. Due to space considerations, we limit our analysis to several key breakouts including (i) geographic, (ii) demographic, and (iii) publisher, which highlights the impact of ad blockers in terms of potential revenue loss.

To motivate and validate our MPE-based approach for this problem we provide a visual representation of the underlying distributions P̂, P̂_ads, and P̂_block for the US in Figure 1. Alongside these three histograms, in the bottom right panel, we provide the histogram generated by using the mixture proportion α (denoted π* in Algorithm 1) that minimized the objective function in our MPE algorithm. Visually, the success of the method is dictated by how closely the histogram generated by the mixture of P̂_ads and P̂_block matches the histogram of P̂. Figure 1 shows that the two histograms match quite closely.

[Figure 1: four histogram panels over the ad ratio statistic (horizontal axis 0.0 to 2.0): P̂_block, P̂_ads, P̂, and αP̂_block + (1 − α)P̂_ads.]
Figure 1: The underlying distribution of the ad ratio statistic associated with the three populations of cookies from Section 3: P̂_block, P̂_ads, and P̂. The bottom right histogram is the mixture combination of P̂_ads and P̂_block utilizing the MPE approach. Visually, the success of the method is dictated by how closely the histogram generated by the mixture of P̂_ads and P̂_block matches the histogram of P̂.

4.1 Geographic Breakout

Our geographic analysis of ad blocker prevalence considers the US, the UK, Germany, France, and Canada. These countries were selected because they are all large digital advertising markets. Table 1 shows results of the MPE approach using L1, L2, and the χ² statistic as the objective functions. Ad block penetration in the US is 18% and varies between 16% and 37% for other countries.

Within the US, we consider ad blocker installations on a state by state basis. Figure 2 quantifies the ad block penetration rates in a heat map. We find that ad blocker penetration is greatest in Vermont (23.6%) and lowest in Mississippi (9.9%).

[Figure 2: heat map of US states; color scale 0.105 to 0.225.]
Figure 2: A heat map of ad blocker penetration on a state by state basis in the US. Vermont has the highest ad blocker penetration at 23.6% and Mississippi has the lowest at 9.9%.

4.2 Demographic Breakout

Figure 3 provides ad blocker penetration rates for key demographic categories in Germany, the UK, and the US. Ad blocker penetration is most prevalent among males 18-34. This finding is consistent across all geographic areas with Germany at 49%, the US at 29%, and the UK at 29%. The 18-34 age group is also consistently (across key geos) the most prevalent ad blocker group among females with Germany at 43%, the UK at 22%, and the US at 20%.
Feb-16
Geo       L1    L2    χ²    p      95% Confidence    n_block   n_ad     n_census
US        18%   18%   17%   0.10   15.7% - 18.6%     6,788     52,368   49,406,827
UK        16%   16%   17%   0.88   11.5% - 23.5%     2,200     11,952    8,660,037
Germany   32%   32%   37%   0.56   28.4% - 46.6%     1,114      2,142    3,174,325
France    29%   29%   32%   0.89   22.5% - 42.5%     1,133      3,016    3,949,981
Canada    22%   22%   24%   0.52   18.5% - 30.5%     1,666      6,033    5,376,049

Table 1: The percentage of users with an ad blocker installed (the MPE percent ad block) in key geographies for the month of February. Results from using the L1, L2, and χ² statistical distances as the objective function are shown, together with the p-value (p). A high p-value indicates success of MPE, as it implies the resulting mixture distribution is not significantly different from the unlabeled data. The 95% confidence values indicate the range for which the corresponding mixture distribution has a p-value greater than 0.05. The underlying size n of each data set used in generating P̂_ads, P̂_block, and P̂ (n_ad, n_block, and n_census respectively) is also provided.
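
As an illustration of how estimates like those in Table 1 could be produced, the following Python sketch walks through the steps of Algorithm 1 with the χ² objective of (3). It is a minimal sketch under stated assumptions, not the implementation used in the study: the function names, the default of m = 20 bins, the choice of n as the number of labeled cookies, the use of m − 1 degrees of freedom, and the synthetic gamma-distributed ad ratios are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2


def normalized_hist(ratios, edges):
    """Normalized histogram over [0, a_1), [a_1, a_2), ..., [a_{m-1}, inf)."""
    bins = np.searchsorted(edges, ratios, side="right")
    counts = np.bincount(bins, minlength=len(edges) + 1)
    return counts / counts.sum()


def chi2_objective(p, q, n):
    """Equation (3): n * sum_i (Q(i) - P(i))^2 / P(i); empty reference bins are skipped."""
    keep = p > 0
    return n * np.sum((q[keep] - p[keep]) ** 2 / p[keep])


def mpe_ad_block(ads, block, unlabeled, m=20, alpha=0.05):
    """Sketch of Algorithm 1. Returns (pi_star, p_value, (lo, hi) or None)."""
    # Step 2: interior edges a_1..a_{m-1} splitting the unlabeled data into ~equal bins.
    edges = np.quantile(unlabeled, np.arange(1, m) / m)
    # Step 3: bin all three populations with the same edges.
    p_ads, p_block, p = (normalized_hist(x, edges) for x in (ads, block, unlabeled))
    n = len(ads) + len(block)  # assumption: n counts the labeled cookies behind the mixture
    df = m - 1                 # assumption: degrees of freedom used for the chi-squared test

    def objective(pi):
        return chi2_objective(p, pi * p_block + (1 - pi) * p_ads, n)

    # Step 4: single-variable optimization over pi in [0, 1].
    result = minimize_scalar(objective, bounds=(0.0, 1.0), method="bounded")
    # Step 5: p-value of the best mixture under the chi-squared distribution.
    p_value = chi2.sf(result.fun, df)
    # Range of pi whose mixture is not rejected at level alpha (cf. Table 1 caption).
    grid = np.linspace(0.0, 1.0, 1001)
    accepted = [pi for pi in grid if chi2.sf(objective(pi), df) > alpha]
    conf = (min(accepted), max(accepted)) if accepted else None
    return result.x, p_value, conf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic ad ratios (illustrative only): blockers see fewer ads per page view.
    unlabeled = np.concatenate([rng.gamma(1.5, 0.20, 40_000),    # ad block cookies
                                rng.gamma(4.0, 0.25, 160_000)])  # true share is 0.20
    labeled_ads = rng.gamma(4.0, 0.25, 5_000)
    labeled_block = rng.gamma(1.5, 0.20, 2_000)
    pi_star, p_value, conf = mpe_ad_block(labeled_ads, labeled_block, unlabeled)
    print(f"MPE percent ad block: {pi_star:.3f}, p-value: {p_value:.2f}, range: {conf}")
```

Because the mixture is linear in π, the χ² objective is quadratic in π, so a bounded scalar search is sufficient. The confidence range is obtained by scanning a grid of π values and keeping those whose p-value exceeds 0.05, mirroring the definition given in the Table 1 caption.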

[Figure 4: heat map panels for Internet Explorer, Chrome, and Firefox; rows US, IT, DE, FR, UK; columns Adblock Plus, Adblock Pro, Adblock, Other; color scale 0.0 to 0.8.]
Figure 4: A heat map showing the market share of the top three ad block offerings across three major browsers (Internet Explorer, Google Chrome, and Mozilla Firefox). Results are further stratified across key geographies.

[Figure 3: bar chart of % Adblock (0 to 50) for Germany, United Kingdom, and United States; bars for 18-34, 35-50, and 50+ males and females.]
Figure 3: Ad block penetration rates among key user demographic categories for Germany, the UK, and the US.
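
Breakout estimates such as the demographic rates in Figure 3 rely on the projection-factor step described in Section 4: the MPE percent ad block for a geography is divided by the panel percent ad block, and the resulting factor scales the panel's rate for the breakout of interest. The short Python sketch below shows only that arithmetic. The function names and all input numbers are hypothetical, chosen for illustration, and are not results from the study.

```python
def projection_factor(mpe_percent, panel_percent):
    """MPE percent ad block divided by panel percent ad block for one geography."""
    if panel_percent <= 0:
        raise ValueError("panel percent ad block must be positive")
    return mpe_percent / panel_percent


def project_breakout(panel_breakout_percent, factor):
    """Scale a panel-measured breakout rate by the geography's projection factor."""
    return panel_breakout_percent * factor


if __name__ == "__main__":
    # Hypothetical inputs: 17% MPE estimate, 11.5% of panelists with an ad blocker,
    # and 15% of panelists in some demographic breakout with an ad blocker.
    factor = projection_factor(mpe_percent=17.0, panel_percent=11.5)
    estimate = project_breakout(panel_breakout_percent=15.0, factor=factor)
    print(f"projection factor: {factor:.2f}, projected breakout: {estimate:.1f}%")
```

A single factor per geography keeps the correction simple, at the cost of assuming the panel's bias away from ad block usage is the same across all breakouts within that geography.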

4.3 Ad Block Market Share Analysis

As indicated in Section 2, there are a number of different ad blockers available and in use today. A number of these report their total installations. Thus, it is of interest to investigate their relative market share, and our data and analytic method allow us to estimate the prevalence of specific browser/ad blocker deployments. To quantify this, we focused on three major browsers (Internet Explorer, Google Chrome, and Mozilla Firefox). Figure 4 is a heat map of the relative market share of the top three ad block offerings (as well as a catch-all Other category) across three major browsers. It is clear from Figure 4 that the market is dominated by Adblock Plus for both Firefox (95.2% averaged across geos) and Internet Explorer (93.8% averaged across geos). For Google Chrome, the market share is distributed fairly evenly between Adblock Plus (49.7% averaged across geos) and Adblock (56.9% averaged across geos). Note that the values for a particular geo/browser pair will not sum to 1, as a single panelist may have more than one ad block offering installed.

4.4 Publisher Breakout

The analysis in subsections 4.1 and 4.2 on ad blocker penetration across key geographies and key demographics highlights the difference in ad block use among different population segments. For instance, ad blocker penetration skews toward young males. These users inherently carry bias in the sites they are likely to visit. Thus, the percentage of users with an ad blocker installed can be markedly different from site to site. To quantify this behavior and its associated impact, we calculate the percentage of users with an ad blocker installed across (i) ten publishers and (ii) the major publisher segments. The ten publishers are a random selection of those with large audiences that illustrate the scope of impact of ad blockers. Publisher names have been removed to preserve anonymity.

Table 2 provides the results for ten publishers. The percentage of users with an ad blocker installed ranges from 25.27% for Publisher H to 17.95% for Publisher B. Note that Publishers A, B, D, E, and F appear in some form on Adblock Plus's whitelist while Publishers C, G, H, and I do not.

Feb-16
Publisher     % Ad block   % Ad requests   AV_ad   AV_block   Ad blocker      Potential
              users        blocked                            exposure rate   revenue loss
Publisher A   19.52%       18.99%          0.23    0.08       0.34            $1,550,138
Publisher B   17.95%        5.17%          0.95    0.57       0.60              $508,534
Publisher C   21.09%        5.82%          1.95    1.49       0.76            $3,904,207
Publisher D   18.47%        7.76%          1.33    0.72       0.54            $1,575,406
Publisher E   21.96%       14.63%          0.40    0.17       0.42              $183,531
Publisher F   18.82%        8.06%          0.69    0.31       0.44              $190,625
Publisher G   21.43%       16.21%          1.81    0.55       0.30              $195,651
Publisher H   25.27%       16.07%          2.10    0.76       0.36              $170,779
Publisher I   23.00%       14.42%          0.70    0.22       0.31              $121,581

Table 2: Publisher breakout. Publisher names have been anonymized. The table shows the percent of ad block users and the percentage of ad requests blocked. AV_block and AV_ad are the number of ads shown per page view to ad block users and non-users, respectively. The ad blocker exposure rate is the number of ads delivered to an ad block user per ad delivered to a non-blocker. Lastly, estimated potential revenue lost due to ad block usage is shown.

To further elucidate user ad block install behavior, we ran the same analysis across publisher segments. The results of this analysis are shown in Table 3. The percentage of users with an ad blocker installed is highest on XXX Adult (24.74%) and lowest on Portals (17.77%). Notably, ad blocker install rates among users visiting the Sports, Technology, Entertainment, and News/Information segments are quite similar. These segments represent sites where users consume various forms of media and are therefore likely to be confronted with advertisements that disrupt the consumption of this media.

Feb-16
Publisher segment    % Ad Block users
Automotive           18.82%
Entertainment        20.21%
Games                22.30%
Lifestyles           20.56%
News/Information     20.21%
Portals              17.77%
Search/Navigation    18.12%
Sports               20.91%
Technology           20.04%
XXX Adult            24.74%

Table 3: The percent of users with an ad blocker installed by publisher segment.

4.5 Ad Blocker Impact Analysis

It is clear from the previous subsection that a significant proportion of users employ ad blockers. However, only considering users ignores two important factors: (i) ad block users could still be exposed to a significant number of ads due to whitelisting and disabling and (ii) different classes of users have different browsing behaviors.

To capture these effects, we considered the ad blocker exposure rate, which is interpreted as the number of ads shown to an ad block user per ad shown to a non-block user. Table 4 and Table 2 show the ad blocker exposure rate, computed as AV_block / AV_ad, where AV_block and AV_ad are the number of ads per page view shown to ad block users and non-users, respectively.

Feb-16
Country    Ad Blocker Exposure Rate
Canada     0.46
France     0.35
Germany    0.51
Italy      0.45
UK         0.49
US         0.60

Table 4: Ad blocker exposure rate. The ad blocker exposure rate is the number of ads delivered to an ad block user per ad delivered to a non-ad block user.

Additionally, we calculated the percentage of ad requests blocked and the potential revenue lost for ten publishers. The revenue lost assumes a modest $1 CPM (cost per thousand ads shown, selected arbitrarily). The calculation of the ad requests blocked is outlined here:
1. compute X, the percentage of ad block users, as panel percent ad block on publisher A multiplied by the MPE projection factor for the geo of interest
2. multiply X by the total number of users on publisher A, which gives an estimate of the number of ad block users on publisher A, denoted Y
3. from panel data, compute the average page views per ad block user for publisher A, and multiply by Y to get Z, the number of page views from ad block users on publisher A
4. from panel data, compute the average number of ads shown per page view to ad block users (AV_block) and the average number of ads shown per page view to non-block users (AV_ad)
5. compute blocked impressions as Z × (AV_ad − AV_block)

The potential revenue lost R_lost using a $1 CPM (cost per thousand ads shown, selected arbitrarily), that is, the blocked impressions divided by 1,000 and multiplied by the CPM, is found in Table 2 for the ten publishers. The intermediary values for AV_ad and AV_block are also found in Table 2. Even with a modest $1 CPM, it is clear that ad blockers have a significant impact on revenue lost for publishers (e.g., $3.9M/mo. for Publisher C to $120K/mo. for Publisher I).

5. RELATED WORK

A study of general aspects of the ad serving ecosystem is reported in [2]. That paper highlights the impact of user-targeting in online advertising as well as the broad range of ads that are delivered to different types of publishers. Nath presents a similar study, which is focused on advertising in the mobile app space [4]. More specific to ad blockers, Walls et al. [9] provide a comprehensive analysis of Adblock Plus's Acceptable Ads (i.e., whitelisting) program. This study directly informs our work with respect to the issue of whitelisting by ad blockers. In one of the few academic studies on the topic, Pujol et al. [7] use passive measurements on a residential broadband network to infer ad block users and classify the impact of ad blockers on HTTP traffic. Our work is based on different datasets and employs different methods to infer ad blocker use. Recently, Post and Sekharan investigated the capabilities of three popular ad blockers via source code analysis [6]. Finally, there continue to be many reports in the popular press on ad blocker prevalence. Most of these are based on data provided (e.g., [3]) by entities that work with publishers to deploy a page-based
detector. Our work provides a complementary perspective based on instrumentation deployed widely throughout the internet.

6. SUMMARY AND CONCLUSIONS
In this paper we used two unique data sets to estimate the prevalence and impact of ad blockers in the internet. We used mixture proportion estimation to remove bias from a worldwide user panel that provides ground truth on ad blocker installs.

Our results show that in the US, 18% of users have ad blockers installed; up to 37% of users have ad blockers installed in other key geographies. Males 18-34 are the most likely users of ad blockers in all geographies studied, with 49% using ad blocking software in Germany. Users with the highest income levels are the most likely users of ad blockers in all geographies studied. Results show that Adblock Plus is the most widely used ad blocker. Ad blockers are most prevalent on Chrome, followed by Firefox. Users with ad blockers see roughly half as many ads as users without ad blockers see. Estimated monthly revenue lost due to ad blockers on 10 large publisher sites ranges from $120K to $3.9M (assumes $1 CPM).

While the results in this paper help further the understanding of ad block prevalence and impact, we would be remiss not to mention two inherent limitations. First, internet-wide population or behavior estimation can never be shown to be completely accurate due to scale, complexity and dynamics. However, we argue that our methodology, which includes confidence ranges, enables results to be more effectively judged and interpreted. Second, our reliance on proprietary datasets by definition limits the repeatability of our experiments. In future work, we intend to address these limitations by continuing to refine our analytic capabilities, and to consider how ad blocker prevalence might be measured in a way that does not rely on proprietary data.

7. REFERENCES
[1] A. Agresti and M. Kateri. Categorical Data Analysis. Springer, 2011.
[2] P. Barford, I. Canadi, D. Krushevskaja, Q. Ma, and S. Muthukrishnan. Adscape: Harvesting and Analyzing Online Display Ads. In Proceedings of the 23rd World Wide Web Conference, Seoul, Korea, April 2014.
[3] S. Blanchfield. PageFair, 2016. https://pagefair.com.
[4] S. Nath. MAdScope: Characterizing Mobile In-App Targeted Ads. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services, Florence, Italy, May 2015.
[5] W. Palant. Adblock Plus, 2016. http://adblockplus.org.
[6] E. Post and C. Sekharan. Comparative Study and Evaluation of Online Ad-Blockers. In Proceedings of the 2nd International Conference on Information Science and Security, Seoul, Korea, December 2015.
[7] E. Pujol, O. Hohlfeld, and A. Feldmann. Annoyed Users: Ads and Ad-Block Usage in the Wild. In Proceedings of the ACM Internet Measurement Conference, Tokyo, Japan, October 2015.
[8] C. Scott. A Rate of Convergence for Mixture Proportion Estimation, with Application to Learning from Noisy Labels. In Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, San Diego, CA, May 2015.
[9] R. Walls, E. Kilmer, N. Lageman, and P. McDaniel. Measuring the Impact and Perception of Acceptable Advertisements. In Proceedings of the ACM Internet Measurement Conference, Tokyo, Japan, October 2015.
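
As an illustration of the ad-request calculation outlined in Section 4 (steps 1 to 4), the arithmetic can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `PublisherInputs` and `estimate_impact` are hypothetical, the example numbers are made up, and the last two quantities (ads blocked taken as Z times the gap between AVad and AVblock, and revenue lost taken as ads blocked per thousand times the CPM) are an assumption about how the page-view estimate might be turned into blocked ads and revenue.

```python
from dataclasses import dataclass


@dataclass
class PublisherInputs:
    """Hypothetical container for the per-publisher quantities named in the text."""
    panel_adblock_share: float          # fraction of panelists on the publisher using an ad blocker
    mpe_projection_factor: float        # MPE projection factor for the geo of interest
    total_users: float                  # total number of users on the publisher
    page_views_per_adblock_user: float  # average page views per ad block user (panel data)
    av_block: float                     # ads per page view shown to ad block users (AVblock)
    av_ad: float                        # ads per page view shown to non-block users (AVad)


def estimate_impact(p: PublisherInputs, cpm_usd: float = 1.0) -> dict:
    """Follow steps 1 to 4, then apply an assumed blocked-ads and revenue formula."""
    x = p.panel_adblock_share * p.mpe_projection_factor  # step 1: share of ad block users
    y = x * p.total_users                                # step 2: number of ad block users
    z = p.page_views_per_adblock_user * y                # step 3: page views from ad block users
    exposure_rate = p.av_block / p.av_ad                 # ad blocker exposure rate, AVblock/AVad

    # ASSUMPTION (not stated in the text): each ad block page view loses
    # (AVad - AVblock) ads relative to a non-block page view.
    ads_blocked = z * (p.av_ad - p.av_block)
    # ASSUMPTION: revenue lost is ads blocked per thousand times the CPM.
    revenue_lost_usd = ads_blocked / 1000.0 * cpm_usd

    return {
        "X": x,
        "Y": y,
        "Z": z,
        "exposure_rate": exposure_rate,
        "ads_blocked": ads_blocked,
        "revenue_lost_usd": revenue_lost_usd,
    }


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    example = PublisherInputs(
        panel_adblock_share=0.2,
        mpe_projection_factor=1.25,
        total_users=1_000_000,
        page_views_per_adblock_user=10,
        av_block=2.0,
        av_ad=4.0,
    )
    for name, value in estimate_impact(example).items():
        print(f"{name}: {value:,.2f}")
```

Keeping X, Y and Z as separate intermediate values mirrors the numbered steps, so each can be checked against panel data before the assumed revenue formula is applied.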
