Fake News Detection in Social Networks via Crowd Signals
Sebastian Tschiatschek, Adish Singla, Manuel Gomez Rodriguez, Arpit Merchant, Andreas Krause
ABSTRACT
Our work considers leveraging crowd signals for detecting fake news and is motivated by tools recently introduced by Facebook that enable users to flag fake news. By aggregating users' flags, our goal is to select a small subset of news every day, send them to an expert (e.g., via a third-party fact-checking organization), and stop the spread of news identified as fake by an expert. The main objective of our work is to minimize the spread of misinformation by stopping the propagation of fake news in the network. It is especially challenging to achieve this objective as it requires detecting fake news with high confidence as quickly as possible. We show that in order to leverage users' flags efficiently, it is crucial to learn about users' flagging accuracy. We develop a novel algorithm, Detective, that performs Bayesian inference for detecting fake news and jointly learns about users' flagging accuracy over time. Our algorithm employs posterior sampling to actively trade off exploitation (selecting news that maximize the objective value at a given epoch) and exploration (selecting news that maximize the value of information towards learning about users' flagging accuracy). We demonstrate the effectiveness of our approach via extensive experiments and show the power of leveraging community signals for fake news detection.

ACM Reference Format:
Sebastian Tschiatschek, Adish Singla, Manuel Gomez Rodriguez, Arpit Merchant, and Andreas Krause. 2018. Fake News Detection in Social Networks via Crowd Signals. In WWW '18 Companion: The 2018 Web Conference Companion, April 23–27, 2018, Lyon, France. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3184558.3188722

* Work performed while at ETH Zurich.

This paper is published under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution.
WWW '18 Companion, April 23–27, 2018, Lyon, France
© 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
ACM ISBN 978-1-4503-5640-4/18/04. https://doi.org/10.1145/3184558.3188722

1 INTRODUCTION
Fake news (a.k.a. hoaxes, rumors, etc.) and the spread of misinformation have dominated the news cycle since the US presidential election (2016). Social media sites and online social networks, for example Facebook and Twitter, have faced scrutiny for being unable to curb the spread of fake news. There are various motivations for generating and spreading fake news, for instance, making political gains, harming the reputation of businesses, serving as clickbait to increase advertising revenue, and seeking attention (Snopes compiles a list of the top 50 fake news stories: http://www.snopes.com/50-hottest-urban-legends/). As a concrete example, Starbucks recently fell victim to fake news with a hoax advertisement claiming that the coffee chain would give free coffee to undocumented immigrants (http://uk.businessinsider.com/fake-news-starbucks-free-coffee-to-undocumented-immigrants-2017-8). While Starbucks raced to deny this claim by responding to individual users on social media, the lightning speed at which this hoax spread in online social media highlighted the seriousness of the problem and the critical need to develop new techniques to tackle this challenge. To this end, Facebook has recently announced a series of efforts towards tackling this challenge [10, 11].
Detection via expert's verification. Fake news and misinformation have historically been used as tools for making political or business gains [9]. However, traditional approaches based on verification by human editors and expert journalists do not scale to the volume of news content that is generated in online social networks. In fact, it is this volume, as well as the lightning speed of spread in these networks, that makes this problem challenging and requires us to develop new computational techniques. We note that such computational techniques would typically complement, and not replace, the expert verification process—even if a news is detected as fake, some sort of expert verification is needed before one would actually block it. This has given rise to a number of third-party fact-checking organizations such as Snopes (http://www.snopes.com/) and Factcheck.org (http://factcheck.org/), as well as a code of principles [25] that should be followed by these organizations.
Detection using computational methods. There has been a recent surge of interest in developing computational methods for detecting fake news (cf., [7] for a survey)—we provide a more detailed overview of these methods in the Related Work section. These methods are typically based on building predictive models to classify whether a news is fake or not using a combination of features related to news content, source reliability, and network structure. One of the major challenges in training such predictive models is the limited availability of corpora and the subjectivity of labelling news as fake [27, 33]. Furthermore, it is difficult to design methods based on estimating source reliability
and network structure, as the number of users who act as sources is diverse and gigantic (e.g., over one billion users on Facebook); and the sources of fake news could be normal users who unintentionally share a news story without realizing that the news is fake. A surge of interest in the problem and in overcoming these technical challenges has led to the establishment of a volunteering-based association—FakeNewsChallenge (http://www.fakenewschallenge.org/)—comprising over 100 volunteers and 70 teams, which organizes machine learning competitions related to the problem of detecting fake news.

1.1 Leveraging Users' Flags
Given the limitations of the current state-of-the-art computational methods, an alternate approach is to develop hybrid human-AI methods by engaging users of online social networks and enabling them to report fake news. In fact, Facebook has recently taken steps towards this end by launching a fake news reporting tool in Germany [11], as shown in Figure 1. The idea of this tool is that as news propagates through the network, users can flag the news as fake.

Figure 1: Facebook has launched tools in Germany to report fake news. Image source: [11].

As proposed by Facebook [11], the aggregated users' flags as well as other available signals can be used to identify a set of news which is potentially fake. These news can then be sent to an expert for review via a third-party fact-checking organization. If an expert labels the news as fake, it could be removed from the network or marked as disputed, making it appear lower in news-feed ranking. The contemporary work by Kim et al. [16] explored the idea of detecting fake news via leveraging users' flagging activity by using the framework of marked temporal point processes. We highlight the key differences of their approach from ours in the next section.

1.2 Our Contributions
In this paper, we develop algorithmic tools to effectively utilize the power of the crowd (the flagging activity of users) to detect fake news. Given a set of news, our goal is to select a small subset of k news, send them to an expert for review, and then block the news which are labeled as fake by the expert. We formalize our objective as minimizing the spread of misinformation, i.e., how many users end up seeing a fake news before it is blocked. We design our algorithm Detective, which implements a Bayesian approach for learning about users' accuracies over time as well as for performing inference to find which news are fake with high confidence. In short, our main contributions include:
• We formalize the problem of leveraging users' flagging activity for the detection of fake news. We showcase the need to learn about users' accuracy in order to effectively leverage their flags in a robust way.
• We develop a tractable Bayesian algorithm, Detective, that actively trades off between exploitation (selecting news that directly maximize the objective value) and exploration (selecting news that help towards learning about users' flagging accuracy).
• We perform extensive experiments using a publicly available Facebook dataset to demonstrate the effectiveness of our approach. We plan to make the code publicly available so that other researchers can build upon our techniques for this important and timely problem of detecting fake news.

2 RELATED WORK
Contemporary results. Kim et al. [16] explored the idea of detecting fake news via leveraging users' flagging activity. In particular, they introduce a flexible representation of the above problem using the framework of marked temporal point processes. They develop an algorithm, Curb, to select which news to send for fact-checking via solving a novel stochastic optimal control problem. The key technical differences of the approach by Kim et al. [16] from ours are: (1) we learn about the flagging accuracy of individual users in an online setting; in contrast, they consider all users to be equally reliable and estimate the flagging accuracy of the population of users from historical data; (2) our algorithms are agnostic to the actual propagation dynamics of news in the network; they model the actual propagation dynamics as a continuous-time dynamical system with jumps and arrive at an algorithm by casting the problem as an optimal control problem; and (3) we use discrete epochs with a fixed budget per epoch (i.e., the number of news that can be sent to an expert for reviewing); they use continuous time and consider an overall budget for their algorithm.
Computational methods for detecting fake news. There is a large body of related work on rumor detection and information credibility evaluation (with a more recent focus on fake news detection) that is applicable to the problem of detecting fake news. These methods are typically based on building predictive models to classify whether a news is fake. At a high level, we can categorize these methods as follows: (i) based on features of the news content, using natural language processing techniques [13, 31, 34, 38]; (ii) via learning models of source reliability and trustworthiness [20, 22, 28]; (iii) by analyzing the network structure over which a news propagated [6]; and (iv) based on a combination of the above-mentioned features, i.e., linguistic, source, and network structure [1, 17, 18, 35]. As we pointed out in the Introduction, there are several key challenges in building accurate predictive models for identifying fake news, including limited availability of corpora, subjectivity in ground truth labels, and huge variability in the sources
who generate fake news (often constituting users who do it unintentionally). In short, these methods alone have so far proven to be unsuccessful in tackling the challenge of detecting fake news.
Leveraging crowd signals for web applications. Crowdsourcing has been used in both industrial applications and research studies in the context of different applications related to web security. For instance, [23] and [5] have evaluated the potential of leveraging the wisdom of crowds for assessing phishing websites and web security. Their studies show a high variability among users: (i) the participation rates of users follow a power-law distribution, and (ii) the accuracy of users' reports varies, and users with more experience tend to have higher accuracy. The authors also discuss the potential of voting fraud when using users' reports for security-related applications. Wang et al. [32] performed a crowdsourcing study on Amazon's Mechanical Turk for the task of sybil detection in online social networks. Their studies show that there is a huge variability among crowd users in terms of their reporting accuracies, which needs to be taken into account for building a practical system. Chen et al. [3] and Zheleva et al. [39] present systems similar to ours for the tasks of filtering email spam and SMS spam, respectively. The authors discuss a user reputation system whereby reliable users (based on history) can be weighted more when aggregating the reports. However, their work assumes that users' reputation/reliability is known to the system, whereas the focus of our paper is on learning users' reputation over time. Freeman [12] discusses the limitations of leveraging user feedback for fake account detection in online social networks: data-driven studies using LinkedIn data show that there is only a small number of skilled users (who have good accuracy that persists over time) for detecting fake accounts.
Crowdsourcing with expert validation. On the technical side, our approach can be seen as a semi-supervised crowdsourcing technique where users' answers can be validated via an external expert. Hung et al. [14] and Liu et al. [21] present probabilistic models to select specific news instances to be labeled by experts that would maximize the reduction in uncertainty about users' accuracy. With a similar flavor to ours, Zhao et al. [36] present a Bayesian approach to aggregate information from multiple users, and then jointly infer users' reliability as well as ground truth labels. Similar to our approach, they model users' accuracy via two separate parameters for false positive and false negative rates. However, their approach is studied in an unsupervised setting where no expert validation (ground truth labels) is available.

3 THE MODEL
We provide a high-level specification of our model in Protocol 1. There is an underlying social network denoted as G = (U, E), where U is the set of users in the network. We divide the execution into different epochs denoted as t = 1, 2, ..., T, where each epoch could denote a time window, for instance, one day. Below, we provide details of our model: the process of news generation and spread, users' activity of flagging the news, and selecting news to get the expert's labels.

3.1 News Generation and Spread
We assume that new news, denoted by the set X^t, are generated at the beginning of every epoch t (cf., line 4). (For simplicity of presentation, we consider every news generated in the network to be unique. In real-world settings, the same news might be posted by multiple users because of externalities, and it is easy to extend our model to consider this scenario.) In this paper, we consider a setting where each news has an underlying label (unknown to the algorithm) of being "fake" (f) or "not fake" (f̄). We use the random variable Y^*(x) to denote this unknown label for a news x, and its realization is given by y^*(x) ∈ {f, f̄}. The label y^*(x) can only be acquired if news x is sent to an expert for review, who would then provide the true label. We maintain a set of "active" news A^t (cf., line 5) which consists of all news that have been generated by the end of epoch t but for which the expert's label has not been acquired yet.
Each news x is associated with a source user who seeded this news, denoted as o_x (cf., line 4). We track the spread of news in the set A^t via a function π^t : A^t → 2^U. For a news a ∈ A^t, the function π^t(a) returns the set of users who have seen the news a by the end of epoch t. During epoch t, let u^t(a) ⊆ U \ π^{t-1}(a) be the set of additional users (possibly the empty set) to whom news a ∈ A^t propagates in epoch t; hence π^t(a) = π^{t-1}(a) ∪ u^t(a) (cf., line 9).

3.2 Users' Activity of Flagging the News
In epoch t, when a news a ∈ A^t propagates to a new user u ∈ u^t(a), this user can flag the news to be fake. We denote the set of users who flag news a as fake in epoch t via a set l^t(a) ⊆ u^t(a) (cf., line 10). Furthermore, the function ψ^t(a) returns the complete set of users who have flagged the news a as fake by the end of epoch t. (Note that, as per the specification of Protocol 1, for any news x, the source user o_x does not participate in flagging x.) For any news x and any user u ∈ U, we denote the label user u would assign to x via a random variable Y_u(x). We denote the realization of Y_u(x) as y_u(x) ∈ {f, f̄}, where y_u(x) = f signifies that the user has flagged the news as fake. In this paper, we consider a simple, yet realistic, probabilistic model of a user's flagging activity as discussed below.
User abstaining from flagging activity. Reflecting the behavior of real-world users, user u might abstain from actively reviewing the news content (and, by default, does not flag the news)—we model this as happening with probability γ_u ∈ [0, 1]. Intuitively, we can think of 1 − γ_u as the engagement of user u while participating in this crowdsourcing effort to detect fake news: γ_u = 1 means that the user is not participating at all.
User's accuracy in flagging the news. With probability (1 − γ_u), user u reviews the content of news x and labels the news. We model the accuracy/noise in the user's labels, conditioned on the user reviewing the content, as follows:
• α_u ∈ [0, 1] denotes the probability that user u would not flag the news as fake, given that news x is not fake and the user is reviewing the content.
• β_u ∈ [0, 1] denotes the probability that user u would flag the news as fake, given that news x is fake and the user is reviewing the content.
Protocol 1: High-level specification of our model
1   Input: social network graph G = (U, E); labeling budget per epoch k.
2   Initialize: active news A^0 = {} (i.e., news for which the expert's label has not been acquired yet).
3   foreach t = 1, 2, ..., T do
    /* At the beginning of epoch t */
4       News X^t are generated, with o_x ∈ U as the origin/source of x ∈ X^t.
5       Update the set of active news as A^t = A^{t-1} ∪ X^t. ∀x ∈ X^t, do the following:
6           Initialize users exposed to the news x as π^{t-1}(x) = {}.
7           Initialize users who flagged the news x as ψ^{t-1}(x) = {}.
    /* During the epoch t */
8       News A^t continue to propagate in the network. ∀a ∈ A^t, do the following:
9           News a propagates to more users u^t(a) ⊆ U \ π^{t-1}(a); i.e., π^t(a) = π^{t-1}(a) ∪ u^t(a).
10          News a is flagged as fake by users l^t(a) ⊆ u^t(a); i.e., ψ^t(a) = ψ^{t-1}(a) ∪ l^t(a).
    /* At the end of epoch t */
11      Algorithm Algo selects a subset S^t ⊆ A^t of size up to k to get the expert's labels y^*(s) ∈ {f, f̄} ∀s ∈ S^t.
12      Block the fake news, i.e., ∀s ∈ S^t s.t. y^*(s) = f, remove s from the network.
13      Update the set of active news as A^t = A^t \ S^t.
Note that news s ∈ S^t s.t. y^*(s) = f̄ remain in the network, continue to propagate, and continue to be flagged by users.
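To make the bookkeeping in Protocol 1 concrete, here is a minimal Python sketch of one run of the protocol. It is our own illustration, not code from the paper: the helpers generate_news, propagate, collect_flags, select_for_review (playing the role of Algo), and expert_label are hypothetical placeholders supplied by the caller.

# Minimal sketch of Protocol 1; the helper callables are hypothetical placeholders.
from dataclasses import dataclass, field

@dataclass
class News:
    source: int                                   # o_x: user who seeded the news
    exposed: set = field(default_factory=set)     # pi^t(x): users who have seen x
    flaggers: set = field(default_factory=set)    # psi^t(x): users who flagged x as fake

def run_protocol(T, k, generate_news, propagate, collect_flags,
                 select_for_review, expert_label):
    active = {}        # A^t: news id -> News, expert label not yet acquired
    blocked = set()
    for t in range(1, T + 1):
        # Beginning of epoch t: new news X^t are generated (lines 4-7).
        for news_id, source in generate_news(t):
            active[news_id] = News(source=source)
        # During epoch t: active news propagate and may be flagged (lines 8-10).
        for news_id, news in active.items():
            new_users = propagate(news_id, news.exposed)        # u^t(a)
            news.exposed |= new_users                           # pi^t(a)
            news.flaggers |= collect_flags(news_id, new_users)  # psi^t(a)
        # End of epoch t: Algo picks at most k news for expert review (lines 11-13).
        chosen = select_for_review(active, budget=k)
        for news_id in chosen:
            if expert_label(news_id) == "fake":
                blocked.add(news_id)      # fake news are removed from the network
            del active[news_id]           # reviewed news leave the active set A^t
    return blocked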
User's observed activity. Putting this together, we can quantify the observed flagging activity of user u for any news x with the following matrix defined by the variables (θ_{u,f̄}, θ_{u,f}):

\[
\begin{bmatrix} \theta_{u,\bar{f}} & 1-\theta_{u,f} \\ 1-\theta_{u,\bar{f}} & \theta_{u,f} \end{bmatrix}
= \gamma_u \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}
+ (1-\gamma_u) \begin{bmatrix} \alpha_u & 1-\beta_u \\ 1-\alpha_u & \beta_u \end{bmatrix}
\]

where

θ_{u,f̄} ≡ P(Y_u(x) = f̄ | Y^*(x) = f̄),    1 − θ_{u,f̄} ≡ P(Y_u(x) = f | Y^*(x) = f̄),
θ_{u,f} ≡ P(Y_u(x) = f | Y^*(x) = f),      1 − θ_{u,f} ≡ P(Y_u(x) = f̄ | Y^*(x) = f).

The two parameters (α_u, β_u) allow us to model users of different types that one might encounter in real-world settings. For instance,
• a user with (α_u ≥ 0.5, β_u ≤ 0.5) can be seen as a "news lover" who generally tends to perceive the news as not fake; on the other hand, a user with (α_u ≤ 0.5, β_u ≥ 0.5) can be seen as a "news hater" who generally tends to be skeptical and flags the news (i.e., labels it as fake).
• a user with (α_u = 1, β_u = 1) can be seen as an "expert" who always labels correctly; a user with (α_u = 0, β_u = 0) can be seen as a "spammer" who always labels incorrectly.
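To illustrate this observation model, the following small Python sketch (our own illustration; the function names are not from the paper) maps (γ_u, α_u, β_u) to (θ_{u,f̄}, θ_{u,f}) and samples a user's flag for a news with a given true label.

import random

def theta_params(gamma_u, alpha_u, beta_u):
    """Map (gamma_u, alpha_u, beta_u) to the observed-activity parameters:
    theta_not_fake = P(user does not flag | news not fake),
    theta_fake     = P(user flags          | news fake)."""
    theta_not_fake = gamma_u * 1.0 + (1.0 - gamma_u) * alpha_u
    theta_fake = gamma_u * 0.0 + (1.0 - gamma_u) * beta_u
    return theta_not_fake, theta_fake

def sample_flag(gamma_u, alpha_u, beta_u, is_fake, rng=random):
    """Sample whether user u flags a news whose true label is is_fake."""
    if rng.random() < gamma_u:               # user abstains from reviewing
        return False                          # by default, no flag
    if is_fake:
        return rng.random() < beta_u          # flags a fake news w.p. beta_u
    return rng.random() < (1.0 - alpha_u)     # mistakenly flags a non-fake news

# Example: a fairly engaged "news hater" who rarely misses fake news.
print(theta_params(gamma_u=0.2, alpha_u=0.4, beta_u=0.9))  # -> approximately (0.52, 0.72)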
3.3 Selecting News to Get Expert's Label
At the end of every epoch t, we apply an algorithm Algo—on behalf of the network provider—which selects news S^t ⊆ A^t to send to an expert for reviewing and acquiring the true labels y^*(s) ∀s ∈ S^t (cf., line 11). If a news is labeled as fake by the expert (i.e., y^*(s) = f), this news is then blocked from the network (cf., line 12). At the end of the epoch, the algorithm updates the set of active news as A^t = A^t \ S^t (cf., line 13). We will develop our algorithm in the next section; below we introduce the formal objective of minimizing the spread of misinformation via fake news in the network.

3.4 Objective: Minimizing the Spread of Fake News
Let us begin by quantifying the utility of blocking a news a ∈ A^t at epoch t—it is important to note that, by design, only fake news are being blocked in the network. Recall that |π^t(a)| denotes the number of users who have seen news a by the end of epoch t. We introduce |π^∞(a)| to quantify the number of users who would eventually see the news a if we let it spread in the network. Then, if a news a is fake, we define the utility of blocking news a at epoch t as val^t(a) = |π^∞(a)| − |π^t(a)|, i.e., the utility corresponds to the number of users saved from being exposed to the fake news a. If an algorithm Algo selects set S^t in epoch t, then the total expected utility of the algorithm for t = 1, ..., T is given by

\[
\mathrm{Util}(T, \mathrm{Algo}) = \mathbb{E}\Big[\sum_{t=1}^{T} \sum_{s \in S^t} \mathbb{1}_{\{y^*(s) = f\}} \, \mathrm{val}^t(s)\Big] \qquad (1)
\]

where the expectation is over the randomness of the spread of news and the randomness in selecting S^t ∀t ∈ {1, ..., T}.
In this work, we will assume that the quantity val^t(a) in Equation 1 can be estimated by the algorithm. For instance, this can be done by fitting the parameters of an information cascade model on the spread π^t(a) seen so far for news a, and then simulating the future spread using the learnt parameters [8, 26, 37].
Given the utility values val^t(·), we can consider an oracle Oracle that has access to the true labels y^*(·) for all the news and maximizes the objective in Equation 1 by simply selecting the k fake news with highest utility. In the next section, we develop our algorithm Detective that performs Bayesian inference to compute y^*(·) using the flagging activity of users as well as via learning users' flagging accuracies {θ_{u,f̄}, θ_{u,f}}_{u∈U} from historical data.
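As a point of reference, here is a short Python sketch (our own illustration, under the assumption that the true labels and the estimates val^t(·) are given) of this Oracle baseline and of the per-epoch utility accounting in Equation 1.

def oracle_select(active_news, k):
    """Oracle baseline: pick the k fake news with the highest utility val^t(a).
    active_news: dict mapping news id -> (is_fake, val_t)."""
    fake = [(nid, val) for nid, (is_fake, val) in active_news.items() if is_fake]
    fake.sort(key=lambda item: item[1], reverse=True)
    return [nid for nid, _ in fake[:k]]

def epoch_utility(selected, active_news):
    """Contribution of one epoch to Util(T, Algo) in Equation 1:
    only news that are actually fake add val^t(s) to the objective."""
    return sum(active_news[nid][1] for nid in selected if active_news[nid][0])

# Example with three active news, each represented as (is_fake, val^t):
active = {"a": (True, 120.0), "b": (False, 300.0), "c": (True, 45.0)}
picked = oracle_select(active, k=1)            # -> ["a"]
print(picked, epoch_utility(picked, active))   # -> ['a'] 120.0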
4 OUR METHODOLOGY
In this section we present our methodology and our algorithm Detective. We start by describing how news labels can be inferred for the case in which the users' parameters are fixed. Next, we consider the case in which the users' parameters are unknown and employ a Bayesian approach for inferring news labels and learning the users' parameters. Given prior distributions on the users' parameters and a history of observed data (users' flagging activities and experts' labels obtained), one common approach is to compute a point estimate of the users' parameters (such as the MAP estimate) and use it. However, this can lead to suboptimal solutions because of limited exploration towards learning the users' parameters. In Detective, we overcome this issue by employing the idea of posterior sampling [24, 29].

Algorithm 2: Algorithm TopX
1   Input:
    • Active news A^t; information val^t(·), l^t(·), π^t(·)
    • budget k; news prior ω
    • users' parameters {θ_{u,f̄}, θ_{u,f}}_{u∈U}.
2   Compute p(a) for all a ∈ A^t as p(a) = P(Y^*(a) = f | {θ_{u,f̄}, θ_{u,f}}_{u∈U}, ω, l^t(a), π^t(a)).
3   Select S^t = arg max_{S ⊆ A^t, |S| ≤ k} Σ_{a∈S} p(a) val^t(a).
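The sketch below is our own rendering of the TopX selection step, under the simplifying assumption that, conditioned on the true label, users' flags are independent; the paper's exact inference may differ. The sample_thetas helper at the end only illustrates how posterior sampling over Beta posteriors could be combined with TopX in the spirit of Detective; the pseudo-count bookkeeping is assumed, not taken from the paper.

import math
import random

def _log(p, eps=1e-12):
    # Clip to avoid log(0) for deterministic users (e.g., alpha_u = beta_u = 1).
    return math.log(min(max(p, eps), 1.0 - eps))

def prob_fake(exposed, flaggers, theta_not_fake, theta_fake, omega):
    """p(a) = P(Y*(a) = f | flags), assuming conditionally independent users.
    exposed: users who saw the news (pi^t(a)); flaggers: those who flagged it as fake.
    theta_not_fake[u] = P(u does not flag | news not fake); theta_fake[u] = P(u flags | news fake).
    omega: prior probability that a news is fake."""
    log_f, log_nf = _log(omega), _log(1.0 - omega)
    for u in exposed:
        if u in flaggers:
            log_f += _log(theta_fake[u])
            log_nf += _log(1.0 - theta_not_fake[u])
        else:
            log_f += _log(1.0 - theta_fake[u])
            log_nf += _log(theta_not_fake[u])
    m = max(log_f, log_nf)
    return math.exp(log_f - m) / (math.exp(log_f - m) + math.exp(log_nf - m))

def top_x(active, k, theta_not_fake, theta_fake, omega):
    """Select up to k news maximizing the sum of p(a) * val^t(a).
    active: dict news id -> (exposed users, flaggers, val_t)."""
    scored = [(prob_fake(seen, flags, theta_not_fake, theta_fake, omega) * val, nid)
              for nid, (seen, flags, val) in active.items()]
    scored.sort(reverse=True)
    return [nid for _, nid in scored[:k]]

def sample_thetas(counts, rng=random):
    """Illustrative posterior sampling: counts[u] = (a1, b1, a2, b2) are Beta
    pseudo-counts for theta_not_fake and theta_fake, updated from expert-labeled news."""
    theta_not_fake, theta_fake = {}, {}
    for u, (a1, b1, a2, b2) in counts.items():
        theta_not_fake[u] = rng.betavariate(a1, b1)
        theta_fake[u] = rng.betavariate(a2, b2)
    return theta_not_fake, theta_fake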
[Figure 2: three panels plotting average utility against (a) time t, (b) users' engagement, and (c) the fraction of good users; the curves compare against the Fixed-CM, No-Learn, and Random baselines.]
Figure 2: Experimental results. (a) Learning about users: Detective achieves average utility competitive with that of Oracle (which knows the true news labels). The average utility of Detective converges to that of Opt as Detective progressively learns the users' parameters. (b) Users' engagement in flagging: even with low engagement, Detective can effectively leverage crowd signals to detect fake news. (c) Robustness against spammers: Detective is effective even if the majority of users are adversarial, highlighting the importance of learning about users' flagging accuracy for robustly leveraging crowd signals.