Predicting Meme Success with Linguistic Features in a Multilayer
Backpropagation Network
                                      Keith T. Shubeck (kshubeck@memphis.edu)
                                       Stephanie Huette (shuette@memphis.edu)
                                     Department of Psychology, 202 Psychology Building
                                                   Memphis, TN 38152 USA
                           Abstract
  The challenge of predicting meme success has gained attention from researchers,
  largely due to the increased availability of social media data. Many models focus on
  structural features of online social networks as predictors of meme success. The
  current work takes a different approach, predicting meme success from linguistic
  features. We propose that predictive power is gained by grounding memes in theories
  of working memory, emotion, memory, and psycholinguistics. The linguistic content of
  several memes was analyzed with linguistic analysis tools. A multi-layer supervised
  backpropagation network was then trained on these features. A set of new memes was
  used to test the generalization of the network. Results indicated the network was
  able to generalize the linguistic features in order to predict success at greater
  than chance levels (80% accuracy). Linguistic features appear to be enough to predict
  meme transmission success without any information about social network structure.

  Keywords: meme prediction; distributed cognition; evolutionary psychology; neural
  networks

                       Introduction
  The term “meme” was originally coined by Richard Dawkins in his book, The Selfish
Gene. Dawkins, an evolutionary biologist, describes “meme” as a unit for carrying
cultural ideas or behavior, similar to how genes carry genetic information from one
generation to the next. Just as genes propagate from organism to organism, memes
propagate from mind to mind by way of communication and social learning (Dawkins,
1989). Under this lens, memes are also subject to mutations, where each mutation either
strengthens or weakens the meme’s fitness. Susan Blackmore (1998) argues that memes
compete with one another for an individual’s limited cognitive resources for the chance
to replicate again. Thus, some memes will fall into obscurity while others will
flourish. With this in mind, successful memes should be those that are easily memorable
(Blackmore, 1998). Analyzing the properties and features of memes that may influence
their fitness has proven to be a challenging endeavor, especially prior to the
establishment of various online social networks.
  The internet, and more specifically social media, provides researchers interested in
the study of information diffusion, meme propagation, and cultural transmission a means
to observe these concepts in an ecologically valid setting and on a massive scale. Our
understanding of meme propagation runs parallel with our understanding of human culture;
the more we understand about memes and their mutations, their origins, and how quickly
these are accepted by other individuals, the more we will understand cultural trends
that may have been previously considered bewilderingly anomalous. The challenge then
becomes for researchers to develop robust and valid methods for detecting memes,
tracking their mutations, and predicting their success. The current model attempts to
develop a method for predicting meme success by analyzing its linguistic features. The
linguistic features used represent dimensions known to be important in communication
and encompass emotion and arousal the language may elicit, as well as more basic
features such as length, concreteness, and orthographic features such as misspellings.
  The challenge of detecting and tracking memes has been approached in a variety of
ways, with varying success. The broad and encompassing nature of the definition for
meme has resulted in the term being operationalized differently from study to study. In
addition to the changing operational definitions, the domains of meme studies also
vary. For example, some studies focus on visual or video content such as YouTube memes
(Shifman, 2012; Xie, Nastev, Kender, Hill & Smith, 2011), and others on textual memes,
like quoted text in the news cycle (Simmons, Adamic, & Adar, 2011; Leskovec, Backstrom,
& Kleinberg, 2009). Other research has focused on microblogging memes in social
networks such as Twitter or Yahoo! Meme (Ratkiewicz et al., 2010; Adamic, Lento, Adar &
Ng, 2014; Tsur & Rappoport, 2012; Ienco, Bonchi, & Castillo, 2010). For our purposes
here, we will focus on popular text-based memes, of which some have visual components
that were not included in the model, and others simply contain text.
  Not all memes are retransmitted: the process of deciding whether to retransmit a meme
is at the level of the individual user. With this in mind, memes that are shorter and
easier to remember should have inherent advantages over memes that are lengthier and
are therefore more difficult to remember. Copy-and-paste memes often contain
instructions within the text that encourage others to copy and paste the meme, or to
“repost” the meme in its entirety (Adamic et al., 2014). The copy-and-paste method for
meme spreading circumvents some of the cognitive memory constraints, allowing for
lengthier memes to succeed and become popular. However, there still must be a decision
to retransmit or repost the meme, which involves cognition and action.
  Another recent study set out with the goal of predicting meme success by observing
the meme’s early spreading patterns within Twitter (Weng, Menczer, & Ahn, 2014). The
authors chose to focus on the structure of the meme’s environment because previous
research has shown that the structure of underlying networks impacts the spreading
process of information (Daley & Kendall, 1964; Barrat, Barthelemy, & Vespignani, 2008).
Design features of the website itself (e.g., the user voting feature on Digg) can also
be used to improve meme prediction (Hogg & Lerman, 2012). Weng et al. (2014)
operationalize meme success by observing the meme’s overall popularity, relative to the
other memes in their dataset. They operationalize “meme” as any hashtag observed in
their dataset. Hashtags are strings of text following a “#” that users insert into
their tweets (i.e., short user submitted posts within Twitter) for labeling purposes.
Popular hashtags are tracked by Twitter and said to be “trending”. Here, the definition
of a successful meme is determined by the frequency of usage and overall popularity of
that meme.
  Weng et al. (2014) found that using topographic, or structural, features of the
network enabled their model to accurately predict a meme’s popularity up to two months
in advance. These topographical features included “community size”, where a community
is a set of nodes (i.e., individual users) who are followers of one another, and
“network surface” (i.e., neighbors of the audience of users). The model used by Weng et
al. (2014) is similar to other studies that include user influence in understanding
information diffusion (see Romero, Meeder, & Kleinberg, 2011).
  Unfortunately, studies that include user influence (i.e., number of followers a given
user has, number of those followers’ followers, etc.) as a key component of their meme
predicting model add little to our understanding of why certain memes are selected to
be spread, and why other memes are not chosen to be spread. We argue that an important
question remains unanswered: are there linguistic features and aspects of cognition
that can predict the ultimate success of a meme, outside of the characteristics of the
social network?
  Tsur and Rappoport (2012) attempt to answer that question by taking a closer look at
the content of Twitter hashtags in order to predict their popularity. They explain that
their work provides two major contributions to the current meme literature. First,
their study places emphasis on the content features of a meme in determining its
popularity, something that, prior to their 2012 study, had been largely ignored.
Secondly, by stepping away from the costly graph-based algorithms used in the studies
mentioned above, Tsur and Rappoport (2012) provide a simple and more global approach
for modeling meme acceptance and popularity. The content features that were examined
included: hashtag length (number of characters and words), hashtag orthography,
emotional content, and linguistic cognitive features taken from the Linguistic Inquiry
and Word Count Tool, or LIWC. LIWC (http://www.liwc.net/) is a linguistic tool that
counts the number of words in various categories that have been built upon relevant
communicative dimensions (Tausczik & Pennebaker, 2010). The categories of the program
are the essential feature, as they contain a collection of words that fit into 80
validated word categories, ranging from emotion word categories to deception word
categories. Using a regression model with the above mentioned features, they found that
the cognitive category of words from LIWC was positively correlated with the hashtag’s
popularity, when the hashtag’s content was also taken into account. For example, the
word “think”, a cognitive process, would predict increased popularity compared to a
non-cognitive word, like “ball”. They also found that lengthier hashtags were not as
popular as shorter hashtags. They attributed this finding to cognitive load theory and
physical constraints for tweets (i.e., 140 character limit per tweet). Cognitive load
theory posits that during an instance of complex learning, an individual may be
underloaded or overloaded with information, due to working memory limitations. While
these findings are promising, Tsur and Rappoport (2012) point out that future studies
using the content of memes to predict success should delve deeper into the
psycholinguistic aspects of the content and the cognitive constraints of the receiver
of the meme.
  These models often posit that the relevant connections of meme transmission are
between people, but this neglects what happens within an individual’s mind when a meme
is encountered. Whether the meme is transmitted or not depends entirely on the
individual, and thus relies more on cognitive factors than on the number of connections
to others in a social network. If the person decides to not transmit the meme further,
the number of connections to the user no longer matters. The current work is at the
cognitive level of analysis, where connections constitute an information space inside
of an individual, and “success” is whether or not the individual is likely to engage in
further transmitting the meme.
  The advantage of neural networks over rule-based systems is that they are able to
solve more complex problems and carve up the solution space in unanticipated ways. For
example, cognitive process words may somewhat predict meme success, but a combination
of cognitive process words, emotion words, concreteness, etc. might be interacting in
non-intuitive ways that contribute to transmission or non-transmission of the meme. To
demonstrate this, we predicted a binary logistic regression would not yield as much
predictive power as the neural network model. Neural networks are able to come up with
solutions that do not rely on linear or singular relationships or causality, allowing
for complex interactions which are well known to be commonplace in thinking,
communication, and behavior. Performance of a binary logistic regression will be
compared to neural network performance to test this prediction.

                           Model
Meme Corpus
  Memes were collected from the meme wiki-style website, knowyourmeme.com, and were
represented as 15 input nodes with binary values. Each element of the input vector
represented a linguistic or cognitive variable of the meme that was theoretically and
empirically motivated to have an impact on the meme’s popularity. The target outputs
consisted of two binary winner-takes-all nodes, where one represented “successful” and
the other represented “unsuccessful”. Meme success was determined by using the number
of Google search results of a meme phrase, verbatim. This was similar to the way that
hashtag searches were used in the aforementioned Twitter meme studies.
   In order to reduce the number of inaccurate result hits, a time range filter was
placed on each meme search, based on the month the meme search queries first spiked.
This was determined by using Google Trends, which shows how often a particular search
term is entered into Google search over time. If a meme’s search queries first began to
spike in October of 2009, then the search was limited to October 2009 to the present
date. Many of the memes in the current dataset are words or word phrases that were
initially esoteric, but were in use prior to the spike in popularity of the new or
updated meaning. After a certain event (e.g., popular YouTube video, viral social
networking post, etc.) the meaning of the word phrase shifted. For the current model,
we were only interested in the most current generation of the meme’s meaning. After
determining the total number of search results provided for each individual meme, a
median split was applied to the data to separate successful memes from unsuccessful
memes. For this particular data set, memes that had 37,400 or more search results were
considered successful, and any memes below that threshold were considered unsuccessful.
Of course all memes were retransmitted to some degree, so this label might be something
more akin to “more popular” and “less popular” when discussing memes as a whole.

Training set. The dataset used to train the network consisted of 268 established memes
collected from knowyourmeme.com, a meme encyclopedia, which uses the wiki web
application to collect and categorize various internet memes. The memes included in our
corpus contain hashtag memes (e.g., #YOLO), copy-and-paste memes (e.g., Repost this if
you're a big black woman who don't need no man), as well as lesser known memes commonly
used in smaller online communities (e.g., burst into treats). The average meme word
length was roughly four words per meme, with the longest meme having 31 words.
Copy-and-paste memes were divided into smaller chunks of text, each chunk having at
most one complete sentence. In general, the memes used for the current study are
phenotypic memes, meaning their raw text contains the best estimate of the “original”
meme. Variants of these phenotypic memes were not included. If it could not be clearly
determined which meme came first, then both memes were included separately in the
dataset. The linguistic and cognitive properties of the meme text were coded as 15
binary features, grouped into four categories: psycholinguistic features, physical
features, orthographical features, and meme type.

Psycholinguistic Features. Eight psycholinguistic features were chosen as meme
features. These features were selected based on current cognitive psychology and
psycholinguistic theories centered on sentence recall, working memory, and how emotion
and arousal affect memory.
   Mean word concreteness was determined through the use of Coh-Metrix
(http://cohmetrix.com/), a validated linguistic analysis tool that is able to
automatically analyze text for features such as text cohesion, parts of speech, word
frequency, lexical diversity, and syntactic complexity (McNamara, Kulikowich, &
Graesser, 2011). Concreteness was chosen as a psycholinguistic feature for the current
model because previous research has shown that concrete words are easier to recall than
abstract words during a short-term serial recall task (Walker & Hulme, 1999). Memes
that are easier to recall and more concrete should have a distinct advantage over memes
that are more difficult to recall. If a given meme had more concrete terms than
abstract terms then it was coded as concrete (1); if it contained no concrete terms, or
more abstract terms, then it was coded as abstract (0).
   The overall emotional arousal of a meme was determined through the use of LIWC
(Linguistic Inquiry and Word Count; Pennebaker, Francis, & Booth, 2001). LIWC’s affect
dictionaries were based on the emotion rating scales developed by Watson, Clark, and
Tellegen (1988). For this feature, if a meme included an emotional word, either
positive or negative, it was considered an emotional meme (1), and if the meme
contained no emotion words then it was considered a non-emotion meme (0). The emotional
arousal feature was included in the current model because previous research has shown
that emotional arousal, in general, has an impact on long-term declarative memory
(Cahill & McGaugh, 1998).
   Four other finer-grained emotional features were also recorded for each meme. These
features were used to determine 1) whether or not positive emotion was present, 2)
whether or not negative emotion was present, 3) whether there was more positive emotion
than negative emotion, and 4) whether there was more negative emotion than positive
emotion. Negative emotion has been found to enhance memory accuracy for specific
details during a recall task (Kensinger, 2007). However, the broaden-and-build
hypothesis posits that positive moods broaden an individual’s scope of attention and
thought-action repertoires, whereas negative moods tend to narrow an individual’s scope
of attention and associations between thoughts and actions (Fredrickson & Branigan,
2005).
   In their study, Tsur & Rappoport (2012) chose to include LIWC’s “cognitive”
categories. They explained that this category should contain words that prompt or
encourage specific behaviors (e.g., cause, know, ought). However, overall Tsur &
Rappoport found that the more general cognitive category only marginally improved the
MSE over the baseline. For the current study we chose to include the more specific
“CogMech” LIWC category (i.e., cognitive mechanism) with the hope of improving the
overall model.
  The last psycholinguistic feature involves the
presence (1) or absence (0) of curse words, or taboo words,
in the meme. LIWC was used to determine the presence of
curse words in the set of memes. LIWC’s swear word category
includes a set of socially proscribed derogatory or profane
words. A body of previous research has shown that
emotionally arousing words, particularly taboo words, are
remembered better than neutral or nonarousing words (Jay,
Caldwell-Harris, & King, 2008; Kensinger, 2007; Kensinger
& Corkin, 2003; LaBar & Phelps, 1998). Memes with curse
words should have a distinct advantage over memes without
curse words, in terms of the meme’s ability to be recalled.
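The eight binary psycholinguistic codings described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the per-meme counts in the hypothetical `MemeCounts` record stand in for values one would read off Coh-Metrix and LIWC output, the handling of ties in concreteness is assumed, and the presence/absence rule for CogMech is an assumption because the text does not say how that feature was binarized.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemeCounts:
    """Hypothetical per-meme word counts (stand-ins for LIWC / Coh-Metrix output)."""
    concrete: int   # number of concrete terms
    abstract: int   # number of abstract terms
    posemo: int     # positive emotion words
    negemo: int     # negative emotion words
    cogmech: int    # cognitive mechanism words
    swear: int      # swear / taboo words


def psycholinguistic_features(c: MemeCounts) -> List[int]:
    """Return the eight binary psycholinguistic features for one meme."""
    return [
        int(c.concrete > c.abstract),   # concrete (1) vs. abstract (0); ties -> 0 (assumed)
        int(c.posemo + c.negemo > 0),   # any emotion word present
        int(c.posemo > 0),              # positive emotion present
        int(c.negemo > 0),              # negative emotion present
        int(c.posemo > c.negemo),       # more positive than negative
        int(c.negemo > c.posemo),       # more negative than positive
        int(c.cogmech > 0),             # CogMech present (assumed binarization)
        int(c.swear > 0),               # swear word present
    ]


if __name__ == "__main__":
    example = MemeCounts(concrete=2, abstract=0, posemo=0, negemo=1, cogmech=0, swear=1)
    print(psycholinguistic_features(example))
```

Keeping the counts separate from the binarization makes it easy to try other thresholds (for example, a CogMech proportion rather than simple presence) without touching the upstream text analysis.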
Physical & Orthographical Features. Two physical features of the meme text were also
recorded. Intuitively, memory span is inversely related to word length, and words that
take longer to read or speak are more difficult to recall in simple recall tasks
(Baddeley, Thomson, & Buchanan, 1975). Memes that contained fewer than four words were
considered short (1) and memes that contained four or more words were considered long
(0). Additionally, memes in which every word had fewer than three syllables were
considered short (1), and memes that contained a word with three or more syllables were
considered long (0). Shorter and less complex memes should be easier to recall,
improving their fitness and overall success.
  Two orthographical features were included based on the intuition that slang terms,
purposeful word misspellings, or purposeful incorrect grammar usage should set some
memes apart from others. Words with incorrect spelling, or novel words and phrases,
should stand out more than correct word spellings and established words and phrasings.
If memes are competing for attention, then memes with novel words or phrases should
tend to be more popular or successful than memes using traditional spelling and
phrasing.

Meme Type. Finally, three meme type features were coded. The three meme types consist
of template memes, copy-and-paste memes, and game memes. These three features were
mutually exclusive and were determined during the search process. Examples of game
memes are “The object to your left will be your only weapon during a zombie apocalypse”
or “You are now manually breathing”. An example of a template meme is provided in
Figure 1.

Figure 1: An example of a template meme. The text varies from iteration to iteration,
but the image remains static. Text here emphasizes awkward social behaviors.

                    Network Structure
  The current model used a 4-layer backpropagation network that was designed to take
linguistic features as inputs and classify memes as either successful or unsuccessful.
The network consists of four layers: an input layer with 15 nodes, two hidden layers
with 20 nodes each, and an output layer with two winner-takes-all nodes that represent
the probability of success of the meme. The output nodes are mutually exclusive. One
node “fires” or activates whenever the network judges a meme to be successful and the
other node “fires” or activates whenever it judges a meme to be unsuccessful. For any
given meme, if its output activation is higher on the successful output node than the
unsuccessful output node, then that meme is considered successful, and vice versa. A
total of 268 memes were used to train the network. Network weights were initially
random and were modified using the delta rule, a gradient descent learning algorithm.
A target activation value of 1 represented “successful” and a value of 0 represented
“unsuccessful”. This value was determined by using a median split on the popularity of
the meme, where highly transmitted memes were considered successful and frequently
retransmitted at the level of the user, and more infrequent memes, while likely
observed by many, were less likely to be retransmitted. The learning rate was set to
.001, and the momentum term was set to 0.2. The network was trained to convergence: it
was trained on all training items 3000 times, at which point the MSE reached an average
of .228. Matlab coding of the network is available from the first author upon request.
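The labeling and training setup described in this section can be sketched in NumPy. This is a hedged illustration, not the authors' Matlab code: the sigmoid activation, the initial weight range, the full-batch updates, and the synthetic stand-in data are all assumptions. Only the layer sizes (15-20-20-2), the learning rate (.001), the momentum term (0.2), the median-split targets, and the winner-takes-all readout come from the text.

```python
import numpy as np


def median_split(search_counts):
    """Label memes at or above the median search-result count as successful (1)."""
    counts = np.asarray(search_counts, dtype=float)
    return (counts >= np.median(counts)).astype(int)


def one_hot(labels):
    """Two target nodes: column 0 = 'successful', column 1 = 'unsuccessful'."""
    labels = np.asarray(labels)
    return np.stack([labels, 1 - labels], axis=1).astype(float)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_network(sizes=(15, 20, 20, 2), seed=0):
    """Random initial weights; the uniform(-0.5, 0.5) range is an assumption."""
    rng = np.random.default_rng(seed)
    return [(rng.uniform(-0.5, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(params, X):
    """Return the activations of every layer, input included."""
    acts = [X]
    for W, b in params:
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts


def train(params, X, T, epochs=3000, lr=0.001, momentum=0.2):
    """Generalized delta rule with momentum, full-batch (batching is an assumption)."""
    velocity = [(np.zeros_like(W), np.zeros_like(b)) for W, b in params]
    for _ in range(epochs):
        acts = forward(params, X)
        # Output-layer error signal for squared error with sigmoid units.
        delta = (acts[-1] - T) * acts[-1] * (1 - acts[-1])
        for layer in reversed(range(len(params))):
            W, b = params[layer]
            grad_W = acts[layer].T @ delta
            grad_b = delta.sum(axis=0)
            if layer > 0:
                # Backpropagate through the weights before they are updated.
                delta = (delta @ W.T) * acts[layer] * (1 - acts[layer])
            vW, vb = velocity[layer]
            vW = momentum * vW - lr * grad_W
            vb = momentum * vb - lr * grad_b
            velocity[layer] = (vW, vb)
            params[layer] = (W + vW, b + vb)
    return params


def predict(params, X):
    """Winner-takes-all readout: 1 if the 'successful' node is more active."""
    out = forward(params, X)[-1]
    return (out[:, 0] > out[:, 1]).astype(int)


def mse(params, X, T):
    return float(np.mean((forward(params, X)[-1] - T) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.integers(0, 2, size=(268, 15)).astype(float)  # synthetic binary features
    counts = rng.integers(100, 100_000, size=268)         # synthetic search counts
    T = one_hot(median_split(counts))
    net = train(init_network(), X, T, epochs=200)
    print("training MSE:", mse(net, X, T))
```

Because the data here are random, the sketch only demonstrates the mechanics; it says nothing about the accuracy reported in the paper.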
                          Results
  In order to test the accuracy of the network, a random subset of 25 coded memes was
left out of the training set to test generalization to new items using a fully trained
set of connection weights. This is a test of the network’s predictive power and
generalization to new memes. The resulting output activation values were compared to
the expected target values. A meme was classified as successful if its output
activation on the “successful” output node was greater than its output activation on
the “unsuccessful” output node, and as unsuccessful otherwise; a classification was
considered accurate when it matched the meme’s target value. The network achieved 80%
prediction accuracy, or 20% higher than chance. Specifically, the network was able to
accurately predict a successful meme to be successful with 73% accuracy, and was able
to accurately predict an unsuccessful meme to be unsuccessful with 90% accuracy. While
the model appears able to learn with an overall MSE of .228, it struggles to generalize
accurately.

Regression analysis. In addition, a binary logistic regression was performed. The
target values (successful or unsuccessful) were considered the dependent variable and
each input node was considered an independent variable. Because all data is binary,
binary logistic regression is appropriate for analyzing the factors that contribute to
predicted success of a meme. The overall logistic regression model was statistically
significant, χ²(14) = 48.893, p < .0005. The model explained 22.3% (Nagelkerke R²) of
the variance in meme success and correctly classified 54.1% of the successful memes as
successful and 80.6% of the unsuccessful memes as unsuccessful. Overall the binary
logistic regression model had a prediction accuracy of 67.4%. Three predictor variables
were statistically significant. First, meme length was significant (p < .005): shorter
memes were 2.802 times more likely to be successful than unsuccessful. Memes that
contained a swear word were .177 times as likely to be successful as unsuccessful
(p < .05), a small but significant contribution. Finally, template memes were 2.223
times more likely to be successful than unsuccessful (p < .05).

                        Discussion
   The results of the current study demonstrate the utility of using linguistic
information as a means of predicting successful transmission of a meme. These
preliminary results warrant more in-depth analyses, particularly a principal component
analysis to determine whether the network is easily carving out a solution space and
clustering success and non-success, or whether it is unclear based on the input. A
highly clustered solution space that exhibits linear separability after learning in its
first few principal components would argue for the issue being with the inaccuracy of
the input and target information structure, whereas a more overlapping or complex
separability would mean it is simply a very difficult problem to solve, with the
linguistic information contributing a rich source of information that could be used in
models that incorporate multiple domains of information (user-level, visual feature,
social structure, etc.). Some of the features in the network may contribute more or
less to the prediction of success in the network, and as with other neural networks it
is difficult to see what is driving these results. However, comparing the network’s
results with a binary logistic regression helped to provide some insight. Meme length,
whether or not a meme is a template meme, and the presence or absence of swear words
within the meme contributed significantly to predicting success in the logistic model.
However, the logistic model did not have prediction accuracy as high as the neural
network model, pointing to the potential contribution of other variables that on their
own are not predictive in a regression, but in an interactive context like a neural
network, have some predictive power. Further analysis of the network’s principal
components is clearly needed in future work.
   The neural network model presented here has several major limitations. The first
limitation is the operationalized definition of success. Google search results offer a
quick, rough-grained estimate for overall meme usage, but searching for specific
phrases can still sometimes include inaccurate search results. Without extensive and
computationally expensive web-crawlers, determining meme context from Google search
results can be extremely difficult. Memes that can be used in multiple domains can be
considered “flexible memes”, a quality that is likely related to overall meme fitness.
Another limitation of the current study is that the input set and test set are
relatively small. Many studies attempting to predict meme success have access to
millions of memes, albeit with a broader operational definition. If the success of
textual memes is largely dependent on the average person’s ability to remember them,
then many more cognitive variables can and should be included.

                        Conclusion
   The ability to detect and track memes and to predict their success is essential in
order to improve our understanding of cultural evolution. Observing textual memes in
particular offers unique insights into the evolution of language. Social media provides
a petri dish environment for rapid meme generation and mutation. The current study
categorized meme content based on 12 features grounded in cognitive theories of memory,
emotion, and working memory limitations. This experiment helped support the idea that
meme content should be considered when attempting to predict meme success. Future
studies on meme prediction should benefit from a more robust operational definition of
success. This can likely be achieved by limiting the scope from a global internet
search to a specific social network. If a feed-forward backpropagation neural network
can achieve relative success in predicting meme popularity, then a more robust network
that takes into account working memory limitations should provide more accurate
results.
   This model demonstrates that it is not only possible to predict overall success of a
meme at greater than chance levels, but also argues for there being important
parameters at the level of what other models typically neglect: whether
or not the node transmits the information further. Other
models of meme transmission typically only take into
account the change of the meme over time (evolution), the
rates of transmission (viral), or the number of connections
(small world networks). By incorporating cognitive
processes into models that also include information about
the network at large, greater levels of prediction could be
achieved in future instantiations of meme transmission
models.
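
The Discussion proposes a principal component analysis to check whether the trained network separates successful from unsuccessful memes along its first few components. The following is a minimal sketch of that check, not the authors' analysis: the hidden-layer activations are synthetic stand-ins, the function names `pca_project` and `perceptron_separates` are hypothetical, and a simple perceptron is assumed here as the test for linear separability.

```python
import numpy as np


def pca_project(activations, n_components=2):
    """Project rows of `activations` onto their first principal components.

    Returns the projected points and the fraction of variance that each
    retained component explains.
    """
    centered = activations - activations.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]


def perceptron_separates(points, labels, max_updates=1000):
    """Return True if a perceptron finds a hyperplane splitting the classes.

    Assumption: failing to converge within `max_updates` weight updates is
    treated as "not linearly separable", which is only a heuristic.
    """
    labels = np.asarray(labels)
    signs = np.where(labels == 1, 1.0, -1.0)
    # Append a constant column so the bias is learned as an ordinary weight.
    augmented = np.hstack([points, np.ones((len(points), 1))])
    weights = np.zeros(augmented.shape[1])
    for _ in range(max_updates):
        wrong = signs * (augmented @ weights) <= 0
        if not wrong.any():
            return True
        first_wrong = np.argmax(wrong)  # index of the first misclassified point
        weights += signs[first_wrong] * augmented[first_wrong]
    return False


if __name__ == "__main__":
    # Synthetic stand-in for hidden-layer activations (hypothetical data):
    # 20 "successful" and 20 "unsuccessful" memes, 5 hidden units each.
    rng = np.random.default_rng(0)
    hidden = np.vstack([
        rng.normal(2.0, 0.3, size=(20, 5)),
        rng.normal(-2.0, 0.3, size=(20, 5)),
    ])
    success = np.array([1] * 20 + [0] * 20)

    projected, explained = pca_project(hidden, n_components=2)
    print("variance explained:", explained)
    print("linearly separable:", perceptron_separates(projected, success))
```

If the two classes separate cleanly in the first few components, the Discussion's reasoning points toward the input and target structure as the limiting factor; heavy overlap would instead suggest that the classification problem itself is hard.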