research-article
Open access

Hindcasting Violent Events in Colombia Using Internet Data

Published: 11 July 2021

Abstract

Colombia experienced a decades-long civil war between the government and many left-wing guerrilla groups. It was marked by violence, kidnappings, and large-scale human displacement. Monitoring and forecasting civil wars are important for mitigating their potential impact but require access to ground truth data. We examine the use of Internet data streams, namely Google search queries and tweets related to politics, together with traditional news sources, to retrospectively forecast (i.e., hindcast) state-based armed violence in Colombia. We compare the results of statistical models using three combinations of these features to evaluate the predictive capabilities of each data stream. Our results show that models combining Internet and traditional news data perform most consistently, although Internet-only models are surprisingly promising. Overall, we are able to produce high-quality models hindcasting the presence or absence of state-based armed violence in Colombia up to 6 months in advance. These results support the use of exogenous data streams to forecast evolving situations around the globe.
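
The hindcasting setup described in the abstract can be illustrated with a minimal sketch. It pairs features observed in one month with the presence or absence of violence several months later, and builds those pairs for three feature combinations. Everything here is an assumption for illustration only: the function names, the toy numbers, the monthly resolution, and the choice of Internet-only, news-only, and combined as the three combinations are not taken from the article.

```python
from typing import Dict, List, Sequence, Tuple


def make_hindcast_pairs(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    lead_months: int,
) -> Tuple[List[List[float]], List[int]]:
    """Pair features from month t with the label for month t + lead_months."""
    if len(features) != len(labels):
        raise ValueError("features and labels must cover the same months")
    if lead_months < 1:
        raise ValueError("lead_months must be at least 1")
    n_pairs = max(len(labels) - lead_months, 0)
    X = [list(features[t]) for t in range(n_pairs)]
    y = [labels[t + lead_months] for t in range(n_pairs)]
    return X, y


def select_columns(
    features: Sequence[Sequence[float]], columns: Sequence[int]
) -> List[List[float]]:
    """Keep only the listed feature columns in every monthly row."""
    return [[row[c] for c in columns] for row in features]


# Invented toy data, one row per month.
# Columns (hypothetical): 0 = search volume, 1 = tweet count, 2 = news event count.
monthly_features = [
    [0.2, 15.0, 3.0],
    [0.3, 18.0, 4.0],
    [0.6, 40.0, 9.0],
    [0.7, 42.0, 8.0],
    [0.4, 20.0, 5.0],
    [0.8, 51.0, 11.0],
    [0.9, 55.0, 12.0],
    [0.5, 25.0, 6.0],
]
# 1 = state-based armed violence present that month, 0 = absent (invented).
monthly_violence = [0, 0, 1, 1, 0, 1, 1, 0]

# Assumed feature combinations; the article's exact groupings may differ.
FEATURE_SETS: Dict[str, List[int]] = {
    "internet_only": [0, 1],
    "news_only": [2],
    "combined": [0, 1, 2],
}

pairs_by_set = {}
for name, cols in FEATURE_SETS.items():
    subset = select_columns(monthly_features, cols)
    pairs_by_set[name] = make_hindcast_pairs(subset, monthly_violence, lead_months=6)
```

Each entry in `pairs_by_set` could then be passed to any binary classifier, so that differences in performance reflect the data stream rather than the model. With a 6-month lead, the last six months of features have no matching label and are dropped.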

Published In

Digital Government: Research and Practice  Volume 2, Issue 3
Regular Papers
July 2021
102 pages
EISSN:2639-0175
DOI:10.1145/3474845
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 July 2021
Online AM: 04 May 2021
Accepted: 01 April 2021
Revised: 01 February 2021
Received: 01 April 2020
Published in DGOV Volume 2, Issue 3

Author Tags

  1. Colombia
  2. Internet data
  3. Social media
  4. political unrest
  5. predictive analyses

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • US Department of Energy through the Los Alamos National Laboratory
  • UC National Laboratory Fees Research Program
