Detection and Classification of Spam in Social Media Comments Using Artificial Intelligence – A Case Study

  • Conference paper
Progress in Artificial Intelligence (EPIA 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14968)

Abstract

Spam content is proliferating with the widespread use of social media. Users receive numerous text messages through social media platforms, which makes spam hard to identify among them. Spam messages often include harmful links, deceptive apps, fraudulent accounts, fake news, misleading reviews, and rumors. Detecting and controlling spam text is therefore a crucial task for enhancing social media security. This paper presents a practical approach to detecting and classifying spam in social media comments using the Python programming language. Using large datasets from Facebook, Twitter, YouTube, SMS, Reddit, and e-mail, together with the Decision Tree, Logistic Regression, and Random Forest algorithms, spam classification accuracy between 88% and 96% was achieved. The datasets and implementations are available on the open-source platform GitHub to be used and improved in future work.
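To make the classification step concrete, here is a minimal sketch of one of the three named algorithm families, logistic regression, applied to bag-of-words features. It is not the authors' implementation, which lives in their repository: the toy comments, the whitespace tokenizer, the hyperparameters, and every function name below are assumptions made for illustration, and the sketch uses only the Python standard library instead of a machine learning library.

```python
import math
from collections import Counter

# Toy comments invented for illustration; NOT taken from the paper's datasets.
# Label 1 = spam, label 0 = legitimate comment.
TRAIN = [
    ("win a free prize now click this link", 1),
    ("free followers click here now", 1),
    ("claim your free gift card today", 1),
    ("click the link to win cash", 1),
    ("great video thanks for sharing", 0),
    ("i really enjoyed this tutorial", 0),
    ("thanks for the clear explanation", 0),
    ("see you at the meeting today", 0),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocab(texts):
    """Map each distinct token to a feature index."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Sparse bag-of-words vector: {feature index: token count}."""
    counts = Counter(tokenize(text))
    return {vocab[t]: float(c) for t, c in counts.items() if t in vocab}


def sigmoid(z):
    """Logistic function, with z clamped to avoid overflow in exp()."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, vocab, epochs=200, lr=0.5):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    rows = [(vectorize(text, vocab), label) for text, label in samples]
    for _ in range(epochs):
        for features, label in rows:
            z = bias + sum(weights[i] * v for i, v in features.items())
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            bias -= lr * error
            for i, v in features.items():
                weights[i] -= lr * error * v
    return weights, bias


def predict_proba(text, vocab, weights, bias):
    """Estimated probability that a comment is spam."""
    features = vectorize(text, vocab)
    return sigmoid(bias + sum(weights[i] * v for i, v in features.items()))


def predict(text, vocab, weights, bias):
    """Hard label: 1 (spam) if the estimated probability is at least 0.5."""
    return int(predict_proba(text, vocab, weights, bias) >= 0.5)


if __name__ == "__main__":
    vocab = build_vocab([text for text, _ in TRAIN])
    weights, bias = train(TRAIN, vocab)
    for comment in ["free prize click now", "thanks for sharing this great tutorial"]:
        label = "spam" if predict(comment, vocab, weights, bias) else "not spam"
        print(f"{comment!r} -> {label}")
```

Eight hand-written comments say nothing about real-world accuracy; the 88% to 96% figures in the abstract come from the authors' own datasets and models. A realistic experiment would hold out a test split and would likely swap this hand-rolled model for library implementations of all three classifiers.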

Notes

  1. Scikit-learn - https://scikit-learn.org/stable/.

  2. Keras - https://keras.io/.

  3. PyTorch - https://pytorch.org/.

  4. https://www.kaggle.com/datasets/uciml/sms-spam-collection-dataset.

  5. https://www.kaggle.com/datasets/emineyetm/fake-news-detection-datasets.

  6. https://www.kaggle.com/datasets/venky73/spam-mails-dataset.

  7. https://www.kaggle.com/code/annasholokhova/spam-tweets-detection.

  8. https://www.kaggle.com/datasets/lakshmi25npathi/images.

  9. https://github.com/Vasquiinho/MEI_BAMD-SocialMediaSpamClassification.

Author information

Corresponding author

Correspondence to Jorge Ribeiro.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Alves, V., Ribeiro, J. (2025). Detection and Classification of Spam in Social Media Comments Using Artificial Intelligence – A Case Study. In: Santos, M.F., Machado, J., Novais, P., Cortez, P., Moreira, P.M. (eds) Progress in Artificial Intelligence. EPIA 2024. Lecture Notes in Computer Science, vol 14968. Springer, Cham. https://doi.org/10.1007/978-3-031-73500-4_26

  • DOI: https://doi.org/10.1007/978-3-031-73500-4_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73499-1

  • Online ISBN: 978-3-031-73500-4

  • eBook Packages: Computer Science, Computer Science (R0)
