For my first kernel on Natural Language Processing (NLP), I chose the SMS Spam Collection Dataset. It contains the text of 5,572 SMS messages, each labeled as "spam" or "ham".
-
Used Naive Bayes for classification
The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification).
The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.
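As a rough illustration of the idea, here is a minimal multinomial Naive Bayes sketch in plain Python with add-one (Laplace) smoothing. It is not the code from this kernel: the four toy messages are made up rather than taken from the dataset, and the helper names (`tokenize`, `train`, `predict`) and the `alpha` parameter are hypothetical choices for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(messages, labels, alpha=1.0):
    """Fit per-class log priors and smoothed per-word log likelihoods."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter(labels)      # label -> number of messages
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total = len(labels)
    log_prior = {c: math.log(n / total) for c, n in label_counts.items()}
    log_likelihood = {}
    for c in label_counts:
        # Add-alpha smoothing keeps unseen (class, word) pairs from zeroing the product.
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(model, text):
    """Return the label with the highest log posterior score."""
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += log_likelihood[c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data, invented for this example (not from the SMS Spam Collection).
messages = [
    "win a free prize now",
    "free cash call now",
    "are we meeting for lunch today",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]

model = train(messages, labels)
print(predict(model, "free prize call now"))
print(predict(model, "lunch at home today"))
```

On this toy data the first message is classified as spam and the second as ham. Because the counts only enter through sums and ratios, the same scoring would also accept fractional weights such as tf-idf values in place of raw word counts.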
-
Achieved 95% accuracy
🚀 About Me
Hi, I'm Anna! 👋
I am an AI enthusiast and a data science and ML practitioner.