An NLP library for extracting discriminant terms across space and time
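As a rough illustration of what "discriminant terms across space and time" can mean, the sketch below groups a toy corpus by (country, month) and ranks each group's terms with a plain TF-IDF computed over the groups. This is a simplified sketch under stated assumptions, not the library's implementation: the `docs` corpus, the `top_terms_by_group` helper, and the scoring formula are all hypothetical and are not part of `htfidf`.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus of (country, month, text) records; not from the library.
docs = [
    ("France", "2020-01", "strike pension strike paris"),
    ("France", "2020-02", "virus paris lockdown"),
    ("Italy", "2020-01", "football rome virus"),
    ("Italy", "2020-02", "virus lockdown lombardy"),
]


def top_terms_by_group(docs, top_n=2):
    """Rank terms per (country, month) group using TF-IDF over the groups."""
    # Merge all texts sharing a (country, month) pair into one bag of words.
    groups = defaultdict(Counter)
    for country, month, text in docs:
        groups[(country, month)].update(text.lower().split())

    # Group frequency: the number of groups in which each term appears.
    group_freq = Counter(term for counts in groups.values() for term in counts)
    n_groups = len(groups)

    ranked = {}
    for key, counts in groups.items():
        total = sum(counts.values())
        scores = {
            term: (count / total) * math.log(n_groups / group_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked[key] = sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
    return ranked


ranked = top_terms_by_group(docs)
for key, terms in sorted(ranked.items()):
    print(key, terms)
```

Terms that occur in most groups (here `virus`) score low everywhere, while terms concentrated in one place and period (such as `strike` for France in January) rise to the top of that group.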
# Load H-TFIDF package
from htfidf import htfidfBestTerms
# Load scikit-learn package
from sklearn.feature_extraction.text import CountVectorizer
# Count word occurrences in the dataset
countVectorizer = CountVectorizer()
wordCount = countVectorizer.fit_transform(dataset)
# Extract the top 100 H-TFIDF terms at the country and month level
htfidf_results = htfidfBestTerms(
    wordCount,
    spatial_information=dataset.geo,
    spatial_level="Country",
    temporal_level="month",
    top_n=100,
)

pip install htfidf

| Conference | Publication type | Description |
|---|---|---|
| AGILE'2021 | Full paper | H-TFIDF application to COVID-19 tweets. For more information, visit the study's repository. The workflow is fully reproducible; see the related report. |
| IJID'2022 | Poster | Using H-TFIDF feature for a spatial opinion mining on COVID-19 tweets |
| INSTICC'2022 | Paper | Feature Selection for Sentiment Classification of COVID-19 Tweets: H-TFIDF Featuring BERT |