alex0dd/Document-Subjectivity

Document-Subjectivity

Document subjectivity classification on SubjectivITA, a small dataset of Italian newspaper articles. The dataset consists of 74 training documents containing 1614 sentences in total, and 29 test documents containing 227 sentences.

To cope with the small amount of data, sBERT, a library of pretrained multilingual sentence-embedding models, can be employed.

With these pretrained models, one can easily produce meaningful representations of individual sentences, which in turn can be used as features for simpler classical machine learning models that predict the document class.
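The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: it assumes the `sentence-transformers` and `scikit-learn` packages, mean pooling as the way to combine sentence embeddings into a document vector, logistic regression as the classical model, and an illustrative multilingual model name. The helper names (`mean_pool`, `document_features`, `load_sbert_encoder`, `train_document_classifier`) are hypothetical.

```python
from statistics import fmean
from typing import Callable, Sequence

Vector = Sequence[float]
# An encoder maps a list of sentences to one embedding vector per sentence.
Encoder = Callable[[list[str]], Sequence[Vector]]


def mean_pool(sentence_vectors: Sequence[Vector]) -> list[float]:
    """Average sentence embeddings column-wise into one document vector."""
    return [fmean(column) for column in zip(*sentence_vectors)]


def document_features(documents: Sequence[Sequence[str]], encode: Encoder) -> list[list[float]]:
    """Turn each document (a list of sentences) into one fixed-size feature vector."""
    return [mean_pool(encode(list(sentences))) for sentences in documents]


def load_sbert_encoder(model_name: str = "paraphrase-multilingual-MiniLM-L12-v2") -> Encoder:
    """Wrap a pretrained sentence-embedding model as an Encoder.

    Assumption: the model name is only an example of a multilingual model;
    the experiments in the report may use a different one.
    """
    # Imported lazily so the pooling helpers work without the heavy dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return lambda sentences: model.encode(sentences)


def train_document_classifier(train_docs, train_labels, encode: Encoder):
    """Fit a simple classical model on pooled document features.

    Assumption: logistic regression stands in for "simpler classical
    machine learning models"; any scikit-learn classifier could be swapped in.
    """
    from sklearn.linear_model import LogisticRegression

    classifier = LogisticRegression(max_iter=1000)
    classifier.fit(document_features(train_docs, encode), train_labels)
    return classifier
```

With both packages installed, one would call `encode = load_sbert_encoder()`, then `train_document_classifier(train_docs, train_labels, encode)`, where `train_docs` is a list of documents, each given as a list of sentence strings. Mean pooling is only one simple way to obtain a fixed-size document vector from a variable number of sentences.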

Additional info

This repository includes notebooks for three experiments described in the report file.
