[MASK]ED - Language Modeling for Explainable Classification and Disentangling of Socially Unacceptable Discourse.
This repository contains the code for Token Importance Assessment, Masked Language Modeling (MLM) Pretraining and Finetuning, Label Noise Removal, and Annotated Corpus relabeling. An inference script is provided to test the tuned models.
- Installation
- Usage
- Token Importance Assessment
- Masked Language Modelling
- Downstream Performance Evaluation
- Acknowledgements
- Contributing
- Contact
Clone the repo using the following command:
git clone https://github.com/rayaneghilene/MLM_Pretraining.git
cd MLM_Pretraining
We recommend creating a virtual environment (optional):
python3 -m venv myenv
source myenv/bin/activate
To install the requirements, run the following command:
pip install -r requirements.txt
All experiments should be run using the main.py file. The arguments are as follows:
- --experiment_name (required): either 'train' for MLM training or 'finetune_nli'.
- --model_name: either 'roberta', 'bert', or 'electra'.
- --GPU: the GPU device number to use. If not set, training defaults to the CPU. Leave this option unset if you don't have a GPU or prefer not to use one.
- --pretrained_model_path: the path to the pretrained model.
- --dataset_path: the path to your dataset.
- --masking_strategy: either 'PMI' or 'BERTopic' ('PMI' is the default).
- --loss_strategy: controls optimisation of the loss (with PMI or LDA..); either 'weighted', or 'none' for no optimisation ('weighted' is the default).
- --nli_dataset_name: either 'mnli', 'qnli', or 'snli' ('mnli' is the default).
- --save_path: the path where the pretrained model and tokenizer are saved (the default is './Trained_models/').
Here's an example command to train a model for masked language modelling:
nohup python main.py \
    --experiment_name 'train' \
    --GPU '1' \
    --model_name 'roberta' \
    --dataset_path Path_to_the_dataset \
    --save_path Path_to_save_the_trained_model \
    --masking_strategy 'BERTopic' \
    > Pretraining_logs.log 2>&1 &
Here's an example command to fine-tune a pretrained model in a supervised fashion:
nohup python main.py \
    --pretrained_model_path path_to_your_pretrained_model \
    --GPU '1' \
    --data_path Path_to_your_data \
    --save_path Path_to_save_the_finetuned_model \
    > Supervised_logs.log 2>&1 &
You can follow the fine-tuning progress in the terminal using the following command:
tail -f Supervised_logs.log
For supervised classification, we run the following seeds [42, 123, 4567, 8910, 13579, 24680, 98765, 54321, 11111, 99999] and report the mean and standard deviation of the F1 scores across seeds.
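The aggregation over seeds can be sketched as follows. This is a minimal illustration, not the repository's code: `aggregate_f1` and `summarize` are hypothetical helpers, `evaluate` stands for any callable that trains and scores a model for one seed, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
import statistics

# Seeds listed in this README.
SEEDS = [42, 123, 4567, 8910, 13579, 24680, 98765, 54321, 11111, 99999]


def aggregate_f1(f1_scores):
    """Return (mean, standard deviation) of a list of F1 scores."""
    mean = statistics.mean(f1_scores)
    # Sample standard deviation; assumed here, the README does not specify it.
    std = statistics.stdev(f1_scores)
    return mean, std


def summarize(evaluate, seeds=SEEDS):
    """Call a hypothetical evaluate(seed) -> F1 once per seed and aggregate."""
    return aggregate_f1([evaluate(seed) for seed in seeds])
```

Reporting the spread alongside the mean makes it easier to tell whether a difference between two models is larger than the seed-to-seed variation.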
To test a tuned model, run the following command:
nohup python utils/Inference_test.py \
    --model_path path_to_your_pretrained_model \
    --data_path Path_to_your_data \
    > Inference_logs.log 2>&1 &
You can follow the inference progress in the terminal using the following command:
tail -f Inference_logs.log
We compute the Pointwise Mutual Information (PMI) score for each token based on its co-occurrence with a specific class label in a professionally annotated corpus of approximately 470K Tweets. The dataset contains annotations for categories related to Socially Unacceptable Discourse, such as hateful, offensive, and toxic content.
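The PMI between two events x and y (here, a token and a class label) is defined as:
$$\mathrm{PMI}(x, y) = \log \frac{P(x, y)}{P(x)\,P(y)}$$
Where:
- P(x, y) is the probability of both events x and y occurring together
- P(x) and P(y) are the marginal probabilities of the individual events x and y.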
A higher PMI score indicates a stronger association between the token and the specific class. To obtain a final importance score for each token, we compute its PMI score for all class labels and take the average across them. This approach ensures that tokens frequently associated with socially unacceptable discourse receive higher importance scores, guiding our token selection process during masked language model pre-training.
For a detailed mathematical breakdown of PMI and its role in importance assessment, refer to this link.
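The scoring described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the repository's implementation: `token_importance` is a hypothetical name, probabilities are estimated from document-level counts, and (token, label) pairs that never co-occur are skipped rather than assigned a PMI of minus infinity.

```python
import math
from collections import Counter


def token_importance(docs, labels):
    """Average PMI(token, label) over class labels, for each token.

    docs: list of token lists; labels: one class label per document.
    """
    n_docs = len(docs)
    label_counts = Counter(labels)
    token_counts = Counter()  # documents containing the token
    joint_counts = Counter()  # documents containing the token, per label
    for tokens, label in zip(docs, labels):
        for tok in set(tokens):
            token_counts[tok] += 1
            joint_counts[(tok, label)] += 1

    scores = {}
    for tok, tok_n in token_counts.items():
        pmis = []
        for label, label_n in label_counts.items():
            joint_n = joint_counts[(tok, label)]
            if joint_n == 0:
                continue  # assumption: skip pairs that never co-occur
            p_xy = joint_n / n_docs
            p_x = tok_n / n_docs
            p_y = label_n / n_docs
            pmis.append(math.log(p_xy / (p_x * p_y)))
        scores[tok] = sum(pmis) / len(pmis)
    return scores
```

With this estimate, a token that appears equally often under every label gets a score near zero, while a token concentrated in one class gets a positive score.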
Masked language modelling randomly masks a certain percentage of the tokens in a tokenized sentence (usually 15%) and trains a model to predict the original tokens from the surrounding context. The masked tokens are replaced with a [MASK] token for BERT (<mask> for RoBERTa), and the original tokens are stored as targets for prediction.
We employ a static token masking strategy: the masked tokens are selected once during data preprocessing and remain the same across all training epochs to ensure consistency.
In the dataset, the label tensor holds the ground-truth token IDs at the positions that were masked in the inputs. All other positions are set to -100, which nn.CrossEntropyLoss() ignores by default (its ignore_index defaults to -100).
During preprocessing, labels are initialized to -100 for all tokens, indicating they should be ignored during loss calculation. For positions where tokens were masked, their corresponding token IDs are assigned as labels. The dataset is split into training and test sets, and the masked text, along with the labels, is prepared for model training.
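The label construction described above can be sketched as follows. This is a minimal illustration and makes several assumptions: `mask_once` is a hypothetical helper, token IDs are plain Python integers, and uniform random selection stands in for the PMI- or BERTopic-guided selection used in this work. A fixed seed keeps the masked positions identical on every call, which is what makes the masking static.

```python
import random

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_once(input_ids, mask_token_id, mask_prob=0.15, seed=0):
    """Mask tokens a single time and build the matching label list."""
    rng = random.Random(seed)  # fixed seed: same positions on every call
    masked = list(input_ids)
    labels = [IGNORE_INDEX] * len(input_ids)  # ignored unless masked
    for i, tok in enumerate(input_ids):
        if rng.random() < mask_prob:
            labels[i] = tok  # target is the original token ID
            masked[i] = mask_token_id  # input sees the mask token instead
    return masked, labels
```

Running this once during preprocessing and storing the result gives the same inputs and labels at every epoch.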
During training, the model is optimized by minimizing the loss between its predictions and the original tokens. The importance scores of the masked tokens are incorporated into the loss function to emphasize learning from socially unacceptable discourse tokens. Specifically, tokens with higher scores are weighted more heavily in the loss calculation, encouraging the model to focus on learning the contextual relationships involving these tokens.
For a detailed mathematical breakdown of weighted loss optimisation, refer to this link.
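One way to realise such a weighted loss in PyTorch is sketched below. It is an assumption-laden sketch rather than the repository's implementation: `weighted_mlm_loss` is a hypothetical name, `token_weights` is assumed to hold one non-negative importance score per vocabulary ID, and normalising by the sum of the weights is one possible choice among several.

```python
import torch
import torch.nn.functional as F


def weighted_mlm_loss(logits, labels, token_weights):
    """Cross-entropy over masked positions, weighted by token importance.

    logits: (batch, seq_len, vocab_size)
    labels: (batch, seq_len), with -100 at unmasked positions
    token_weights: (vocab_size,), one importance score per vocabulary ID
    """
    vocab_size = logits.size(-1)
    flat_logits = logits.reshape(-1, vocab_size)
    flat_labels = labels.reshape(-1)
    # Per-position loss; positions labelled -100 contribute 0.
    per_token = F.cross_entropy(
        flat_logits, flat_labels, ignore_index=-100, reduction="none"
    )
    is_masked = flat_labels != -100
    # Look up the weight of each target token; clamp avoids indexing with -100.
    weights = token_weights[flat_labels.clamp(min=0)] * is_masked
    # Weighted mean over masked positions (normalisation is an assumption).
    return (per_token * weights).sum() / weights.sum().clamp(min=1e-8)
```

With all weights equal to 1, this reduces to the standard MLM loss averaged over the masked positions.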
To assess the impact of our pretraining approach, we fine-tune the trained models on downstream tasks. We evaluate their performance on supervised classification using SUD detection benchmark datasets.
We fine-tune the models on a collection of datasets focused on detecting hateful, offensive, toxic, and other forms of socially unacceptable discourse. Each dataset contains professionally annotated samples, ensuring robust and reliable evaluation.
- Task: Given a text input, classify it into predefined categories such as hateful, offensive, or neutral.
- Objective: Measure whether pretraining with importance-weighted masking improves classification accuracy compared to baseline models trained with standard MLM.
- Metrics: We report macro-F1, accuracy, and precision-recall curves to capture overall performance and class-specific behavior.
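For reference, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below computes it in plain Python; `macro_f1` is a hypothetical helper written for illustration, and the evaluation code in this repository may rely on a library implementation instead.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)
```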
This work was conducted as part of the European ARENAS project, funded by Horizon Europe. The project's objective is to characterize, measure, and understand the role of extremist narratives in discourses that have an impact not only on political and social spheres but importantly on the stakeholders themselves. Leading an innovative and ambitious research program, ARENAS will significantly contribute to filling the gap in contemporary research, make recommendations to policymakers, media, lawyers, social inclusion professionals, and educational institutions, and propose solutions for countering extreme narratives for developing more inclusive and respectful European societies.
We welcome contributions from the community to enhance this work. If you have ideas for features, improvements, or bug fixes, please submit a pull request or open an issue on GitHub.
Feel free to reach out about any questions/suggestions at rayane.ghilene@ensea.fr