Undergraduate thesis project by Risma Faoziya on Multimodal Sentiment Analysis of Indonesian-Language Restaurant Reviews Using BLIP-LAVIS Image Captioning and Bidirectional Encoder Representations from Transformers (BERT)

faoziyarisma/MSA_Ulasan_Restoran

Multimodal Sentiment Analysis on Indonesian Restaurant Reviews

For my final-year machine learning project, I conducted Multimodal Sentiment Analysis on Indonesian restaurant reviews using a combination of image captioning and BERT (Bidirectional Encoder Representations from Transformers). The dataset comprises 2,700 reviews scraped from Google Maps, evenly distributed across three classes with 900 reviews each. The analysis classifies two modalities, image and text, into three polarities: positive, neutral, and negative.

The process began by converting the image modality to text with an image captioning approach, using the LAVIS API in Google Colab and the pre-trained BLIP model for both the image and text fields. The goal of this step was to generate auxiliary sentences (captions) that give the text modality additional information and so improve prediction accuracy. Text preprocessing consisted of data formatting, cleaning, case folding, slang conversion, data extraction, conversion of emojis to expressions, and stemming. Once the captions were obtained, they were concatenated with the text modality and the combined data was processed with the BERT language model.

The experiment covered three pre-trained BERT variants: BERT trained on English, IndoBERT trained on Indonesian, and Multilingual BERT covering multiple languages. These were chosen because real reviews mix English and Indonesian, so one experiment was run per pre-trained model. The task was multiclass classification, with cross-entropy loss as the loss function, AdamW as the optimizer, and accuracy as the metric. The experiment focused on finding optimal hyperparameters through Bayesian optimization and on identifying the best pre-trained model.
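
The preprocessing and concatenation steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `SLANG` and `EMOJI` tables, the function names, and the sample review and caption are all hypothetical, and joining the two texts with a `[SEP]` token is an assumption about how the concatenation might be done. Stemming is left out because it needs an Indonesian stemmer.

```python
import re

# Hypothetical lookup tables for illustration only; the project's actual
# slang dictionary and emoji mapping are not given in this README.
SLANG = {"gak": "tidak", "bgt": "banget", "enk": "enak", "yg": "yang"}
EMOJI = {"😍": "suka sekali", "😡": "marah", "👍": "bagus"}


def preprocess(review: str) -> str:
    """Apply emoji conversion, case folding, cleaning, and slang conversion."""
    # Emoji conversion: replace each known emoji with a short expression.
    for emoji, expression in EMOJI.items():
        review = review.replace(emoji, f" {expression} ")
    # Case folding.
    review = review.lower()
    # Cleaning: drop URLs, then anything that is not a letter, digit, or space.
    review = re.sub(r"https?://\S+", " ", review)
    review = re.sub(r"[^a-z0-9\s]", " ", review)
    # Slang conversion, token by token.
    tokens = [SLANG.get(token, token) for token in review.split()]
    return " ".join(tokens)


def build_input(review: str, caption: str) -> str:
    """Join the cleaned review and the image caption into one text."""
    # Assumption: the two parts are separated by BERT's [SEP] token. A BERT
    # tokenizer given a sentence pair would normally insert it itself.
    return f"{preprocess(review)} [SEP] {caption.lower().strip()}"


# Made-up review and caption, not taken from the dataset.
example = build_input(
    "Makanannya enk bgt 😍, tapi pelayanannya gak ramah!",
    "A plate of fried rice on a wooden table",
)
print(example)
```

With these sample tables the printed text is `makanannya enak banget suka sekali tapi pelayanannya tidak ramah [SEP] a plate of fried rice on a wooden table`, which is the kind of combined input that would then be passed to a BERT tokenizer.
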
The best configuration used the IndoBERT pre-trained model with the optimal hyperparameters (learning rate: 7.21 × 10^-6, dropout: 0.1337) and achieved 73% accuracy on the test data.
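
To make the loss function and metric concrete, here is a small pure-Python sketch of cross-entropy loss and accuracy for the three polarity classes. The logits and labels below are made up for illustration and are not results from the project. In practice a deep learning framework would compute these; the helper names here are hypothetical.

```python
import math

LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[target])


def accuracy(batch_logits, targets):
    """Fraction of examples whose highest-scoring class matches the label."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in batch_logits]
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Made-up logits for four reviews; each row is [negative, neutral, positive].
batch_logits = [
    [2.0, 0.5, -1.0],
    [0.1, 1.2, 0.3],
    [-0.5, 0.2, 1.5],
    [0.0, 0.0, 0.0],
]
targets = [0, 1, 2, 2]

losses = [cross_entropy(row, t) for row, t in zip(batch_logits, targets)]
mean_loss = sum(losses) / len(losses)
print(f"mean loss: {mean_loss:.4f}")
print(f"accuracy: {accuracy(batch_logits, targets):.2f}")
```

The last example has identical scores for all three classes, so its loss is ln 3 (about 1.0986), the value an untrained three-class classifier would typically start near. Three of the four predictions match their labels, giving an accuracy of 0.75.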
