To install the requirements:
pip install -r requirements.txt
The project is based on a reduced version of the Amazon Fine Food Reviews dataset (https://www.kaggle.com/snap/amazon-fine-food-reviews), which originally included about 500,000 food reviews spanning more than ten years (up to October 2012). The reduced version contains 35,172 reviews, each with the product ID, the user ID, the rating given by the user and the review text.
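As a rough illustration of the record layout described above, the sketch below reads reviews from a CSV file and keeps only those four fields. The column names (ProductId, UserId, Score, Text) are those of the original Kaggle CSV and are only assumed to carry over to the reduced file; the function name load_reviews and the sample rows are made up for this example and are not part of the project code.

```python
import csv
import io

# Assumed column names, taken from the original Kaggle CSV; the reduced file may differ.
FIELDS = ("ProductId", "UserId", "Score", "Text")


def load_reviews(handle):
    """Read reviews from an open CSV file and keep only the four fields of interest."""
    reviews = []
    for row in csv.DictReader(handle):
        review = {name: row[name] for name in FIELDS}
        review["Score"] = int(review["Score"])  # ratings are integers from 1 to 5
        reviews.append(review)
    return reviews


# Tiny made-up sample standing in for the real file.
SAMPLE = io.StringIO(
    "Id,ProductId,UserId,Score,Text\n"
    "1,P001,U001,5,Great taste\n"
    "2,P002,U002,2,Too salty\n"
)
reviews = load_reviews(SAMPLE)
```

With a real file, the same function could be called on a handle opened with open(path, newline="", encoding="utf-8"), where path points to wherever the reduced dataset is stored.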
- script.py includes all the functions used for dataset preprocessing and figure generation.
- main.py includes all the functions used to perform sentiment analysis and store the results in Elasticsearch.
- app.py contains the web-app code.
- preprocess.py, SO_Calc.py, SO_Run.py and the Resources directory are adapted from the SO-CAL Python library (https://github.com/sfu-discourse-lab/SO-CAL).
- J. J. McAuley and J. Leskovec, “From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews,” in Proceedings of the 22nd international conference on World Wide Web, 2013, pp. 897–908.
- B. Liu et al., “Sentiment analysis and subjectivity,” Handbook of natural language processing, vol. 2, no. 2010, pp. 627–666, 2010.
- M. Hu and B. Liu, “Mining and summarizing customer reviews,” in Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, 2004, pp. 168–177.
- M. Eirinaki, S. Pisal, and J. Singh, “Feature-based opinion mining and ranking,” Journal of Computer and System Sciences, vol. 78, no. 4, pp. 1175–1184, 2012.
- M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, “Lexicon-based methods for sentiment analysis,” Computational linguistics, vol. 37, no. 2, pp. 267–307, 2011.
- C. Hutto and E. Gilbert, “Vader: A parsimonious rule-based model for sentiment analysis of social media text,” in Proceedings of the International AAAI Conference on Web and Social Media, vol. 8, no. 1, 2014.
- R. Campos, V. Mangaravite, A. Pasquali, A. Jorge, C. Nunes, and A. Jatowt, “Yake! keyword extraction from single documents using multiple local features,” Information Sciences, vol. 509, pp. 257–289, 2020.