- pip install -r requirements.txt
- run _create_directories.py to automatically create all required folders if they do not already exist
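
As a rough illustration of what a folder-creation step like this might do, here is a minimal sketch. It is not the contents of _create_directories.py: the folder names are the top-level ones listed in this README, the function name `create_directories` is hypothetical, and any subfolders the real script creates are not reproduced.

```python
from pathlib import Path

# Top-level folder names as listed in this README. The real script may
# also create subfolders; those are intentionally left out of this sketch.
REQUIRED_FOLDERS = [
    "datasets_raw",
    "datasets_preprocessed_LDA",
    "datasets_preprocessed_LLM",
    "results_LDA",
    "results_ChatGPT",
    "results_Gemini",
    "results_Claude",
    "results_Mistral",
    "LLM_comparison",
    "time_analysis",
    "CLM_analysis",
]


def create_directories(root="."):
    """Create each required folder under root and return the names created."""
    created = []
    for name in REQUIRED_FOLDERS:
        folder = Path(root) / name
        if not folder.exists():
            # exist_ok guards against the folder appearing between the
            # check above and this call.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(name)
    return created


if __name__ == "__main__":
    for name in create_directories():
        print(f"created {name}")
```

Running it a second time creates nothing, since existing folders are skipped.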
Observations:
- googletrans package uses an older httpx package (0.19.0)
- all the necessary AI packages (e.g., mistralai, openai) require a new version of the httpx package (0.28.1)
To work around this conflict, a manual version switch of the httpx package was performed, depending on which functionality was needed.
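
Before running a script, it can help to confirm which httpx version is currently installed. The sketch below is one possible way to do that and is not part of the repository: the feature keys `googletrans` and `llm`, the function names, and the exact-match comparison are all assumptions. Only the two version numbers come from the observations above.

```python
from importlib import metadata

# Version numbers quoted in the observations above. The feature keys are
# hypothetical labels chosen for this sketch.
REQUIRED_HTTPX = {
    "googletrans": "0.19.0",
    "llm": "0.28.1",
}


def installed_httpx_version():
    """Return the installed httpx version string, or None if it is missing."""
    try:
        return metadata.version("httpx")
    except metadata.PackageNotFoundError:
        return None


def check_httpx(feature, installed=None):
    """Return (ok, message) for a feature key from REQUIRED_HTTPX.

    Assumes an exact version match is needed; the real constraint of each
    package may be looser.
    """
    required = REQUIRED_HTTPX[feature]
    if installed is None:
        installed = installed_httpx_version()
    if installed == required:
        return True, f"httpx {installed} matches what {feature} expects"
    return False, (
        f"{feature} expects httpx=={required}, found {installed}; "
        f"switch with: pip install httpx=={required}"
    )


if __name__ == "__main__":
    for feature in REQUIRED_HTTPX:
        ok, message = check_httpx(feature)
        print("OK  " if ok else "WARN", message)
```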
- review scraping (datasets_creation_google.py)
- review preprocessing for both LLM (datasets_preprocessing_LLMs.py) and LDA (datasets_preprocessing_LDA.py)
- LDA implementation ~ topic modelling + classification + coherence score (topic_modeling_LDA.py)
- LDA classification interpretation (interpretation_classification_LDA.ipynb)
- ChatGPT-4o-mini issue extraction (issue_modelling_ChatGPT.py)
- ChatGPT-4o-mini evaluation ~ coherence score + cosine similarity with LDA (evaluation_ChatGPT.ipynb)
- ChatGPT-4o-mini classification (classification_ChatGPT.py)
- ChatGPT-4o-mini classification interpretation (interpretation_classification_ChatGPT.ipynb)
supports two models (1.5_pro, 2.0_flash)
- Gemini issue extraction (issue_modelling_Gemini.py)
- Gemini evaluation ~ coherence score + cosine similarity with LDA (evaluation_Gemini.ipynb)
- Gemini classification (classification_Gemini.py)
- Gemini classification interpretation (interpretation_classification_Gemini.ipynb)
- Claude-3.5-Sonnet issue extraction (issue_modelling_Claude.py)
- Claude-3.5-Sonnet evaluation ~ coherence score + cosine similarity with LDA (evaluation_Claude.ipynb)
supports two models (large_2411, small_2501)
However, due to multiple issues encountered with small_2501, this model was used only for issue extraction and evaluation.
- Mistral issue extraction (issue_modelling_Mistral.py)
- Mistral evaluation ~ coherence score + cosine similarity with LDA (evaluation_Mistral.ipynb)
- Mistral-large-2411 classification (classification_Mistral.py)
- Mistral-large-2411 classification interpretation (interpretation_classification_Mistral.ipynb)
- issue comparison ~ hierarchical graph ~ clustered graph (LLMs_comparison_issue_modelling.py)
- per-review/classification comparison ~ Jensen-Shannon Divergence (on both LLM-Specific Space and Union Space) ~ review agreement ~ Cohen's Kappa ~ Krippendorff's Alpha (LLMs_comparison_classification.py)
- issue impact on star ratings and issue frequency over time (time_evolution.py)
- issue + frequency forecasting with Gemini 2.0 flash (forecasting_Gemini.py)
- forecasting evaluation (forecasting_evaluation.py)
- effects of issues on the star ratings ~ moderating effects of years on the issue-rating relationship (CLM_evaluation.py)
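
Two of the metrics named for the per-review comparison, Jensen-Shannon divergence and Cohen's Kappa, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code in LLMs_comparison_classification.py: it assumes both models' distributions are already aligned over the same ordered set of issues, it uses log base 2, and the function names are hypothetical. How the repository builds the LLM-Specific and Union spaces is not shown here.

```python
import math
from collections import Counter


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions.

    p and q are non-negative weights over the same ordered set of issues;
    they are normalized to sum to 1 before the comparison.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same issues")
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution that both inputs are compared against.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equally long lists of per-review labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters always use the same single label
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print(js_divergence([0.7, 0.2, 0.1], [0.1, 0.2, 0.7]))
    print(cohen_kappa(["a", "b", "a", "b"], ["a", "b", "b", "b"]))
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (no overlap), which makes heatmaps across model pairs directly comparable.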
Governmental Applications: KopieID, Reisapp, MijnOverheid, DigiD
| Application | Preprocessed Data | LDA modelling | LLM issue extraction | LLM classification | LLM comparison | Time Analysis | CLM Analysis |
|---|---|---|---|---|---|---|---|
| KopieID | LDA:✔️ , LLM:✔️ | ✔️ | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large, Claude | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large | ✔️ | star-issue timeline, frequency timeline, Gemini Forecasting | ✔️ |
| Reisapp | LDA:✔️ , LLM:✔️ | ✔️ | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large, Claude | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large | ✔️ | star-issue timeline, frequency timeline, Gemini Forecasting | ✔️ |
| MijnOverheid | LDA:✔️ , LLM:✔️ | ✔️ | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large, Claude | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large | ✔️ | star-issue timeline, frequency timeline, Gemini Forecasting | ✔️ |
| DigiD | LDA:✔️ , LLM:✔️ | ✔️ | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large, Claude | GPT 4o mini, Gemini 1.5, Gemini 2.0, Mistral large | ✔️ | star-issue timeline, frequency timeline, Gemini Forecasting | ✔️ |
- datasets_raw: contains the unprocessed reviews extracted by google play scraper.
- datasets_preprocessed_LDA: contains the preprocessed reviews for the LDA algorithm.
- datasets_preprocessed_LLM: contains the preprocessed reviews for the LLMs.
- results_LDA: contains five folders, one for the per-review distributions, one for the LDA HTML visualizations, one for the coherence heatmaps, one for the topic + word extractions, and one for the additional plots obtained in the interpretation notebook.
- results_ChatGPT: contains four folders, one for the extracted issues, one for per-review distributions, one for the coherence heatmap, and one for the additional plots obtained in the evaluation and interpretation notebooks.
- results_Gemini: contains four folders, one for the extracted issues, one for per-review distributions, one for the coherence heatmap, and one for the additional plots obtained in the evaluation and interpretation notebooks.
- results_Claude: contains three folders, one for the extracted issues, one for the coherence heatmap and one for the additional plots obtained in the evaluation notebook.
- results_Mistral: contains four folders, one for the extracted issues, one for per-review distributions, one for the coherence heatmap, and one for the additional plots obtained in the evaluation and interpretation notebooks.
- LLM_comparison: contains two folders, one for the extracted issues in which the cluster and hierarchical graphs are saved for each app, and one for the per-review topic distributions in which the heatmaps for both types of JS divergence and the bar plot for the review agreement are presented (+ plot with 3 agreement metrics).
- time_analysis: contains four folders, one with the plots describing the impact of the issues on the star ratings over time, one with the plots showing the frequency of each issue over time, one for the forecasted issues (and suggestions), and one for the plots used in forecasting evaluation.
- CLM_analysis: contains two folders, one with the bar plots reflecting the overall issue effects on the star rating, and one with the moderating effects of years on the issue-rating relationship.