This repository contains the code and data for the paper:
"Impact of EU Non-Financial Reporting Regulation on Spanish Companies’ Environmental Disclosure: A Cutting-Edge Natural Language Processing Approach"
Published in Environmental Sciences Europe
Read the Paper: https://doi.org/10.1186/s12302-025-01067-z
This study investigates the impact of EU regulations on environmental disclosure quality among Spanish companies from 2015 to 2022. We use Natural Language Processing (NLP), specifically ClimateBERT and a version of it fine-tuned on ClimaText, to analyze 729 corporate sustainability reports.
The study reveals how mandatory disclosure requirements improve specificity and commitment in climate risk reporting, aligning companies with international sustainability frameworks.
climateBERT-analysis/
├── Climate_Disclosure_Analysis.ipynb # Main Jupyter Notebook
├── data/ # Folder for input reports (not included)
├── results/ # Folder for generated outputs (e.g., graphs, tables)
├── requirements.txt # Python dependencies
├── LICENSE # Repository license
├── README.md # This file
└── Paper.pdf # Optional: Paper PDF
- data/: (Not included) Place corporate reports here before running the notebook.
- results/: Stores generated figures and tables from the analysis.
- Climate_Disclosure_Analysis.ipynb: The core analysis notebook.
- requirements.txt: Required dependencies for running the notebook.
git clone https://github.com/villacampaporta/climateBERT-analysis.git
cd climateBERT-analysis
Ensure you have Python 3.8+ installed. Then, install the required packages:
pip install -r requirements.txt
- Download corporate sustainability reports from:
- Place PDF reports inside the data/ folder.
Start Jupyter and open Climate_Disclosure_Analysis.ipynb:
jupyter notebook
Run the notebook step by step to reproduce the analysis.
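Before launching the notebook, it can help to confirm that the data/ folder actually contains PDF files. The following is a minimal sketch and not part of the repository: the helper name `list_reports` is hypothetical, and it assumes the reports sit directly inside the folder rather than in subfolders.

```python
from pathlib import Path


def list_reports(data_dir="data"):
    """Return the PDF files found directly inside data_dir, sorted by path.

    Hypothetical helper: the notebook may discover its inputs differently.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Expected a folder of reports at {folder}")
    # Match the extension case-insensitively so "report.PDF" is also found.
    return sorted(p for p in folder.iterdir() if p.suffix.lower() == ".pdf")
```

Calling `list_reports()` from the repository root returns a list of paths; an empty list suggests the reports have not been copied into data/ yet.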
- Transformers-based models:
- ClimateBERT
- Fine-tuned ClimateBERT using ClimaText
- Text extraction & preprocessing:
- PyPDF2, pytesseract, wand, Pillow for PDF processing
- nltk, transformers, scikit-learn for NLP
- Comparing voluntary vs. mandatory disclosure (2015–2022)
- Measuring commitment, specificity, and neutrality in reports
- Assessing alignment with international sustainability standards
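One way to picture the voluntary vs. mandatory comparison is to label each paragraph and then compute label shares per period. The sketch below is illustrative only and is not the notebook's actual code: the function names, the `cutoff_year` parameter, and its default of 2018 are assumptions, and the classifier is passed in as a plain callable so the aggregation can be tried without downloading a model.

```python
from collections import Counter, defaultdict


def label_shares_by_period(records, classify, cutoff_year=2018):
    """Aggregate paragraph labels into per-period shares.

    records: iterable of (year, paragraph_text) pairs.
    classify: callable mapping a list of texts to a list of labels.
    cutoff_year: assumed first year of the mandatory period (placeholder).
    """
    records = list(records)
    labels = classify([text for _, text in records])
    counts = defaultdict(Counter)
    for (year, _), label in zip(records, labels):
        period = "mandatory" if year >= cutoff_year else "voluntary"
        counts[period][label] += 1
    # Convert raw counts into fractions of paragraphs within each period.
    return {
        period: {label: n / sum(c.values()) for label, n in c.items()}
        for period, c in counts.items()
    }


def load_hf_classifier(model_name):
    """Wrap a Hugging Face text-classification pipeline as a callable.

    model_name is whichever checkpoint the analysis uses; none is assumed here.
    """
    from transformers import pipeline  # imported lazily; needs transformers

    clf = pipeline("text-classification", model=model_name)

    def classify(texts):
        return [out["label"] for out in clf(list(texts), truncation=True)]

    return classify
```

With a stub classifier such as `lambda texts: ["yes" if "will" in t else "no" for t in texts]`, the function can be exercised on a handful of hand-written paragraphs before switching to a real model through `load_hf_classifier`.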
How has environmental disclosure changed under regulation?
| Year | Average Commitment Score | Reports Analyzed |
|---|---|---|
| 2015 | 2.1 | 75 |
| 2018 | 3.4 | 110 |
| 2022 | 4.8 | 130 |
Model Performance: ClimateBERT vs. Fine-Tuned Model
| Model | Accuracy | Commitment Sensitivity |
|---|---|---|
| ClimateBERT | 85.2% | Medium |
| Fine-Tuned Model | 91.5% | High |
To replicate this study:
- Use Python 3.8+ and install requirements.txt
- Place reports in data/
- Run Climate_Disclosure_Analysis.ipynb in Jupyter
- Analyze the outputs in results/
If you use this code or dataset, please cite:
@article{VillacampaPorta2025,
author = {Javier Villacampa-Porta and María Coronado-Vaca and Eduardo C. Garrido-Merchán},
title = {Impact of EU Non-Financial Reporting Regulation on Spanish Companies' Environmental Disclosure: A Cutting-Edge Natural Language Processing Approach},
journal = {Environmental Sciences Europe},
year = {2025},
doi = {10.1186/s12302-025-01067-z}
}
This repository is licensed under the MIT License. See LICENSE for details.
Feel free to contribute! Open an issue or pull request if you have suggestions for improvements.
For questions or collaboration inquiries, contact:
[Email]
[LinkedIn]