Nuclear War Risk Perception: Italian News Media Scraper

A Selenium-based web scraper that collects Italian news headlines: it searches DuckDuckGo for articles from selected news outlets, then visits each result to extract the headline. This tool was originally developed for the study "Apocalypse now or later? Nuclear war risk perceptions mirroring media coverage and emotional tone shifts in Italian news" (Judgment and Decision Making, 2024).

Installation

Clone the repository:

```
git clone <repository_url>
cd DuckDuckSelenium
```

Install dependencies: Ensure you have Python 3.8+ and Google Chrome installed.

```
pip install -r requirements.txt
```

⚙️ Usage

Configure Input Files: The scraper relies on three text files in the root directory:
- Media.txt: List of news websites to search (e.g., repubblica.it).
- Keywords.txt: Search terms (e.g., Ucraina AND guerra).
- Date.txt: Date range for the search (format: YYYY-MM-DD to YYYY-MM-DD).
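
For illustration, the three input files could be read as in the sketch below. It assumes one entry per line in each file and a single `YYYY-MM-DD to YYYY-MM-DD` line in Date.txt; the helper names `read_entries` and `parse_date_range` are hypothetical and not taken from the tool's source.

```python
from datetime import date
from pathlib import Path


def read_entries(path):
    """Return the non-empty, stripped lines of a text file (assumed one entry per line)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_date_range(line):
    """Parse 'YYYY-MM-DD to YYYY-MM-DD' into a (start, end) pair of dates."""
    start_text, end_text = (part.strip() for part in line.split(" to "))
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    if start > end:
        raise ValueError(f"start date {start} is after end date {end}")
    return start, end
```

With these helpers, `read_entries("Media.txt")` would return the list of outlets and `parse_date_range(read_entries("Date.txt")[0])` the search window. A malformed date line raises `ValueError` rather than being silently accepted.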

Run the Scraper:
```
python main.py
```
The script performs two main phases:
1. Search Scraping: Queries DuckDuckGo for each media outlet and keyword, saving results to output/search_results.csv.
2. Article Scraping: Visits the collected URLs to extract the full headline (H1), saving to output/articles_scraped.csv.
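
The two phases can be sketched roughly as follows. This is a simplified illustration, not the tool's actual code: it assumes that each query combines a `site:` operator with the keyword, and it uses the standard-library HTML parser instead of Selenium so that the headline extraction can be tried on a plain HTML string. The names `build_search_urls`, `FirstH1Parser`, and `extract_headline` are hypothetical.

```python
from html.parser import HTMLParser
from itertools import product
from urllib.parse import urlencode


def build_search_urls(media, keywords):
    """Yield (outlet, keyword, url) for every outlet/keyword combination."""
    for outlet, keyword in product(media, keywords):
        # Assumption: restrict results to one outlet with the site: operator.
        query = f"site:{outlet} {keyword}"
        yield outlet, keyword, "https://duckduckgo.com/?" + urlencode({"q": query})


class FirstH1Parser(HTMLParser):
    """Collect the text of the first <h1> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self._inside = False  # True while inside the first <h1>
        self._done = False    # True once the first <h1> has been closed
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._inside:
            self._inside = False
            self._done = True

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    @property
    def headline(self):
        # Collapse runs of whitespace left over from nested markup.
        return " ".join("".join(self._chunks).split())


def extract_headline(html):
    """Return the text of the first <h1> in html, or '' if there is none."""
    parser = FirstH1Parser()
    parser.feed(html)
    parser.close()
    return parser.headline
```

In a Selenium-based run, the HTML passed to `extract_headline` would typically come from the driver's `page_source` after the article page has loaded. Date filtering is left out of the sketch because the way the tool applies the Date.txt range is not described here.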

Output

- output/search_results.csv: Contains raw search results including URL, date, and snippet.
- output/articles_scraped.csv: Contains the final dataset with the extracted article titles.

Note: The tool supports incremental saving and can resume if interrupted.
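
One common way to get this kind of incremental saving and resume behavior is to append each result to the CSV as soon as it is scraped and, on restart, skip URLs that are already in the file. The sketch below shows that pattern under stated assumptions: the column names in `FIELDNAMES` and the functions `load_done_urls`, `append_row`, and `scrape_all` are hypothetical, and the tool's own implementation may differ.

```python
import csv
from pathlib import Path

FIELDNAMES = ["url", "title"]  # hypothetical columns, not taken from the tool


def load_done_urls(csv_path):
    """Return the set of URLs already present in the output file."""
    path = Path(csv_path)
    if not path.exists():
        return set()
    with path.open(newline="", encoding="utf-8") as handle:
        return {row["url"] for row in csv.DictReader(handle)}


def append_row(csv_path, row):
    """Append one row, writing the header first if the file is new or empty."""
    path = Path(csv_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def scrape_all(urls, csv_path, fetch_title):
    """Process only URLs not yet saved, persisting each result immediately."""
    done = load_done_urls(csv_path)
    for url in urls:
        if url in done:
            continue  # already scraped in an earlier run
        append_row(csv_path, {"url": url, "title": fetch_title(url)})
        done.add(url)
```

Here `fetch_title` stands in for whatever function loads a page and returns its headline. Because every row is written before the next URL is visited, an interruption loses at most the article in progress.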

📄 Citation

If you use this tool in your research, please cite:

Lauriola, M., Di Cicco, G., & Savadori, L. (2024). Apocalypse now or later? Nuclear war risk perceptions mirroring media coverage and emotional tone shifts in Italian news. Judgment and Decision Making, 19(e7), 1–25. doi:10.1017/jdm.2024.2

Complete study materials are available at: https://osf.io/pduwq/overview

⚠️ Disclaimer

This tool is for educational and research purposes. Please ensure compliance with the Terms of Service of the websites you scrape.

📝 License

MIT License
