Bulk downloader for all declassified UAP (Unidentified Aerial Phenomena) records published at war.gov/UFO. Downloads all 161 records — PDFs, images, and DVIDS videos — with their original filenames as listed in the official catalog.
- Fetches the official CSV catalog from war.gov on every run (always up to date)
- Builds a manifest of all records before downloading — shows every file, its URL, and whether it already exists on disk
- Downloads PDFs and images using a real browser session to bypass CDN bot protection
- Downloads videos by navigating the war.gov UI and capturing each video through the official download button
- Skips files already on disk — safe to re-run to pick up new releases
Files are saved with their original asset filenames from the catalog, including any commas and spaces (e.g. `65_HS1-834228961_62-HQ-83894_Section_10.pdf` and `DOW-UAP-PR19, Unresolved UAP Report, Middle East, May 2022.mp4`).
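The catalog-to-manifest step described above could look roughly like the sketch below. It is illustrative only, not the script's actual code: the column names `filename` and `url`, the `ManifestEntry` class, and both function names are assumptions, and the real catalog may use different headers.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ManifestEntry:
    filename: str  # original asset filename from the catalog
    url: str       # where the asset is published
    exists: bool   # True if the file is already on disk


def build_manifest(catalog_csv: str, dest_dir: Path) -> list[ManifestEntry]:
    """Turn catalog CSV text into manifest entries (hypothetical column names)."""
    reader = csv.DictReader(io.StringIO(catalog_csv))
    entries = []
    for row in reader:
        # csv handles quoted fields, so filenames containing commas survive intact.
        filename = row["filename"].strip()
        url = row["url"].strip()
        entries.append(ManifestEntry(filename, url, (dest_dir / filename).is_file()))
    return entries


def manifest_to_json(entries: list[ManifestEntry]) -> str:
    """Serialize the manifest so it can be written to a file such as uap-manifest.json."""
    return json.dumps([asdict(e) for e in entries], indent=2)
```

Building the full manifest before any download starts means the complete list of files and their on-disk status can be reviewed up front.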
| Type | Count | Examples |
|---|---|---|
| PDF | 119 | FBI 62-HQ-83894 case file (Sections 1–10), CIA records, NARA archives, DOW mission reports, State Dept cables, NASA Apollo transcripts |
| IMG | 14 | FBI crime scene photos (A1–A8), NASA Apollo 12 & 17 images |
| VID | 28 | DOW unresolved UAP reports (Middle East, INDOPACOM, Syria, Greece, etc.) |
Install dependencies (one time):
cd /home/user/Projects/ufo
python3 -m venv .venv
source .venv/bin/activate
pip install playwright
playwright install chromium

Activate the environment:
source .venv/bin/activate

Run the downloader:
python3 download_ufo.py

The browser window opens visibly; this is required to pass Akamai bot protection on the CDN.
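For readers curious how a visible browser session can capture a file through a download button, here is a minimal sketch using Playwright's sync API. It is not taken from `download_ufo.py`: the function names, the `page_url` and `button_selector` parameters, and the page structure they imply are all assumptions.

```python
from pathlib import Path


def target_path(dest_dir: Path, filename: str) -> Path:
    """Keep only the final path component so an entry cannot write outside dest_dir."""
    return dest_dir / Path(filename).name


def download_via_button(page_url: str, button_selector: str, dest: Path) -> Path:
    """Open a headed Chromium window, click a download button, and save the file.

    page_url and button_selector are hypothetical; the real site layout may differ.
    """
    # Imported here so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # headless=False opens a visible window, as the README requires.
        browser = p.chromium.launch(headless=False)
        context = browser.new_context(accept_downloads=True)
        page = context.new_page()
        page.goto(page_url)
        # expect_download waits for the download triggered inside the block.
        with page.expect_download() as download_info:
            page.click(button_selector)
        download = download_info.value
        dest.parent.mkdir(parents=True, exist_ok=True)
        download.save_as(dest)
        browser.close()
    return dest
```

Saving to an explicit destination via `save_as` is what lets the file keep its catalog filename rather than whatever name the browser would choose.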
Just re-run the script. It always fetches a fresh catalog from war.gov and only downloads files not already on disk:
source .venv/bin/activate
python3 download_ufo.py

New records appear as [NEW] in the manifest output; already-downloaded files show as [EXISTS] and are skipped.
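The skip-on-re-run behaviour can be pictured with the short sketch below. It only assumes the `[NEW]` and `[EXISTS]` labels mentioned above; the function names and the `(filename, exists)` tuple shape are hypothetical and not taken from the script.

```python
def format_line(filename: str, exists: bool) -> str:
    """Render one manifest line with the status label in front."""
    label = "[EXISTS]" if exists else "[NEW]"
    return f"{label} {filename}"


def to_download(entries: list[tuple[str, bool]]) -> list[str]:
    """Return only the filenames that are not on disk yet."""
    return [filename for filename, exists in entries if not exists]


if __name__ == "__main__":
    # Made-up entries purely for illustration.
    entries = [("already_here.pdf", True), ("fresh_release.mp4", False)]
    for filename, exists in entries:
        print(format_line(filename, exists))
    print("Pending:", to_download(entries))
```

Because the decision depends only on what is on disk, re-running after a new release fetches just the additions.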
| File | Description |
|---|---|
| `download_ufo.py` | Main downloader script |
| `uap-records.csv` | Local copy of the catalog CSV (refreshed each run) |
| `uap-manifest.json` | Full manifest from the last run: all records with URLs and filenames |