Skip to content

ookgezellig/SaveToWaybackMachine

Repository files navigation

SaveToWaybackMachine

Scripts and data for archiving KB-managed websites to the Internet Archive's Wayback Machine.

Maintained by KB, national library of the Netherlands

Website

View the live site →

This repository has a companion GitHub Pages website with screenshot galleries, interactive navigation, and comprehensive documentation.

Purpose

Some websites managed by the KB have been discontinued. To preserve their content for Wikipedia sourcing and cultural heritage purposes, the KB actively archives websites to the Wayback Machine at web.archive.org.

Archived sites

Site Archive date URLs
Literatuurplein.nl Dec 2019 69,599
Leesplein.nl Jun 2018 23,784
LezenVoorDeLijst.nl Aug 2018 12,456
kb.nl Dec 2021 & Mar 2022 7,633
GidsVoorNederland.nl Nov 2018 1,415
Literatuurgeschiedenis.org Mar 2022 464
Literaireprijzen.nl Oct 2018 452

Total: ~115,000+ URLs archived

How this site was built

This project was transformed in December 2025 through an intensive AI-human collaboration:

  • 10+ hours of development across Dec 2-3, 2025
  • 33+ commits reorganizing and enhancing the repository
  • Built using Claude Opus 4.5 AI assistant via Claude Code CLI

Key achievements

  1. Repository reorganization - Clean hierarchical folder structure
  2. Screenshot galleries - 36 Wayback Machine screenshots captured via Python/Playwright
  3. GitHub Pages website - Responsive site with navigation, lightbox, and breadcrumbs
  4. AI vision recognition - Used multimodal AI to extract meaningful captions from screenshots
  5. EU compliance - GDPR, WCAG 2.1 Level AA, comprehensive accessibility features

Read the full story →

Scripts

wbm-archiver

Location: scripts/wbm-archiver/

Python script with three modes:

  1. Save pages to the Wayback Machine
  2. Retrieve the latest archived version
  3. Retrieve the oldest archived version

Requirements: Python 3.x, waybackpy

Alternative method

Archive pages without Python: archive.org/services/wayback-gsheets/

Compliance

The companion website meets European standards:

  • GDPR/AVG - No cookies, no tracking, no personal data
  • WCAG 2.1 Level AA - Full accessibility compliance
  • Responsive design - Desktop, tablet, mobile support
  • SEO optimized - Schema.org, Open Graph, Twitter Cards

View compliance documentation →

License

The source code and text content of this project are dedicated to the public domain under CC0 1.0.

Note: This license does not apply to:

  • Wayback Machine screenshots (third-party copyrights)
  • KB logo (CC BY-SA 3.0)
  • Social media brand icons (respective trademarks)

See Image credits & copyrights for details.

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •