🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Updated May 19, 2025 - Python
Wayback Machine API interface & a command-line tool
Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC
Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.
Official ArchiveBox MITM proxy: saves URLs of all requests passing through to an ArchiveBox server for archival.
Home of the official apt/deb package for Ubuntu/Debian-based systems.
Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻
Submit URLs listed inside a file to website archival services
Repository for collecting scripts to help capture MyConvento newsroom press-releases from the MyConvento PR management suite. The README provides an analysis of the MyConvento URL architecture for users hoping to develop a solution for themselves.
FeedVault is an open-source web application that allows users to archive and search their favorite web feeds.
Download RSS feeds and archive them to the Wayback Machine. Saves a list of archived feeds in a local database.