🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Updated May 19, 2025 - Python
Wayback Machine API interface & a command-line tool
Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.
Home of the official apt/deb package for Ubuntu/Debian-based systems.
Scrape posts and threads from forums, news aggregators, and mail archives, and export them to JSONL, mailbox, or WARC.
Official ArchiveBox MITM proxy: saves the URLs of all requests that pass through it to an ArchiveBox server for archival.
Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻
Download and archive RSS feeds to the Wayback Machine. Saves a list of archived feeds in a local DB.
Repository for collecting scripts to help capture MyConvento newsroom press-releases from the MyConvento PR management suite. The README provides an analysis of the MyConvento URL architecture for users hoping to develop a solution for themselves.
Submit URLs listed inside a file to website archival services
FeedVault is an open-source web application that allows users to archive and search their favorite web feeds.