internet-archiving

Here are 11 public repositories matching this topic...

ArchiveBox / ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Updated Nov 15, 2025
Python

akamhy / waybackpy

Wayback Machine API interface & a command-line tool

osint internet-archive web-archiving wayback-machine webarchiving cdx-api internet-archiving savepagenow archive-webpage archive-webpages wayback-machine-api wayback-machine-python

Updated Feb 26, 2024
Python

Own-Data-Privateer / hoardy-web

Passively capture, archive, and hoard your web browsing history, including the contents of the pages you visit, for later offline viewing, replay, mirroring, data scraping, and/or indexing. Your own personal private Wayback Machine that can also archive HTTP POST requests and responses, as well as most other HTTP-level data.

cli backups internet archiving snapshot self-hosted archive browser-extension archiver web-archiving wayback-machine web-browsing web-archive website-archive auto-save offline-reading internet-archiving

Updated Oct 18, 2025
Python

ArchiveBox / debian-archivebox

Home of the official apt/deb package for Ubuntu/Debian-based systems.

package debian apt ubuntu web-archiving aptitude digipres internet-archiving archivebox stdeb

Updated Oct 5, 2024
Python

mikwielgus / forum-dl

Scrape posts and threads from forums, news aggregators, and mail archives; export to JSONL, mailbox, or WARC.

python scraper forum discourse phpbb warc data-fetching simplemachines internet-archiving

Updated Jun 27, 2024
Python

ArchiveBox / archivebox-proxy

Official ArchiveBox MITM proxy: saves URLs of all requests passing through to an ArchiveBox server for archival.

proxy https-proxy web-archiving web-proxy digital-preservation mitmproxy digipres internet-archiving archivebox

Updated Jul 12, 2024
Python

itsliamdowd / WaybackBrowserWindows

Pick a date and explore websites from the early days of the internet to now all in an easy-to-use browser format! 💻

Updated Jun 14, 2022
Python

Fooftilly / RSS_archiver

Download and archive RSS feeds to the Wayback Machine. Saves a list of archived feeds in a local db.

rss archive internet-archive rss-feed archiver wayback-machine webarchive link-archiver internet-archiving rss-archive link-archive

Updated Oct 19, 2023
Python

httpreserve / conventoarchiver

Repository for collecting scripts to help capture MyConvento newsroom press-releases from the MyConvento PR management suite. The README provides an analysis of the MyConvento URL architecture for users hoping to develop a solution for themselves.

internet-archive web-archiving digipres webarchives internet-archiving press-releases myconvento pr-newsroom my-convento