Skip to content

dmitrys99/waybacker

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

54 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

A tool for updating stale links on web pages.

Theory:

Web links often get stale, particularly on old blog posts.  This tool will take a web page, find dead links, and find substitutes for them using the Wayback Machine facility on archive.org.

The Wayback Machine (surprisingly) lacks an API, so this relies on screen-scraping, which is likely to break in the future.

Caveat: the archive picked is the latest available, which, given the way websites degenerate before they die, is not necessarily the best substitute.  So you might want to go pick one by hand.

Usage:
1) Start a Lisp (tested in Clozure)
2) Load Quicklisp
3) Load the wayback.lisp file
4a) (process-page "http://xenia.media.mit.edu/~mt") 

Output:
A lot of verbosity.
The value returned is a list of bad urls and their backups on archive.org:
(("http://el.www.media.mit.edu/groups/el/projects/fishtank/"
  "http://web.archive.org/web/20100618022707/http://el.www.media.mit.edu/groups/el/projects/fishtank/")
 ("http://mt.www.media.mit.edu/people/mt/papers/chi94/chi94.html"
  "http://web.archive.org/web/20060901124153/http://mt.www.media.mit.edu/people/mt/papers/chi94/chi94.html")
 ("http://www.ddj.com/documents/s=889/ddj0001l/0001l.htm"
  "http://web.archive.org/web/20070529043833/http://www.ddj.com/documents/s=889/ddj0001l/0001l.htm"))

OR,
4b) (transform-url "http://xenia.media.mit.edu/~mt") 
Will output the transformed contents of the page on the console. 


Todo:
- a web UI, duh
- exclude self URLs
- capture and report errors
- handle "bad" URLs (eg those that contain a ?)
- blogger interface that actually rewrites the entry,
  and similarly for other services


Blog repair setup:
load .asd
(asdf:operate 'asdf:load-op :waybacker)
-- will start on port 8080
Use localtunnel to open up the port
> localtunnel 8080
edit *callback-uri* to reflect the assigned host
go to [host]/obtain

About

A tool to fix dead URLs by replacing them with pointers to archive.org

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors