Stars
Free and Open Source, Distributed, RESTful Search Engine
Wrong project! You should head over to http://github.com/sshuttle/sshuttle
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
🦘 The Grouparoo Monorepo - open source customer data sync framework
Builds Lucene/Solr indexes out of NutchWAX segments and revisit records via Hadoop.
(T)he (N)ew (H)otness. Improved full-text search of archival web data.