-
amazon-athena-user-guide Public
Forked from awsdocs/amazon-athena-user-guide. The open source version of the Amazon Athena documentation. To submit feedback and requests for changes, submit issues in this repository, or make proposed changes and submit a pull request.
Other Updated Jul 8, 2022
-
vcpkg Public
Forked from microsoft/vcpkg. C++ Library Manager for Windows, Linux, and MacOS
CMake Other Updated Feb 11, 2019
-
scrapy-source-ip Public
Simple Scrapy downloader that implements the approach described in http://web.archive.org/web/20120316092048/http://dev.scrapy.org/ticket/153
-
mongo-hadoop Public
Forked from mongodb/mongo-hadoop. MongoDB adapter for Hadoop. Small mongo-hadoop Pig patch that allows using MongoDB fields that start with an underscore by prefixing them with u_ (e.g., u__id instead of _id).
-
scrapy-mongodb-pipeline Public
MongoDB pipeline for Scrapy. It allows updating existing entries (setting new values or adding elements to an array) when item values are spread over multiple pages.
-
scrapy-proxynova Public
Use Scrapy with a list of proxies generated from proxynova.com
-
scrapy-mongodb-queue Public
Use Scrapy with MongoDB to store the request queues (FIFO or LIFO)
-
scrapy-simple-http-queue Public
Scrapy plugin that uses simple-http-queue as the URL queue to allow distributed crawling
-
requests Public
Forked from psf/requests. Python HTTP Requests for Humans™. Small update to add source_address support
-
scrapy-redis Public
Forked from rmax/scrapy-redis. Redis-based components for Scrapy that allow distributed crawling. Small update to make it work with Scrapy 0.16+ and add QUEUE_TYPE and DUPE_FILTER options
-
simple-http-queue Public
Simple HTTP queue (FIFO and LIFO) implemented using Python, SQLite3 and Tornado. It supports multiple queues.