
hoangkimminh/sponge

 
 


Sponge

Hassle-free web scraping service.

FEATURES

Core Features

  • Render client-side-rendered (JavaScript-driven) web pages
  • Automatically extract metadata and article content
  • Extract DOM elements via CSS selectors
  • Domain blocking (enabled when the BLOCKLIST_URL environment variable is set)
  • Forward request headers such as User-Agent and Cookie
  • HTTP proxy support
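
To make the header-forwarding feature concrete, here is a minimal sketch of how incoming request headers might be filtered before being passed on to the target page. The allowlist FORWARDED_HEADERS and the function pickForwardedHeaders are hypothetical names used for illustration; this README does not say which headers Sponge forwards or how it does so.

```typescript
// Hypothetical allowlist: the headers Sponge actually forwards are not listed in this README.
const FORWARDED_HEADERS = ["user-agent", "cookie", "accept-language"];

type HeaderMap = Record<string, string | string[] | undefined>;

// Returns only the allowed headers, with lower-cased names and string values.
function pickForwardedHeaders(
  incoming: HeaderMap,
  allowed: string[] = FORWARDED_HEADERS
): Record<string, string> {
  const allowedSet = new Set(allowed.map((name) => name.toLowerCase()));
  const result: Record<string, string> = {};
  for (const name of Object.keys(incoming)) {
    const key = name.toLowerCase();
    const value = incoming[name];
    if (value === undefined || !allowedSet.has(key)) continue;
    if (Array.isArray(value)) {
      // Cookie pairs are separated by "; ", other repeated headers by ", ".
      result[key] = value.join(key === "cookie" ? "; " : ", ");
    } else {
      result[key] = value;
    }
  }
  return result;
}
```

An allowlist is used here rather than forwarding everything, so that hop-specific headers such as Host are not passed along by accident.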

Live Version Features

  • Bundled with a blocklist of over 57,000 adware and malware domains
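
As an illustration of how domain blocking against such a list could work, the sketch below parses a blocklist and checks a hostname against it. It assumes a hosts-style or one-domain-per-line text format with `#` comments; the format actually served from BLOCKLIST_URL is not specified here, and parseBlocklist and isBlocked are hypothetical helper names.

```typescript
// Assumption: each non-comment line ends with a domain, e.g. "0.0.0.0 ads.example.com"
// or just "ads.example.com". The real blocklist format is not documented in this README.
function parseBlocklist(text: string): Set<string> {
  const domains = new Set<string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // drop comments and surrounding whitespace
    if (line === "") continue;
    const fields = line.split(/\s+/);
    domains.add(fields[fields.length - 1].toLowerCase());
  }
  return domains;
}

// A hostname is treated as blocked if it, or any parent domain with at least
// two labels, appears in the blocklist.
function isBlocked(hostname: string, blocklist: Set<string>): boolean {
  const labels = hostname.toLowerCase().split(".");
  for (let i = 0; i < labels.length - 1; i++) {
    if (blocklist.has(labels.slice(i).join("."))) return true;
  }
  return false;
}
```

Storing the domains in a Set keeps each lookup constant-time, which matters when a list holds tens of thousands of entries.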

INSTALLATION

Requirements

  • Node.js >= 14
  • Environment variables specified in .env.example

Instructions

Without Docker (dev environment)

$ npm i             # or: yarn install
$ npm run start:dev # or: yarn start:dev

With Docker (prod environment)

$ npm run docker:build:app  # or: yarn docker:build:app
$ npm run docker:start:prod # or: yarn docker:start:prod

USAGE

Start the app, then open /docs in a browser for the interactive API documentation.

CHANGELOG

Read more here.

TODO

Read more here.
