🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Updated Nov 7, 2025 - TypeScript
Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.
⚡ Easiest no-code web data extraction platform • Instantly turn any website into an API or spreadsheet ⚡
Self-hosted web scraper.
A JavaScript library for generating random user agents with data that's updated daily.
A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches. Local alternative to SERP APIs with MCP server integration.
Model Context Protocol (MCP) Server for Graphlit Platform
⚡ Ayakashi.io - The next generation web scraping framework
n8n node to interact with a browserless instance
estela, an elastic web scraping cluster 🕸
Turn topics, links, and files into AI-generated research notebooks — summarize, explore, and ask anything.
Send webhook messages from your browser (and so much more)
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
📥 Bot for downloading any media from Instagram and Twitter, and videos from TikTok and YouTube
A VSCode extension that generates markdown documentation from web pages and GitHub repositories.
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Paperback - Vietnamese - Extensions
📰 Read RSS feed from LeMonde.fr and display news inside the App
Mapping sites to AniList and back.
Generic REST API for scraping websites. Drop-in replacement for ScrapingBee, ScrapingAnt, and ScraperAPI services. And it is open-source!