n8n community node for extracting main content from webpages using the Defuddle library
Updated Oct 29, 2025 - TypeScript
A Model Context Protocol (MCP) server that provides intelligent web reading capabilities using the Jina AI Reader API. It extracts clean, LLM-ready content from any URL.
🚀 mcp-web-scrape: Clean, cache-aware web content fetcher for AI agents. Fetch any URL → extract readable content → return Markdown/JSON with citations. ⚡ Fast caching, 🤝 robots.txt compliant, 📝 Markdown-ready output, works with ChatGPT/Claude Desktop.
The Ultimate Web Content Extraction & Conversion Tool for AI/LLM Applications. Convert almost any web content into clean Markdown with intelligent AI processing.
📸 Crawell: browser extension for one-click extraction of web page images and article text, Markdown conversion, and bulk download. Free, with 100% local processing.
Next.js template for seamless PDF parsing using pdf2json and FilePond. Ideal for developers seeking a ready-to-use solution for PDF content extraction in Next.js projects.
Readability2 converts HTML to plain text.
Model Context Protocol (MCP) Server for Graphlit Platform