Article extraction, content scraping
Updated Nov 10, 2025 - JavaScript
Automatically scrape metadata from websites
Marky helps you convert things into Markdown 📝
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
Quora Search Bot – automated Android control
Extract content from reddit, tiktok, articles, youtube
n8n community node for extracting main content from webpages using Defuddle library
Live Web Access for Your Local AI — Tunable Search & Clean Content Extraction
A Chrome extension that summarizes articles using Gemini API
A userscript that adds a button to YouTube video pages for copying the transcript with or without timestamps.
🚀 mcp-web-scrape: Clean, cache-aware web content fetcher for AI agents. Fetch any URL → extract readable content → return Markdown/JSON with citations. ⚡ Fast caching, 🤝 robots.txt compliant, 📝 Markdown-ready output, works with ChatGPT/Claude Desktop.
🔄 Extract and convert WordPress export files to Markdown, CSV, and JSON formats with intelligent HTML parsing and code block detection
Lightweight Dash app to summarize YouTube & Reddit content with Ollama.
The Ultimate Web Content Extraction & Conversion Tool for AI/LLM Applications. Convert almost any web content into clean Markdown with intelligent AI processing.
Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Model Context Protocol (MCP) Server for Graphlit Platform
A privacy-focused, client-side web application that extracts clean, readable content from any webpage and converts it to PDF format. Built with pure HTML, CSS, and JavaScript—no backend required, no tracking, complete privacy.
Article parser for Habr, Proglib, and vc.ru that extracts main content, removes ads and unnecessary elements, preserving proper formatting
DOM Based Content Extraction via Text Density