# news-crawler

Here are 17 public repositories matching this topic...

adbar / trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Updated Sep 12, 2025
Python

fhamborg / news-please

news-please - an integrated web crawler and information extractor for news that just works

Updated Apr 14, 2026
Python

flairNLP / fundus

A very simple news crawler with a funny name

python nlp rss sitemap crawler scraper corpus text-extraction web-scraping image-classification datasets news-crawler corpus-tools commoncrawl web-corpus news-scraping cc-news image-extraction

Updated Apr 17, 2026
Python

lumyjuwon / KoreaNewsCrawler

A Korean news crawler built to ingest large amounts of news data.

crawler scrapy-crawler news-crawler newscrawler koreanewscrawler

Updated Apr 27, 2024
Python

LuChang-CS / news-crawler

A news crawler for BBC News, Reuters and New York Times.

crawler bbc reuters news-crawler nytimes

Updated Dec 8, 2022
Python

nploi / news_crawler

News crawler is a tool that lets you crawl data from a news site.

python crawler news scrapy news-crawler vietnam-crawl

Updated Oct 24, 2019
Python

SecondDim / crawler-news

A crawler built with Python Scrapy for real-time Taiwan news websites.

mysql python docker crawler circleci news database docker-compose taiwan scrapy news-crawler python-scrapy taiwan-news-website

Updated Mar 26, 2026
Python

sakshamssr / GNews-API

A fast and lightweight Python API that searches for articles on Google News and returns a JSON response.

python api web-scraper python3 rss-feed newsapi news-crawler google-news google-news-scraper google-crawler gnews google-news-homepage fastapi google-news-api fast-api gnews-api google-news-rss google-news-scrapper

Updated Jan 31, 2024
Python

divkakwani / webcorpus

Generate large textual corpora for almost any language by crawling the web

multilingual nlp datasets news-crawler indic-languages nlp-datasets

Updated Nov 18, 2021
Python

arian-askari / persian_news_websites_crawler

Crawler (scraper) for several well-known Persian news websites, for scraping public data

crawler python-bot scrapper news-crawler python-crawler news-crawler-website

Updated Apr 20, 2022
Python

aufamiri / berita-crawler

A web crawler that collects the latest Indonesian news from many sources

python crawler news indonesian-language news-crawler berita

Updated Jan 3, 2025
Python

karimhabush / TheguardianScrapper

A Scrapy web scraper that can scrape and store articles from theguardian.com

scraper scrapy news-crawler

Updated May 1, 2023
Python

anjanatiha / News-Fetch

web-crawler news-feed news-crawler

Updated Apr 27, 2018
Python

zhichzhang / NewsCrawler

News Site Crawler is a multi-threaded web crawling system designed to ingest, traverse, and analyze large-scale news websites in a controlled and reliable manner.

python3 data-analysis news-crawler beautifulsoup4

Updated Jan 31, 2026
Python

tunahanoguz / news-crawler

News crawler project written in Python.

crawler web-crawler web-scraper news-crawler

Updated Oct 8, 2022
Python

prasangapokharel / news-crawler

AI-powered news crawler with RAG pipeline — scrapes, processes and summarises news articles using LLMs

python nlp ai web-scraping news-crawler rag llm

Updated Mar 11, 2026
Python

cikay / kurdish_scrapy

A Scrapy-based web scraper for collecting Kurdish text data from websites. The tool recursively crawls specified domains, extracts article content using Trafilatura, and filters results by language using Facebook's FastText language identification model.

Updated Mar 29, 2026
Python
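
The crawl, extract, and language-filter flow described for kurdish_scrapy can be sketched in a few lines. This is a minimal illustration using only the standard library, not the repository's code: `crawl_and_filter`, `fetch`, `extract_text`, and `detect_lang` are hypothetical names, and the extractor and language detector below are toy stand-ins for Trafilatura and FastText.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse

HREF_RE = re.compile(r'href="([^"]+)"')  # naive link finder, enough for a sketch


def crawl_and_filter(start_url, fetch, extract_text, detect_lang,
                     target_lang, allowed_domains, max_pages=100):
    """Breadth-first crawl limited to allowed_domains.

    Returns (url, text) pairs whose extracted text is in target_lang.
    fetch, extract_text and detect_lang are injected callables (assumptions).
    """
    seen = {start_url}
    queue = deque([start_url])
    kept = []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        text = extract_text(html)
        if text and detect_lang(text) == target_lang:
            kept.append((url, text))
        # Follow only links that stay inside the specified domains.
        for href in HREF_RE.findall(html):
            link = urljoin(url, href)
            if urlparse(link).netloc in allowed_domains and link not in seen:
                seen.add(link)
                queue.append(link)
    return kept


# Toy in-memory "web" and stand-in components, all hypothetical.
PAGES = {
    "https://example.org/": '<a href="/a">a</a> <a href="https://other.example/x">x</a>',
    "https://example.org/a": '<p>silav cîhan</p> <a href="/">home</a>',
}


def strip_tags(html):
    # Stand-in for a real content extractor.
    return re.sub(r"<[^>]+>", " ", html).strip()


def toy_lang(text):
    # Stand-in for a real language identification model.
    return "ku" if "silav" in text else "und"


result = crawl_and_filter("https://example.org/", PAGES.get, strip_tags,
                          toy_lang, "ku", {"example.org"})
print([url for url, _ in result])
```

Passing the extractor and the language detector in as callables keeps the crawl loop independent of any particular library, so the stand-ins could be swapped for real components without changing the traversal.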
