thenxkk

22 followers · 151 following

Hong Kong

Stars

Web crawlers

Crawler-related repositories

34 repositories

CrawlScript / WebCollector

WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.

Java 3,092 1,444 Updated Sep 5, 2025

xtuhcy / gecco

Easy-to-use lightweight web crawler

Java 2,520 883 Updated Dec 3, 2025

internetarchive / heritrix3

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Java 3,110 779 Updated Dec 11, 2025

yasserg / crawler4j

Open Source Web Crawler for Java

Java 4,615 1,921 Updated Nov 4, 2021

JrCx7scC / bm_data

A tool for monitoring newly listed bm stock

JavaScript 1 Updated May 30, 2021

brianway / webporter

A Java crawler application based on webmagic

Java 2,786 854 Updated Jan 8, 2022

code4craft / webmagic

A scalable web crawler framework for Java.

Java 11,682 4,160 Updated Nov 10, 2025

scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Python 59,243 11,190 Updated Dec 16, 2025

NaiboWang / EasySpider

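A visual no-code/code-free web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.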

JavaScript 43,657 5,362 Updated Dec 1, 2025

wistbean / learn_python3_spider

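A Python crawler tutorial series for learning Python crawling from 0 to 1. It covers browser packet capture and mobile app packet capture (for example with fiddler and mitmproxy); the modules commonly used in crawlers, such as requests, beautifulSoup, selenium, appium, and scrapy; IP proxies; CAPTCHA recognition; using MySQL and MongoDB from Python; multi-threaded and multi-process crawlers; reverse engineering CSS-based anti-crawler obfuscation; JS reverse engineering for crawlers, …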

Python 21,050 3,905 Updated Jul 29, 2024

gocolly / colly

Elegant Scraper and Crawler Framework for Golang

Go 24,909 1,836 Updated Dec 4, 2025

binux / pyspider

A Powerful Spider (Web Crawler) System in Python.

Python 17,000 3,678 Updated Apr 30, 2024

SeleniumHQ / selenium

A browser automation framework and ecosystem.

Java 33,785 8,631 Updated Dec 17, 2025

bonigarcia / webdrivermanager

Automated driver management and other helper features for Selenium WebDriver in Java

Java 2,677 696 Updated Dec 15, 2025

mabinogi233 / UndetectedChromedriver

A custom Selenium Chromedriver for Java that can pass almost all Selenium checks. It is the Java version of undetected-chromedriver.

Java 82 16 Updated Apr 27, 2024

SeleniumHQ / selenium-ide

Open Source record and playback test automation for the web.

TypeScript 3,060 826 Updated Dec 1, 2025

pyppeteer / pyppeteer

Headless chrome/chromium automation library (unofficial port of puppeteer)

Python 3,932 342 Updated Jun 29, 2024

puppeteer / puppeteer

JavaScript API for Chrome and Firefox

TypeScript 93,081 9,344 Updated Dec 17, 2025

ultrafunkamsterdam / undetected-chromedriver

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)

Python 12,127 1,304 Updated Jul 5, 2025

fysh711426 / UndetectedChromeDriver

C# 208 70 Updated Feb 17, 2024

dataabc / weibo-crawler

Sina Weibo crawler: uses Python to crawl Sina Weibo data and download Weibo images and videos

Python 4,246 869 Updated Dec 3, 2025

dataabc / weiboSpider

Sina Weibo crawler: uses Python to crawl Sina Weibo data

Python 9,351 2,058 Updated Sep 21, 2025

SeleniumHQ / docker-selenium

Provides a simple way to run Selenium Grid with Chrome, Firefox, and Edge using Container Platform, making it easier to perform browser automation at scale

Shell 8,569 2,575 Updated Dec 16, 2025

microsoft / playwright-java

Java version of the Playwright testing and automation library

Java 1,408 258 Updated Dec 4, 2025

lewis-007 / MediaCrawler

Python 502 1,042 Updated Mar 11, 2024

g1879 / DrissionPage

Python based web automation tool. Powerful and elegant.

Python 10,913 1,024 Updated Aug 8, 2025

crawlab-team / crawlab

Distributed web crawler admin platform for spider management, regardless of languages and frameworks.

Go 12,098 1,885 Updated Dec 5, 2025

sml2h3 / ddddocr

ddddocr (带带弟弟): a general-purpose CAPTCHA recognition OCR library, PyPI version

Python 13,217 2,163 Updated Jun 9, 2025

Fenger7923 / ddddocr-jar

A Java image recognition project built on ddddocr (forked from Gitee and modified); just run the generated jar.

Java 6 2 Updated Mar 19, 2024

xishandong / crawlProject

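A collection of Python crawler projects, from the basics to JS reverse engineering, organized into basics, automation, advanced, and CAPTCHA sections. The cases cover major sites (xhs douyin weibo ins boss job, jd...), and you will learn about crawlers and anti-crawling measures, automation, and CAPTCHAs.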

JavaScript 1,627 331 Updated Sep 23, 2024
