Stars

Crawlers

Crawler-related
34 repositories

WebCollector is an open source web crawler framework based on Java. It provides simple interfaces for crawling the Web, and you can set up a multi-threaded web crawler in less than 5 minutes.

Java 3,092 1,444 Updated Sep 5, 2025

Easy-to-use lightweight web crawler

Java 2,520 883 Updated Dec 3, 2025

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Java 3,110 779 Updated Dec 11, 2025

Open Source Web Crawler for Java

Java 4,615 1,921 Updated Nov 4, 2021

A tool for monitoring new stock arrivals on bm

JavaScript 1 Updated May 30, 2021

A Java crawler application based on webmagic

Java 2,786 854 Updated Jan 8, 2022

A scalable web crawler framework for Java.

Java 11,682 4,160 Updated Nov 10, 2025

Scrapy, a fast high-level web crawling & scraping framework for Python.

Python 59,243 11,190 Updated Dec 16, 2025

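A visual no-code/code-free web crawler/spider. 易采集: a visual application for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Alias: ServiceWrapper, an intelligent service-wrapping system for web applications.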

JavaScript 43,657 5,362 Updated Dec 1, 2025

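A Python crawler tutorial series for learning Python crawling from zero to one. It covers browser packet capture and mobile app packet capture (e.g. fiddler, mitmproxy); the modules crawlers rely on, such as requests, beautifulSoup, selenium, appium, and scrapy; IP proxies; CAPTCHA recognition; using MySQL and MongoDB from Python; multithreaded and multiprocess crawlers; reverse engineering CSS-based crawler encryption; JS reverse engineering for crawlers, …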

Python 21,050 3,905 Updated Jul 29, 2024

Elegant Scraper and Crawler Framework for Golang

Go 24,909 1,836 Updated Dec 4, 2025

A powerful spider (web crawler) system in Python.

Python 17,000 3,678 Updated Apr 30, 2024

A browser automation framework and ecosystem.

Java 33,785 8,631 Updated Dec 17, 2025

Automated driver management and other helper features for Selenium WebDriver in Java

Java 2,677 696 Updated Dec 15, 2025

A custom Selenium Chromedriver for Java that can pass almost all Selenium checks. It is the Java version of undetected-chromedriver.

Java 82 16 Updated Apr 27, 2024

Open Source record and playback test automation for the web.

TypeScript 3,060 826 Updated Dec 1, 2025

Headless chrome/chromium automation library (unofficial port of puppeteer)

Python 3,932 342 Updated Jun 29, 2024

JavaScript API for Chrome and Firefox

TypeScript 93,081 9,344 Updated Dec 17, 2025

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)

Python 12,127 1,304 Updated Jul 5, 2025

Sina Weibo crawler: uses Python to crawl Sina Weibo data and download Weibo images and videos

Python 4,246 869 Updated Dec 3, 2025

Sina Weibo crawler: uses Python to crawl Sina Weibo data

Python 9,351 2,058 Updated Sep 21, 2025

Provides a simple way to run Selenium Grid with Chrome, Firefox, and Edge on a container platform, making it easier to perform browser automation at scale

Shell 8,569 2,575 Updated Dec 16, 2025

Java version of the Playwright testing and automation library

Java 1,408 258 Updated Dec 4, 2025

Python 502 1,042 Updated Mar 11, 2024

Python based web automation tool. Powerful and elegant.

Python 10,913 1,024 Updated Aug 8, 2025

A distributed web crawler admin platform for managing spiders regardless of language or framework.

Go 12,098 1,885 Updated Dec 5, 2025

带带弟弟: general-purpose CAPTCHA recognition OCR, PyPI version

Python 13,217 2,163 Updated Jun 9, 2025

A Java image recognition project built on ddddocr (forked from Gitee and modified). Just run the generated jar.

Java 6 2 Updated Mar 19, 2024

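A collection of Python crawler projects, from the basics to JS reverse engineering, organized into basic, automation, advanced, and CAPTCHA sections. The examples cover major sites (xhs douyin weibo ins boss job, jd...), and you will learn about crawling, anti-crawling, automation, and CAPTCHAs.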

JavaScript 1,627 331 Updated Sep 23, 2024