- Czech Republic
- https://www.linkedin.com/in/ondra-urban/
- @mnmkng
Starred repositories
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…
The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Designed for simplicity - Simply moni…
I Don't Care About Cookies extension compiled for use with Playwright/Puppeteer
Drag & drop UI to build your customized LLM flow
A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
A standalone version of the readability lib
The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.
🐙 Guides, papers, lecture, notebooks and resources for prompt engineering
🚀 Fast and simple Node.js version manager, built in Rust
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
estela, an elastic web scraping cluster 🕸
A JavaScript library for generating random user agents with data that's updated daily.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…
A comment system powered by GitHub Discussions. 💬 💎
Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
Easy to maintain open source documentation websites.
🥧 HTTPie CLI — modern, user-friendly command-line HTTP client for the API era. JSON support, colors, sessions, downloads, plugins & more.
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Quick Look extension for highlighting source code files on macOS 10.15 and later.
A macOS app for customizing which browser to start
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
HTTP client made for scraping based on got.
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
The web scraper that's nearly impossible to block - now called @ulixee/hero
House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.