The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Updated Nov 11, 2025 - Python
🐙 GitLab data extraction turns issues into a dimensional analytics model for IT teams and OTRS integration. It enables dashboards in Metabase and Power BI.
Python library and web service for Open Source Software Health and Sustainability metrics & data collection. You can find our documentation and new contributor information easily here: https://oss-augur.readthedocs.io/en/main/
📥 Collect and analyze posts from Russian pro-war Telegram channels efficiently, perfect for historical research and monitoring.
📹 Automate video summaries and monitor Bilibili updates with this smart Feishu bot for efficient message delivery and AI integration.
Fast and differentiable particle accelerator optics simulation for reinforcement learning and optimisation applications.
🔍 Extract and organize job listings effortlessly with JobMiner, a flexible Python web scraping toolkit for detailed job market insights.
🚀 Automate your submissions to OML with ease. Streamline your workflow while adhering to site rules and efficiently managing input files.
🔍 Evaluate web search APIs with our framework, testing accuracy and relevance across multiple AI agents and benchmarks for better information retrieval.
🔧 Process sensor data effectively with Kova Sensor Utils, a Python library for image, point cloud, and IMU data analysis in the Kova robotics network.
Quora data extraction and analytics tool
99+ CLI tools to build, browse, and blend your media library
A collection of anime data from MyAnimeList, Anilist and Jikan
📈 Automated system for collecting, storing, and tracing historical financial data for Grupo Aval using Python, yfinance, and GitHub Actions.
JobMiner – A Python-based web scraping toolkit for extracting and organizing job listings from multiple websites into structured data.
Open-source voice data collection platform for building inclusive voice datasets. Collaborative transcription with quality consensus. FastAPI + React + PostgreSQL.
Extract detailed Kaufland product reviews
Python crawler script for automated file downloads
Data mining, scraping, and automation tool