Starred repositories
💫 Toolkit to help you get started with Spec-Driven Development
FUSE-based file system backed by Amazon S3
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
An extremely fast Python package and project manager, written in Rust.
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
A Database Change Management tool for Snowflake
Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared …
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Always know what to expect from your data.
📊 Cube Core is an open-source semantic layer for AI, BI and embedded analytics
Data Engineering with Python, published by Packt
Terraform provider for managing Snowflake accounts
A curated list of data engineering tools for software developers
A list of useful resources to learn Data Engineering from scratch
A curated list of awesome frameworks, libraries and software for the Java programming language.
Roadmap to becoming a Java developer in 2026
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Roadmap to becoming a data engineer in 2021
Change data capture for a variety of databases. Please log issues at https://github.com/debezium/dbz/issues.
Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.
Free and Open Source, Distributed, RESTful Search Engine
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Apache Beam is a unified programming model for Batch and Streaming data processing.
Python composable command line interface toolkit
🙃 A delightful community-driven (with 2,500+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
FastAPI framework, high performance, easy to learn, fast to code, ready for production