Starred repositories
Postgres change data capture to streams, queues, and search indexes like Kafka, SQS, Elasticsearch, HTTP endpoints, and more
A high-quality collection of study skills built for high school and college students, teachers, and TAs. Use it directly in Kael.im for free without installing skills
The metadata-driven framework for Databricks Lakeflow Declarative Pipelines (formerly Delta Live Tables). It generates production-ready PySpark code for Lakeflow Declarative Pi…
Skills for Real Engineers. Straight from my .claude directory.
The Open Context Layer for Data and AI. OpenMetadata is the open platform for building trusted data context and business semantics for humans, AI assistants, and agents.
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
Open, Multi-modal Catalog for Data & AI
DuckDB is an analytical in-process SQL database management system
Open-Source API Development Ecosystem • https://hoppscotch.io • Offline, On-Prem & Cloud • Web, Desktop & CLI • Open-Source Alternative to Postman, Insomnia
A collection of beautiful, accessible and performant Astro blog templates.
A multi-process timed rotating logging file handler (based on logging.RotatingFileHandler and ConcurrentLogHandler)
CNCF Jaeger, a Distributed Tracing Platform
General-purpose extensions to golang's database/sql
A toolkit with common assertions and mocks that plays nicely with the standard library
A curated list of awesome Go frameworks, libraries and software
Magelang is a programming language targeting WebAssembly
databricks / tpcds-kit
Forked from gregrahn/tpcds-kit
TPC-DS benchmark kit with some modifications/fixes
Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
🖼️ Create beautiful maps from OpenStreetMap data in a Streamlit web app
Magpie contains a number of scripts for running Big Data software in HPC environments, including Hadoop and Spark. There is support for Lustre, Slurm, Moab, Torque, LSF, Flux, and more.
📄 Awesome CV is a LaTeX template for your outstanding job application
Synthetic data generators for tabular and time-series data