Stars
mwojtyczka / airflow
Forked from apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Review Databricks write up with Databricks Skills and FMAPI
Community Grafana datasource for Databricks - visualize costs, jobs and SDP pipelines in your dashboards.
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, P…
Different pieces of code related to doing cybersecurity on Databricks
Accelerates migrations to Databricks by automating key migration activities
PySpark test helper methods with beautiful error messages
Monitoring Databricks using Prometheus, Grafana and Pyroscope
Lightweight SQL execution wrapper built only on top of the Databricks SDK
Databricks dbt factory library for creating Databricks Job definitions in which individual dbt models run as separate tasks.
Databricks framework to validate Data Quality of PySpark DataFrames and Tables
Different snippets of Terraform code. Primarily around Databricks
Cost calculator to allocate billing usage of Databricks "Shared" SQL Warehouses to individual users and their respective organisational entities (e.g. cost center, departments / business units).
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
GoMetric / statsd-http-proxy
Forked from sokil/statsd-http-proxy
StatsD HTTP proxy server with REST interface for use in browsers