-
datapods-oss Public
An open-source project for students and developers who want to self-host a data platform as a data platform engineer.
-
bird_strike_prediction Public
Uses three classification models and an ensemble to predict the impact of bird strikes in aviation.
-
airflow-spark-data-pipeline Public
Simple ETL Pipeline with Spark. Data Engineering 101
-
qdrant-spark Public
Forked from qdrant/qdrant-spark
Qdrant's Apache Spark connector
Java Apache License 2.0 Updated Dec 3, 2024
-
arrow Public
Forked from apache/arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
C++ Apache License 2.0 Updated Jul 27, 2024
-
dbt-project-structuring Public
Simple Data Modeling with dbt-lab. Data Engineering 101
Updated Jun 16, 2024
-
playbook Public
Forked from dwarvesf/playbook
Guides for getting things done, programming well, and programming in style.
Updated Mar 28, 2024
-
brain Public
Forked from dwarvesf/research
The Dwarves second brain
Dockerfile Creative Commons Zero v1.0 Universal Updated Mar 15, 2024
-
terraform-databricks-lakehouse-blueprints Public
Forked from databricks/terraform-databricks-lakehouse-blueprints
Set of Terraform automation templates and quickstart demos to jumpstart the design of a Lakehouse on Databricks. This project has incorporated best practices across the industries we work with to d…
HCL Other Updated Feb 13, 2023
-
superset Public
Forked from apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
TypeScript Apache License 2.0 Updated Jan 3, 2023
-
mage-ai Public
Forked from mage-ai/mage-ai
🧙 Mage is an open-source data pipeline tool for transforming and integrating data.
TypeScript Apache License 2.0 Updated Oct 6, 2022
-
data-lineage-as-network Public
Building data lineage with a graph database and graph modeling
Python MIT No Attribution Updated Aug 30, 2022
-
tiki_scrapper_db Public
Shows how to configure MySQL and store the data extracted from the Tiki front end.
-
automated-infra-testing Public archive
Runs tests on LocalStack on every deployment
HCL Updated Oct 31, 2021
-
airflow-spark Public
Forked from cordon-thiago/airflow-spark
Docker with Airflow and Spark standalone cluster
Python Updated Jul 10, 2021
-
localstack-dev Public
Shows how to use LocalStack as a mock AWS cloud.
HCL Updated Jun 28, 2021
-
demo-scene Public
Forked from confluentinc/demo-scene
Scripts and samples to support Confluent Platform talks. May be rough around the edges. For automated tutorials and QA'd code, see https://github.com/confluentinc/examples/
Shell Updated Dec 10, 2020
-
system-design-primer Public
Forked from donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Python Other Updated Nov 4, 2020
-
Cookbook Public
Forked from andkret/Cookbook
The Data Engineering Cookbook
Apache License 2.0 Updated Sep 16, 2020