Skip to content
#

apache-superset

Here are 37 public repositories matching this topic...

An End-to-End ETL data pipeline that leverages pyspark parallel processing to process about 25 million rows of data coming from a SaaS application using Apache Airflow as an orchestration tool and various data warehouse technologies and finally using Apache Superset to connect to DWH for generating BI dashboards for weekly reports

  • Updated Dec 7, 2022
  • Python
aws-data-pipeline

A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from locally hosted Airflow containers. The end product is a Superset dashboard and a Postgres database, hosted on an EC2 instance at this address (powered down):

  • Updated May 14, 2022
  • Python

A Smart Traffic Management System for Ho Chi Minh City, Vietnam leveraging batch and real-time data processing, intuitive dashboards, and monitoring tools to optimize traffic flow, enhance safety, and support sustainable urban mobility through advanced analytics and user-friendly applications.

  • Updated Jan 17, 2025
  • Python

This project was created as part of an assessment for DigitalXC AI. It demonstrates a cloud-based ELT pipeline using AWS MWAA, Airflow, dbt, PostgreSQL, and Superset. The pipeline automates data ingestion from S3, transformation with dbt, and visualization through Superset, following modern data engineering practices on a scalable AWS architecture.

  • Updated Jul 1, 2025
  • Python

Improve this page

Add a description, image, and links to the apache-superset topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the apache-superset topic, visit your repo's landing page and select "manage topics."

Learn more