Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
Updated Nov 5, 2025 - Java
Implementing best practices for PySpark ETL jobs and applications.
Streaming data platform. Real-time stream processing, low-latency serving, and Iceberg table management.
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
Build data pipelines, the easy way 🛠️
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
A simplified, lightweight ETL Framework based on Apache Spark
An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.
Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!
Business intelligence architecture lab for students at EPSI and DC Paris. The goal is to deploy a data architecture that runs from data collection through to presentation as data visualizations, passing through a data lake, a data warehouse, and a data mart.
This repository helps you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies.
Azure Data Factory Hands On Lab - Step by Step - A Comprehensive Azure Data Factory and Mapping Data Flow step by step tutorial
A blazingly fast general-purpose blockchain analytics engine specialized in systematic MEV detection
Near-real-time ETL to populate a dashboard.
Regular practice in data science, machine learning, and deep learning, working through ML project problems and analytical issues to keep building my knowledge. The goal is to help learners with learning resources in the data science field.
An app engine for your business. Seamlessly implement business logic with a powerful API. Out-of-the-box CMS, blog, forum, and email functionality. Developer-friendly and easily extendable for your next SaaS/XaaS project. Built with Rails 6, Devise, Sidekiq, and PostgreSQL.
This is a template you can use for your next data engineering portfolio project.