Modern e-commerce analytics stack: MySQL → S3 → Snowflake → dbt → Dagster. Implements incremental ingestion, SCD handling, data quality checks, and enterprise-grade governance.
databricks-template Public
Forked from andre-salvati/databricks-template
A production-ready PySpark project template with medallion architecture, Python packaging, unit tests, integration tests, CI/CD automation, Databricks Asset Bundles, and DQX data quality framework.
Python Updated Nov 13, 2025
Advanced-QA-and-RAG-Series Public
Forked from Farzad-R/Advanced-QA-and-RAG-Series
This repository contains advanced LLM-based chatbots for Q&A using LLM agents and Retrieval Augmented Generation (RAG) with different databases (VectorDB, GraphDB, SQLite, CSV, XLSX, etc.).
Jupyter Notebook Updated Oct 25, 2025
A real-time change data capture (CDC) pipeline using Debezium, Kafka, and PostgreSQL to stream database changes and send alerts via Telegram. The project also includes a PostgreSQL trigger-based au…
aws-data-pipeline-terraform Public
This project provisions a modular AWS data pipeline using Terraform. Each AWS service lives in its own directory under infrastructure/services, so you can provision and manage them independently.
chapter-7-implementing-data-contracts Public template
Forked from data-contract-book/chapter-7-implementing-data-contracts
Python Updated Aug 22, 2025
AI-Driven Email Automation for Customer Support
Python Updated Jul 7, 2025
mlops-zoomcamp-main Public
Forked from DataTalksClub/mlops-zoomcamp
Free MLOps course from DataTalks.Club
Jupyter Notebook Updated May 9, 2025
Exercise: Build my application with Copilot agent mode
Shell MIT License Updated Apr 8, 2025
TEMP-TRACKER-ETL Public
Temp Track ETL is a data pipeline that extracts weather data from an external API, processes it using Apache Spark, and loads the transformed data into a PostgreSQL database.
Python Updated Mar 19, 2025
data-engineering-zoomcamp Public
Forked from DataTalksClub/data-engineering-zoomcamp
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.
Jupyter Notebook Updated Mar 17, 2025
ai-engineer-toolkit Public
Forked from break-into-data/ai-engineer-toolkit
Projects & Resources to help you become a better AI Developer.
TypeScript Updated Mar 17, 2025
A repository of Python for data science resources in Jupyter notebooks.
Tuberculosis classification in chest radiographs using a convolutional neural network and the fastai Python library.
opennem Public
Forked from opennem/opennem
Energy market data access platform
Python MIT License Updated Feb 8, 2021
A tutorial on how to build a visualization dashboard using Plotly Dash.
plotlydash-flask-tutorial Public
Forked from toddbirchard/plotlydash-flask-tutorial
📊📉 Embed Plotly Dash into your Flask applications.
Less Updated Jan 27, 2021
Data-Analysis Public
Forked from WillKoehrsen/Data-Analysis
Data Science Using Python
Jupyter Notebook MIT License Updated Jan 11, 2021
Machine-Learning-with-Python Public
Forked from susanli2016/Machine-Learning-with-Python
Python code for common Machine Learning Algorithms
website Public
Forked from kubeflow/website
Kubeflow's public website
HTML Creative Commons Attribution 4.0 International Updated Dec 21, 2020
kfserving Public
Forked from kserve/kserve
Serverless Inferencing on Kubernetes