Data Engineering framework written in Python, based on Polars.
Examples of using Aerospike's complex data types (Map and List) to implement common patterns
Price prediction on the Boston Housing dataset
Companion code for Aerospike Modeling: User Profile Store
Production-style ETL pipeline ingesting GitHub's public events API into a dimensional data warehouse using Medallion Architecture (Bronze → Silver → Gold)
🚀 1 Line that saves $1K: Incremental Aggregations with dbt + DuckDB
Data modelling with DynamoDB (Target stores)
This project performs analysis on a loan portfolio dataset to forecast cash flows and evaluate portfolio present value. It includes code written in Python using the pandas and NumPy libraries.
This project fetches Near-Earth Object (NEO) data from NASA's API and stores it into a PostgreSQL database. It includes scripts for fetching data, inserting/updating the database, validating data consistency, and logging pipeline activities.
This project is about hotel reservation and management.
Comparison of various Machine Learning algorithms for Heart Diseases (Heart Attack) prediction.
Implemented SVC on the Olivetti dataset to predict whether a person is wearing glasses, making in-depth use of cross-validation techniques.
A comprehensive LLM data processing system designed to transform raw multi-format data into high-quality training datasets optimized for Large Language Models.
An OLTP banking system is implemented, emphasizing data integrity and atomicity, and leveraging constraints, triggers, stored procedures, views, and indexing to enhance query performance.
Built a data model, data warehouse, and pipeline for extracting, transforming, and loading data into a star-schema-based data model in a Redshift database.
An Apache Airflow data pipeline is designed to perform ELT operations, utilizing Amazon S3 and Amazon Redshift Serverless.
Data Warehouse project on corona data. Analytics are done at country level, region level, and sub_region_level and visualized using Python's Matplotlib.
End-to-end BI pipeline: CMS data → Snowflake star schema → Power BI KPI dashboard