Explore and build data engineering solutions using AWS services with practical projects covering SQL, Big Data, and modern architecture.
Updated Apr 10, 2026
TypeScript library that lets you run and test your AWS Step Functions locally! 🚀
End-to-end serverless data pipeline on AWS that ingests healthcare staffing data from Google Drive, processes it through a multi-layer data lake (raw → refined → curated), enforces data quality, and delivers analytics via Athena and QuickSight with full monitoring, alerting, and failure handling.
Some of my tools and sample code for building with AWS.
Weekly-automated AWS data engineering pipeline that spatially joins GBIF beaver sighting records with USGS water quality data (dissolved oxygen, temperature, pH, turbidity) to identify anomalous monitoring stations near beaver habitat.
Large-scale end-to-end batch web-scraping pipeline that crawls Avature career portals and exports job listings, built with AWS CDK, Athena, Glue, ECS Fargate, S3, DynamoDB, Step Functions, EventBridge Scheduler, and CloudWatch.
AWS Comprehend is an event-driven, serverless data processing pipeline that leverages AWS services to perform natural language processing and analysis on user-submitted text files.
AWS Rekognition is an event-driven, serverless data processing pipeline that leverages AWS services to perform image processing and analysis on user-submitted image files.
Terraform AWS Serverless Feedback Router
A guidebook to developing simple machine learning workflows with Amazon Web Services (AWS). As you dive in, you will learn how to leverage AWS services to build, deploy, and monitor your machine learning models efficiently.
Master the AWS Data Stack! 🚀 This repository features 15+ Industrial Data Engineering Projects covering Serverless ETL, Real-Time Streaming, & Data Warehousing. Hands-on labs for S3, Lambda, Spark, Airflow, Snowflake, Redshift, Kinesis, & Glue. Includes production-grade CI/CD pipelines. A complete roadmap to becoming a top Data Professional.
End-to-end data pipeline that ingests popular movies & TV series from the IMDb/TMDB APIs, enriches and transforms the data using PySpark on a serverless AWS pipeline (Step Functions, Glue, Lambda, S3) following a medallion data architecture (raw → staging → gold), and delivers insights via a Power BI analytics dashboard powered by custom DAX measures.
A simple workflow for developing AWS Step Functions to demonstrate how you can combine AWS Step Functions with AWS Lambda using .NET 8 and the Serverless Application Model (SAM), and expose your workflow via an API Gateway!
Scalable, robust, and production-ready Medallion architecture integrating AWS services with Databricks Delta Lake, featuring dynamic contract validation, idempotency, and automated rejection handling for high-integrity Gold layer generation.
A unit testing toolkit for Amazon States Language
End-to-end serverless AI pipeline on AWS to analyze customer reviews.
🚀 Event-driven ML inference pipeline using AWS Step Functions and Lambda. Orchestrates a SageMaker image classification workflow with automated confidence-threshold filtering and state machine error handling.
🌳 A sustainable Terraform Package which creates Lambda & Step Functions resources on AWS
This project is a core component of the Udacity AWS Machine Learning Engineer Nanodegree.