mjcaldev/aws-img-pl

🖼️ Smart Image Processing Pipeline (AWS Serverless)

A serverless image processing pipeline built with AWS, Terraform, and Vue 3. Users upload images directly to S3, which triggers an automated workflow that copies each image to a processed/ prefix (true resizing is planned), detects labels using AWS Rekognition, and stores metadata in DynamoDB.


🛠️ Tech Stack

Frontend

  • Vue 3 (Composition API) - Modern reactive UI
  • Vite - Fast build tool and dev server
  • Vanilla JavaScript - No external dependencies

Backend

  • AWS Lambda (Python 3.11, ARM64) - Serverless compute
  • AWS Step Functions - Workflow orchestration
  • AWS S3 - Object storage with presigned URLs
  • AWS Rekognition - Image label detection
  • AWS DynamoDB - Metadata storage
  • API Gateway HTTP API - RESTful endpoints

Infrastructure

  • Terraform - Infrastructure as Code
  • IAM - Secure role-based access control

🏗️ Architecture

┌─────────────┐
│ Vue Frontend│
└──────┬──────┘
       │ POST /upload-url
       ▼
┌─────────────┐     ┌──────────────┐
│API Gateway  │────▶│ Lambda       │
│             │     │ (Presigned)  │
└─────────────┘     └──────────────┘
       │
       │ PUT (Presigned URL)
       ▼
┌─────────────┐
│   S3 Bucket │
└──────┬──────┘
       │ ObjectCreated Event
       ▼
┌─────────────┐
│ Lambda      │
│ (Trigger)   │
└──────┬──────┘
       │ Start Execution
       ▼
┌──────────────────────────────────────┐
│     Step Functions State Machine     │
│  ┌──────────┐  ┌───────────┐  ┌─────┐│
│  │ Resize   │─▶│Rekognition│─▶│Store││
│  │ Image    │  │  Labels   │  │Meta ││
│  └──────────┘  └───────────┘  └─────┘│
└──────────────────────────────────────┘
       │                   │
       ▼                   ▼
┌─────────────┐     ┌─────────────┐
│   S3        │     │  DynamoDB   │
│ (processed) │     │  (metadata) │
└─────────────┘     └─────────────┘
                           │
                           │ GET /results
                           ▼
                    ┌─────────────┐
                    │ Vue Frontend│
                    │  (Polling)  │
                    └─────────────┘

Key Components

| Component | Purpose |
| --- | --- |
| S3 Bucket | Stores original uploads and processed images. Triggers the pipeline on upload. |
| API Gateway | Exposes REST endpoints for presigned URL generation and results retrieval. |
| Lambda Functions | Six functions: presigned URL, trigger, resize, Rekognition, store metadata, get results. |
| Step Functions | Orchestrates the 3-stage pipeline with retry logic and error handling. |
| Rekognition | Detects up to 10 labels per image with a 70% confidence threshold. |
| DynamoDB | Stores image metadata (key, bucket, labels, timestamps) for frontend polling. |
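
To make the Rekognition limits concrete, here is a minimal sketch of a label-detection call using those values. It is an illustration rather than the repository's code: the function name `detect_image_labels` and the returned shape are assumptions, and the client is passed in (for example `boto3.client("rekognition")`) so the function can be exercised without AWS access.

```python
from typing import Any

MAX_LABELS = 10        # "up to 10 labels per image"
MIN_CONFIDENCE = 70.0  # "70% confidence threshold"


def detect_image_labels(rekognition: Any, bucket: str, key: str) -> list[dict]:
    """Return label names and confidences for an image stored in S3.

    `rekognition` is expected to behave like boto3.client("rekognition").
    """
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=MAX_LABELS,
        MinConfidence=MIN_CONFIDENCE,
    )
    # Keep only the fields a frontend would need (assumed output shape).
    return [
        {"name": label["Name"], "confidence": label["Confidence"]}
        for label in response.get("Labels", [])
    ]
```

Rekognition reads the object straight from S3 here, so the Lambda never has to download the image bytes itself.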

🔄 End-to-End Flow

  1. Upload Request: Frontend requests presigned URL from API Gateway
  2. Direct Upload: User uploads image directly to S3 using presigned URL
  3. Event Trigger: S3 ObjectCreated event invokes trigger Lambda
  4. Pipeline Execution: Step Functions orchestrates:
    • ResizeImage: Copies image to processed/ prefix
    • RekognitionLabels: Detects labels using AWS Rekognition
    • StoreMetadata: Saves results to DynamoDB
  5. Polling: Frontend polls GET /results endpoint until processing completes
  6. Display: Results displayed with detected labels
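
Step 3 of the flow above can be sketched as a small trigger Lambda. This is a hedged illustration, not the repository's handler: the `STATE_MACHINE_ARN` variable name and the `bucket`/`key` input fields are assumptions, and the Step Functions client is injectable so the logic can run without AWS.

```python
import json
import os
from typing import Any
from urllib.parse import unquote_plus


def records_to_inputs(event: dict) -> list[dict]:
    """Convert an S3 ObjectCreated event into Step Functions inputs."""
    inputs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        inputs.append({
            "bucket": s3_info["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 event notifications.
            "key": unquote_plus(s3_info["object"]["key"]),
        })
    return inputs


def handler(event: dict, context: Any, sfn: Any = None) -> dict:
    """Start one state machine execution per uploaded object."""
    if sfn is None:
        import boto3  # imported lazily so the module loads without AWS access
        sfn = boto3.client("stepfunctions")
    state_machine_arn = os.environ["STATE_MACHINE_ARN"]  # assumed variable name
    inputs = records_to_inputs(event)
    for payload in inputs:
        sfn.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps(payload),
        )
    return {"started": len(inputs)}
```

Decoding the key matters for filenames containing spaces or parentheses; without it, later steps would look up an object name that does not exist.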

🐛 Top 3 Critical Errors & Fixes

1. DynamoDB Primary Key Mismatch + Missing Environment Variable

Error:

  • DynamoDB table defined with hash_key = "imageId" but Lambda code wrote "image_key"
  • store_metadata Lambda referenced TABLE_NAME environment variable that wasn't configured

Impact:

  • Runtime failures: DynamoDB operations rejected due to missing primary key
  • Lambda crashes: TABLE_NAME was None, causing Table(None) initialization errors

Fix:

  • Changed DynamoDB table hash key from "imageId" to "image_key" to match Lambda code
  • Added TABLE_NAME environment variable to store_metadata Lambda in Terraform
  • Moved DynamoDB table initialization inside handler to avoid import-time failures

Files Changed:

  • infrastructure/dynamodb.tf (line 5)
  • infrastructure/lambda.tf (lines 55-59)
  • lambdas/store_metadata/handler.py (moved table creation inside handler)
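
The fix described above might look roughly like the following. This is a sketch under assumptions, not the actual `handler.py`: the event fields (`key`, `bucket`, `labels`) and the `processed_at` attribute are hypothetical, and the DynamoDB resource is injectable for testing.

```python
import os
import time
from typing import Any


def handler(event: dict, context: Any, dynamodb: Any = None) -> dict:
    """Persist pipeline results, keyed by the table's hash key `image_key`."""
    # Read configuration at invocation time so a missing variable produces a
    # clear error instead of Table(None) failing while the module is imported.
    table_name = os.environ.get("TABLE_NAME")
    if not table_name:
        raise RuntimeError("TABLE_NAME environment variable is not set")

    if dynamodb is None:
        import boto3  # lazy import keeps the module importable in unit tests
        dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table(table_name)

    item = {
        "image_key": event["key"],         # must match hash_key in dynamodb.tf
        "bucket": event["bucket"],
        "labels": event.get("labels", []),
        "processed_at": int(time.time()),  # assumed attribute name
    }
    table.put_item(Item=item)
    return item
```

One caveat if labels carry confidence scores: the boto3 DynamoDB resource rejects Python floats, so those values would need converting to `Decimal` (or strings) before `put_item`.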

2. S3 Presigned URL Signature Mismatch (403 Forbidden)

Error:

  • Presigned URL generated without ContentType in Params
  • Frontend sent Content-Type header in PUT request
  • S3 rejected uploads with 403 Forbidden due to signature mismatch

Impact:

  • All browser uploads failed silently
  • Users couldn't upload images

Fix:

  • Added ContentType parameter to presigned URL generation in Lambda
  • Frontend sends matching Content-Type header value
  • Implemented dynamic content type support (JPEG, PNG, WEBP)

Files Changed:

  • lambdas/get_presigned_url/handler.py (line 19, added ContentType to Params)
  • frontend/src/App.vue (line 64, sends Content-Type header)
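
A minimal sketch of the presigned-URL fix, assuming a helper named `create_upload_url` (hypothetical) and a 300-second expiry (also an assumption). The key point is that `ContentType` is included in `Params`, so it becomes part of the signature and the browser's `Content-Type` header must match it exactly.

```python
from typing import Any

# Content types the README lists as supported, mapped to file extensions.
ALLOWED_CONTENT_TYPES = {
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/webp": "webp",
}


def create_upload_url(s3: Any, bucket: str, image_id: str, content_type: str,
                      expires_in: int = 300) -> dict:
    """Return a presigned PUT URL whose signature covers Content-Type.

    `s3` is expected to behave like boto3.client("s3").
    """
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported content type: {content_type}")
    key = f"uploads/{image_id}.{ALLOWED_CONTENT_TYPES[content_type]}"
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key, "ContentType": content_type},
        ExpiresIn=expires_in,
    )
    # The browser must send exactly this Content-Type header on the PUT.
    return {"url": url, "key": key, "content_type": content_type}
```

Returning the content type alongside the URL lets the frontend echo it back verbatim instead of deriving it a second time.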

3. Step Functions Schema Validation Error

Error:

  • Terraform apply failed with: "States.ALL must appear alone and at end of list"
  • Retry blocks combined States.ALL with other error types: ["States.TaskFailed", "States.Timeout", "States.ALL"]

Impact:

  • Infrastructure deployment failures
  • State machine couldn't be created

Fix:

  • Changed all Retry blocks to use only ["States.ALL"] (covers all error types)
  • Removed redundant error type specifications

Files Changed:

  • infrastructure/stepfunctions.tf (lines 51, 71, 91)
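
The rule behind this error can be sketched in Python. The helper names below are hypothetical, and the interval and backoff values are assumptions (only the 3 attempts come from this README). The actual definition lives in `stepfunctions.tf`; this only illustrates the shape of a valid `Retry` block and the check that rejected the original one.

```python
def retry_policy(max_attempts: int = 3, interval_seconds: int = 2,
                 backoff_rate: float = 2.0) -> list[dict]:
    """Build a Retry block that uses States.ALL on its own."""
    return [{
        "ErrorEquals": ["States.ALL"],
        "IntervalSeconds": interval_seconds,
        "MaxAttempts": max_attempts,
        "BackoffRate": backoff_rate,
    }]


def validate_retriers(retriers: list[dict]) -> None:
    """Raise ValueError if States.ALL breaks the 'alone and last' rule."""
    for index, retrier in enumerate(retriers):
        errors = retrier["ErrorEquals"]
        if "States.ALL" in errors:
            if len(errors) != 1:
                raise ValueError("States.ALL must appear alone in ErrorEquals")
            if index != len(retriers) - 1:
                raise ValueError("a States.ALL retrier must be the last one")


def task_state(lambda_arn: str, next_state: str) -> dict:
    """Assemble a Task state that retries with the policy above."""
    return {
        "Type": "Task",
        "Resource": lambda_arn,
        "Retry": retry_policy(),
        "Next": next_state,
    }
```

If specific errors ever need different handling, they can go in their own retrier placed before the `States.ALL` entry rather than in the same `ErrorEquals` list.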

✅ Additional Improvements Made

  • Error Handling: Added try/except blocks and logging to most Lambda functions (resize_image and rekognition_labels are still pending; see Planned Next Steps)
  • Step Functions Resilience: Added retry policies (3 attempts, exponential backoff) and catch blocks
  • S3 Security: Enabled server-side encryption (AES256) and CORS configuration
  • Lambda Configuration: Set timeouts (120s) and memory (512MB) for image processing
  • Frontend UX: Implemented two-phase polling (active → background) with timeout protection
  • CORS: Fixed API Gateway and S3 CORS for browser-based uploads

📁 Project Structure

aws-img-pl/
├── infrastructure/            # Terraform IaC
│   ├── main.tf                # Provider configuration
│   ├── variables.tf           # Input variables
│   ├── outputs.tf             # Output values
│   ├── s3.tf                  # S3 bucket, CORS, encryption
│   ├── dynamodb.tf            # DynamoDB table
│   ├── lambda.tf              # Lambda function definitions
│   ├── stepfunctions.tf       # Step Functions state machine
│   ├── api-gateway.tf         # API Gateway routes
│   └── iam.tf                 # IAM roles and policies
├── lambdas/                   # Lambda function code
│   ├── get_presigned_url/     # Generate S3 presigned URLs
│   ├── trigger_step_function/ # S3 event → Step Functions
│   ├── resize_image/          # Copy image to processed/
│   ├── rekognition_labels/    # AWS Rekognition label detection
│   ├── store_metadata/        # Save to DynamoDB
│   └── get_results/           # Query DynamoDB for results
└── frontend/                  # Vue 3 application
    └── src/
        └── App.vue            # Main application component

🚀 Planned Next Steps

High Priority

  1. Fix Key Preservation Issue: Update rekognition_labels Lambda to pass through original key field to prevent data loss in Step Functions state
  2. Add Error Handling: Complete error handling for resize_image and rekognition_labels Lambdas (currently missing try/except blocks)
  3. CloudWatch Monitoring: Add alarms for Lambda errors, Step Functions failures, and Rekognition throttling
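
For the key preservation item, one possible approach is to have the Rekognition Lambda return its input state with the labels merged in, since by default a Task state's result replaces its input. The sketch below is an assumption about how that could look; `with_labels` and the field names are hypothetical.

```python
def with_labels(event: dict, labels: list) -> dict:
    """Return the incoming state plus labels, leaving existing fields intact."""
    # Copy rather than mutate, so the original upload key and any other
    # fields set by earlier steps flow through to StoreMetadata.
    return {**event, "labels": labels}


# Example state as it might arrive from the ResizeImage step.
state = {
    "bucket": "example-bucket",
    "key": "uploads/1234.jpg",
    "processed_key": "processed/1234.jpg",
}
next_state = with_labels(state, ["Dog", "Pet"])
```

An alternative that needs no Lambda change is setting `ResultPath` on the task state so the result is attached under a sub-field of the input instead of replacing it.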

Medium Priority

  1. Implement Actual Image Resizing: Replace file copy with actual image resizing using PIL/Pillow
  2. Add Dead-Letter Queues: Configure DLQs for failed Lambda invocations and Step Functions executions
  3. Tighten IAM Permissions: Scope wildcard permissions to specific resources (Step Functions ARN, etc.)
  4. Add Input Validation: Validate image size, format, and quality before processing
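
As a starting point for the input validation item, format checks can be done on the first few bytes of the file using only the standard library. This is a sketch, not planned code from the repository: the 10 MB limit and both function names are assumptions.

```python
from typing import Optional

MAX_BYTES = 10 * 1024 * 1024  # assumed upload limit (10 MB)


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess the MIME type from the file signature, or None if unknown."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def validate_image(data: bytes, declared_type: str) -> None:
    """Raise ValueError if the payload is too large or not what it claims."""
    if len(data) > MAX_BYTES:
        raise ValueError(f"image exceeds {MAX_BYTES} bytes")
    actual = sniff_image_type(data)
    if actual is None:
        raise ValueError("unrecognized image format")
    if actual != declared_type:
        raise ValueError(f"declared {declared_type} but content is {actual}")
```

Checking the signature guards against a mislabeled upload reaching Rekognition; it does not verify that the rest of the file decodes cleanly.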

Nice to Have

  1. User Authentication: Add AWS Cognito for user management
  2. Image Preview: Display uploaded images in frontend
  3. Batch Processing: Support multiple image uploads
  4. CloudFront CDN: Add CDN for optimized image delivery
  5. S3 Lifecycle Policies: Automate cleanup of old processed images

🔧 Setup & Deployment

Prerequisites

  • AWS CLI configured
  • Terraform >= 1.5.0
  • Node.js >= 20.19.0
  • Python 3.11

Deploy Infrastructure

cd infrastructure
terraform init
terraform plan
terraform apply

Build Lambda Packages

cd lambdas/<function-name>
zip -r build.zip handler.py

Run Frontend

cd frontend
npm install
npm run dev

📊 Current Status

✅ Working Features:

  • End-to-end image upload and processing
  • Presigned URL generation with dynamic content types
  • Step Functions orchestration with retry logic
  • Rekognition label detection
  • DynamoDB metadata storage
  • Frontend polling with background processing
  • CORS configuration for browser uploads

⚠️ Known Issues:

  • Original upload key (uploads/<uuid>.jpg) is dropped in RekognitionLabels step
  • Some Lambda functions lack comprehensive error handling
  • No monitoring/alarms configured

📜 License

MIT

About

I created this for an AWS interview to show familiarity with core cloud functionalities.
