Please read this disclaimer before
proceeding:
This document is confidential and intended solely for the educational purposes of
RMK Group of Educational Institutions. If you have received this document through
email in error, please notify the system manager. This document contains
proprietary information and is intended only for the respective group / learning
community. If you are not the addressee, you should not disseminate,
distribute, or copy it through e-mail. Please notify the sender immediately by e-mail
if you have received this document by mistake, and delete it from your
system. If you are not the intended recipient, you are notified that disclosing,
copying, distributing, or taking any action in reliance on the contents of this
information is strictly prohibited.
22CS908
CLOUD ARCHITECTING
Department : CSE
Batch/Year : 2022-2026 / III Year
Created by:
Ms. A. Jasmine Gilda, Assistant Professor / CSE
Date : 25.08.2024
1. CONTENTS
S. No. Contents
1 Contents
2 Course Objectives
3 Pre-Requisites
4 Syllabus
5 Course outcomes
6 CO- PO/PSO Mapping
7 Lecture Plan
8 Activity based learning
9 Lecture Notes
10 Assignments
11 Part A Questions & Answers
12 Part B Questions
13 Online Certifications
14 Real Time Applications
15 Assessment Schedule
16 Text Books & Reference Books
17 Mini Project suggestions
2. COURSE OBJECTIVES
▪ To make architectural decisions based on AWS architectural principles
and best practices.
▪ To describe the features and benefits of Amazon EC2 instances, and
compare and contrast managed and unmanaged database services.
▪ To create a secure and scalable AWS network environment with VPC,
and configure IAM for improved security and efficiency.
▪ To use AWS services to make infrastructure scalable, reliable, and
highly available.
▪ To use AWS managed services to enable greater flexibility and
resiliency in an infrastructure.
3. PRE-REQUISITES
• Pre-requisite Chart
22CS908 – CLOUD ARCHITECTING
20CS907 – CLOUD FOUNDATIONS
4. SYLLABUS
22CS908 CLOUD ARCHITECTING (L T P C: 2 0 2 3)
UNIT I INTRODUCING CLOUD ARCHITECTING AND STORAGE LAYER 6 + 6
Cloud architecting - The AWS Well-Architected Framework - AWS global infrastructure -
Amazon S3 - Amazon S3 Versioning - Storing data in Amazon S3 - Moving data to and from
Amazon S3 - Amazon S3 Transfer Acceleration - Choosing Regions for your architecture.
List of Exercise/Experiments:
1. Creating a Static Website for the Café.
2. Configure an S3 bucket to automatically encrypt all uploaded objects.
3. Set up a cross-region replication configuration for an S3 bucket.
UNIT II COMPUTE LAYER AND DATABASE LAYER 6+6
Adding compute with Amazon EC2 - Choosing an Amazon Machine Image (AMI) to launch
an Amazon EC2 instance - Selecting an Amazon EC2 instance type - Using user data to
configure an EC2 instance - Adding storage to an Amazon EC2 instance - Amazon EC2 pricing
options - Amazon EC2 considerations - Database layer considerations - Amazon Relational
Database Service (Amazon RDS) - Amazon DynamoDB - Database security controls -
Migrating data into AWS databases.
List of Exercise/Experiments:
1. Creating a Dynamic Website for the Café.
2. Creating an Amazon RDS database.
3. Migrating a Database to Amazon RDS.
4. Create a web application that stores data in a managed database using EC2 instances
and Amazon RDS.
UNIT III CREATING AND CONNECTING NETWORKS 6+6
Creating an AWS networking environment - Connecting your AWS networking environment
to the internet - Securing your AWS networking environment - Connecting your remote
network with AWS Site-to-Site VPN - Connecting your remote network with AWS Direct
Connect - Connecting virtual private clouds (VPCs) in AWS with VPC peering - Scaling your
VPC network with AWS Transit Gateway - AWS Transit Gateway - Connecting your VPC to
supported AWS services. Securing User and Application Access: Account users and AWS
Identity and Access Management (IAM) - Organizing users - Federating users - Multiple
accounts.
List of Exercise/Experiments:
1. Creating a Virtual Private Cloud.
2. Creating a VPC Networking Environment for the Café.
3. Creating a VPC Peering Connection.
4. Configure a VPC with subnets, an internet gateway, route tables, and a
security group, and connect an on-premises network to the VPC.
UNIT IV RESILIENT CLOUD ARCHITECTURE 6+6
Scaling your compute resources - Scaling your databases - Designing an environment that’s
highly available – Monitoring - Reasons to automate - Automating your infrastructure -
Automating deployments - AWS Elastic Beanstalk - Overview of caching - Edge caching -
Caching web sessions - Caching databases.
List of Exercise/Experiments:
1. Controlling Account Access by Using IAM.
2. Creating Scaling Policies for Amazon EC2 Auto Scaling.
3. Creating a Highly Available Web Application.
4. Creating a Scalable and Highly Available Environment for the Café.
5. Streaming Dynamic Content Using Amazon CloudFront.
UNIT V BUILDING DECOUPLED ARCHITECTURES, MICROSERVICES AND SERVERLESS ARCHITECTURE 6 + 6
Decoupling your architecture - Decoupling with Amazon Simple Queue Service (Amazon SQS)
- Decoupling with Amazon Simple Notification Service (Amazon SNS) - Sending messages
between cloud applications and on-premises with Amazon MQ. Introducing microservices -
Building microservice applications with AWS container services - Introducing serverless
architectures - Building serverless architectures with AWS Lambda - Extending serverless
architectures with Amazon API Gateway - Orchestrating microservices with AWS Step
Functions - Disaster planning strategies - Disaster recovery patterns.
List of Exercise/Experiments:
1. Breaking a Monolithic Node.js Application into Microservices.
2. Implementing a Serverless Architecture on AWS.
3. Implementing a Serverless Architecture for the Café.
4. Creating an AWS Lambda Function and explore using AWS Lambda with Amazon S3.
TOTAL: 60 PERIODS
5. COURSE OUTCOMES
At the end of this course, the students will be able to:
CO1: Explain cloud architecture principles and AWS storage solutions.
CO2: Deploy and manage AWS compute and database resources
securely.
CO3: Design and configure secure AWS networks using VPC and IAM.
CO4: Implement scalable and resilient AWS architectures with high
availability.
CO5: Build decoupled and serverless applications using AWS services
like Lambda.
CO6: Develop disaster recovery strategies for AWS environments.
6. CO - PO / PSO MAPPING

CO  | HKL | PO-1 PO-2 PO-3 PO-4 PO-5 PO-6 PO-7 PO-8 PO-9 PO-10 PO-11 PO-12 | PSO-1 PSO-2 PSO-3
CO1 | K3  |  2    1    -    -    2    -    -    3    -    2     -     2    |   3     2     2
CO2 | K3  |  2    2    -    -    2    -    -    2    2    2     -     2    |   3     2     2
CO3 | K3  |  2    2    -    -    2    -    -    2    2    2     -     2    |   3     2     2
CO4 | K3  |  2    2    3    -    2    -    -    2    2    2     -     2    |   3     3     2
CO5 | K3  |  2    2    3    -    2    -    -    -    2    2     -     2    |   3     3     3
CO6 | K3  |  2    2    3    -    2    -    -    -    2    2     -     2    |   3     3     3
Correlation Level:
1. Slight (Low)
2. Moderate (Medium)
3. Substantial (High)
If there is no correlation, put “-”.
7. LECTURE PLAN

S. No. | Topic | Number of Periods | Proposed Date | Actual Date | CO | Taxonomy Level | Mode of Delivery
1 | Scaling your compute resources - Scaling your databases | 1 | - | - | CO4 | K2 | PPT/Demo
2 | Designing an environment that's highly available - Monitoring | 1 | - | - | CO4 | K2 | PPT/Demo
3 | Reasons to automate - Automating your infrastructure | 1 | - | - | CO4 | K2 | PPT/Demo
4 | Automating deployments - AWS Elastic Beanstalk | 1 | - | - | CO4 | K2 | PPT/Demo
5 | Overview of caching - Edge caching | 1 | - | - | CO4 | K2 | PPT/Demo
6 | Caching web sessions - Caching databases | 1 | - | - | CO4 | K2 | PPT/Demo
8. ACTIVITY BASED LEARNING
Database Scaling Role Play:
• Activity: Divide students into groups representing different database scaling strategies (vertical,
horizontal, Amazon Aurora, etc.). Each group will present their strategy and its pros/cons.
• Outcome: Students will gain insights into various database scaling methods and their applications.
Load Balancer Comparison Debate:
• Activity: Hold a debate on the merits of different types of load balancers. Students will research
and present their findings on which balancer suits specific scenarios.
• Outcome: Critical thinking on the best load balancing solutions for varying workloads.
9. UNIT IV - LECTURE NOTES
SCALING YOUR COMPUTE RESOURCES
What is elasticity?
One key characteristic of a reactive architecture is elasticity, which refers to the ability of the
infrastructure to dynamically expand and contract in response to changing capacity needs. This flexibility
allows you to acquire resources when demand increases and release them when they are no longer
required.
Elasticity empowers you to:
• Scale up the number of web servers when your application experiences a spike in traffic.
• Reduce the write capacity of your database when traffic decreases, optimizing costs.
• Manage daily fluctuations in demand across your architecture effectively.
For instance, in a café scenario, elasticity becomes crucial after the business is featured on a TV show.
The website might experience a sudden surge in traffic, which could return to normal levels after a week
or spike again during holiday seasons. Elasticity ensures that the infrastructure can handle these
variations smoothly, maintaining performance without over-provisioning resources.
What is scaling?
Scaling is a key technique used to achieve elasticity, allowing you to adjust the compute capacity of your
application by increasing or decreasing resources based on demand.
There are two types of scaling:
• Horizontal Scaling (Scaling Out/In): This approach involves adding or removing resources to
meet demand. For example, you might add more servers to support an application or expand a
storage array by adding more hard drives. Scaling out refers to adding resources, while scaling in
means terminating them. Horizontal scaling is ideal for building internet-scale applications,
leveraging the cloud's elasticity to dynamically adjust resource availability.
• Vertical Scaling (Scaling Up/Down): This method involves increasing or decreasing the
specifications of a single resource, such as upgrading a server with a larger hard drive or a faster
CPU. In AWS, Amazon EC2 allows you to stop an instance and resize it to a more powerful instance
type with enhanced RAM, CPU, I/O, or networking capabilities. Although vertical scaling is easy to
implement and suitable for many short-term use cases, it has limitations in scalability, cost
efficiency, and availability compared to horizontal scaling.
Fig: Horizontal Scaling
Fig: Vertical Scaling
Amazon EC2 Auto Scaling:
Amazon EC2 Auto Scaling automatically manages the scaling of your application to maintain high
availability and performance. It adjusts the number of EC2 instances based on predefined conditions,
schedules, or health checks, launching or terminating instances as demand changes.
Key features of Amazon EC2 Auto Scaling include:
• Dynamic Instance Management: Launches or terminates instances automatically based on
your scaling policies, ensuring your application always has the right capacity.
• Integration with Elastic Load Balancing (ELB): Automatically registers new instances with
load balancers, evenly distributing traffic to maintain performance.
• Multi-Availability Zone Deployment: Supports deployments across multiple Availability Zones
within a region, enhancing resilience. If an Availability Zone becomes unhealthy, Auto Scaling
launches instances in other zones and redistributes them when the affected zone recovers.
Amazon EC2 Auto Scaling enables you to build resilient, highly available applications that adapt to
changing demand without manual intervention.
Scaling options:
Amazon EC2 Auto Scaling offers various ways to automatically adjust the scaling of your applications
based on different needs:
1. Scheduled Scaling: This method allows you to schedule scaling actions based on specific dates
and times, making it ideal for predictable workloads. For example, if your web app traffic spikes
every Wednesday, stays high on Thursday, and drops on Friday, you can set up scheduled scaling
to handle these patterns by creating a scheduled action (a code sketch follows this list).
2. Dynamic (On-Demand) Scaling: Dynamic scaling automatically adjusts your resources in
response to real-time demand changes. For instance, if your web application runs on two EC2
instances and you want to maintain around 50% CPU utilization, dynamic scaling will automatically
add or remove instances to keep performance balanced during traffic spikes without over-
provisioning idle resources.
3. Predictive Scaling: Predictive scaling uses AWS machine learning models to forecast demand
based on historical usage patterns and billions of data points. It predicts expected traffic, including
daily and weekly trends, to create a scaling plan that automatically adjusts EC2 instances ahead
of time. This model is updated every 24 hours to optimize scaling for the next 48 hours.
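As referenced in the scheduled scaling item above, the following is a minimal boto3 sketch of a
scheduled action for the Wednesday traffic pattern; the group name, schedule, and capacities are
hypothetical.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Hypothetical example: raise capacity every Wednesday at 08:00 UTC
# ahead of the known weekly traffic spike.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="cafe-web-asg",      # hypothetical group name
    ScheduledActionName="wednesday-scale-out",
    Recurrence="0 8 * * 3",                   # cron expression: 08:00 UTC every Wednesday
    MinSize=2,
    MaxSize=8,
    DesiredCapacity=6,
)
```

A matching action with a Friday recurrence (for example, "0 8 * * 5") could scale the group back in
once the weekly spike subsides.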
Dynamic scaling policy types:
Dynamic scaling adjusts your application’s capacity to meet changing demand, optimizing availability,
performance, and cost. The type of scaling policy you choose determines how scaling actions are
performed:
1. Simple Scaling: Makes a single scaling adjustment when a threshold is breached. It's useful for
new or spiky workloads.
2. Step Scaling: Adjusts capacity based on the size of the alarm breach. Larger breaches lead to
bigger adjustments. It’s ideal for predictable workloads.
3. Target Tracking Scaling: Automatically adjusts capacity to maintain a specific target value for
a chosen metric. It works like a thermostat—set the target, and the system adjusts capacity to
keep the metric close to that target. This is especially useful for horizontally scalable applications,
like load-balanced or batch processing apps.
With step scaling and simple scaling, you define metrics and threshold values for CloudWatch alarms,
which trigger scaling when conditions are met. The key difference is that step scaling makes varying
adjustments depending on how much the metric exceeds the threshold, while simple scaling makes one
fixed adjustment.
Target tracking scaling automates the process, adjusting capacity based on a set target metric, such
as CPU utilization. Amazon EC2 Auto Scaling manages the alarms and scaling adjustments to keep your
application running smoothly as demand fluctuates.
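To make target tracking concrete, here is a minimal boto3 sketch that holds a group's average CPU
utilization near the 50 percent target from the earlier example; the group and policy names are
hypothetical.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: Amazon EC2 Auto Scaling creates and manages the
# CloudWatch alarms needed to hold the metric near the target value.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="cafe-web-asg",      # hypothetical group name
    PolicyName="keep-cpu-near-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,   # thermostat-style target: ~50% average CPU
    },
)
```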
Auto Scaling groups:
An Auto Scaling group in Amazon EC2 Auto Scaling defines three key parameters:
• Minimum Capacity: The minimum number of instances that the group will always maintain.
• Maximum Capacity: The maximum number of instances the group can scale up to.
• Desired Capacity: The target number of instances that should be running at any given time,
which can change in response to scaling events.
Amazon EC2 Auto Scaling ensures you have the right number of EC2 instances to handle your application’s
load by managing these parameters.
• The desired capacity reflects the current number of running instances and can adjust based on
scaling events, such as when a threshold is breached. It always stays between the specified
minimum and maximum values.
• Scaling policies automatically adjust the desired capacity based on real-time conditions, ensuring
the group dynamically scales to meet demand.
When you initially set the desired capacity, it indicates how many instances you want running. However,
the actual number of running instances may vary until Amazon EC2 Auto Scaling adjusts to match your
desired setting by launching or terminating instances.
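A minimal boto3 sketch of creating a group with the three capacity parameters, assuming a launch
template and two subnets in different Availability Zones already exist; all names and IDs are
hypothetical.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="cafe-web-asg",                  # hypothetical
    LaunchTemplate={
        "LaunchTemplateName": "cafe-web-template",        # hypothetical, created beforehand
        "Version": "$Latest",
    },
    MinSize=2,            # never fewer than 2 instances
    MaxSize=8,            # never more than 8 instances
    DesiredCapacity=2,    # initial target; scaling policies adjust it later
    # Two subnets in different Availability Zones for resilience:
    VPCZoneIdentifier="subnet-0aaa1111,subnet-0bbb2222",  # hypothetical IDs
)
```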
Amazon EC2 Auto Scaling: Purchasing options:
Amazon EC2 Auto Scaling helps you automatically scale your infrastructure up or down in response to
changing conditions. When setting up an Auto Scaling group, you can define the EC2 instance types it
uses and specify the mix of On-Demand, Reserved, and Spot Instances to meet your desired capacity at
the lowest possible cost.
Key points:
• You can select a single instance type, but it’s best to use multiple types to avoid capacity shortages.
• You can specify the percentage of desired capacity to be fulfilled by each purchasing option (On-
Demand, Reserved, Spot).
• Auto Scaling prioritizes the lowest-cost combination of instances based on your settings.
• If a Spot Instance request cannot be fulfilled in one pool, the group will try other Spot Instance
pools before turning to On-Demand Instances, ensuring better availability and cost efficiency.
This approach ensures that your Auto Scaling group is flexible, cost-effective, and resilient against
capacity constraints.
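The purchasing mix described above is configured through a mixed instances policy. The following
sketch is illustrative only; the instance types, percentages, and names are hypothetical.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="cafe-mixed-asg",                # hypothetical
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=4,
    VPCZoneIdentifier="subnet-0aaa1111,subnet-0bbb2222",  # hypothetical subnet IDs
    MixedInstancesPolicy={
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "cafe-web-template",  # hypothetical
                "Version": "$Latest",
            },
            # Several instance types reduce the risk of capacity shortages.
            "Overrides": [
                {"InstanceType": "m5.large"},
                {"InstanceType": "m5a.large"},
                {"InstanceType": "m4.large"},
            ],
        },
        "InstancesDistribution": {
            # 40% of capacity above the base is On-Demand; the rest is Spot.
            "OnDemandPercentageAboveBaseCapacity": 40,
            "SpotAllocationStrategy": "capacity-optimized",
        },
    },
)
```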
Automatic scaling considerations:
When using Amazon EC2 Auto Scaling to manage your architecture, consider the following key aspects:
1. Types of Automatic Scaling: Use a combination of scaling methods like scheduled, dynamic,
and predictive scaling to adapt to varying workloads.
2. Dynamic Scaling Policies:
o Simple Scaling: Adjusts capacity with a single scaling action based on an alarm.
o Step Scaling: Adjusts capacity with multiple scaling steps depending on how much the
metric breaches the threshold.
o Target Tracking Scaling: Adjusts capacity to maintain a specific target metric, such as
average CPU utilization or requests per target.
3. Multiple Metrics: Scale on more than just CPU utilization. Use target tracking policies to scale
based on metrics that vary inversely with capacity, such as load balancer request counts, to closely
match the demand curve.
4. Scaling Timing: Scale out quickly to meet demand spikes and scale in slowly to avoid resource
shortages.
5. Lifecycle Hooks: Lifecycle hooks allow custom actions during instance launch or termination.
Instances can be paused in a wait state for up to an hour, giving you time to perform tasks like
configuration or cleanup before the scaling action is finalized.
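As a sketch of lifecycle hooks (item 5), the following pauses newly launching instances for up to
an hour so that configuration can complete before they enter service; the names are hypothetical.

```python
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_lifecycle_hook(
    AutoScalingGroupName="cafe-web-asg",                      # hypothetical
    LifecycleHookName="wait-for-bootstrap",
    LifecycleTransition="autoscaling:EC2_INSTANCE_LAUNCHING",
    HeartbeatTimeout=3600,      # hold the instance in a wait state for up to 1 hour
    DefaultResult="CONTINUE",   # proceed with the launch if nothing completes the hook
)
```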
SCALING YOUR DATABASES
Vertical scaling with Amazon RDS: Push-button scaling:
When scaling relational databases with Amazon RDS, you can vertically scale your database instance by
changing its instance class to meet the growing demands of your application. Here's what to keep in
mind:
• Vertical Scaling: You can scale your Amazon RDS DB instance up or down by adjusting the
instance class (e.g., from micro to 24xlarge) to increase or decrease compute resources.
• Minimal Downtime: Scaling operations involve brief downtime, usually lasting only a few
minutes, typically during your maintenance window unless you opt to apply changes immediately.
• Storage Remains Unchanged: Scaling the instance class does not affect your storage size. To
increase storage capacity or improve performance, you can modify the allocated storage or change
the storage type (e.g., upgrading from General Purpose SSD to Provisioned IOPS SSD).
• Automatic Storage Scaling: Instead of manually adjusting storage, use Amazon RDS Storage
Autoscaling to automatically expand storage in response to increased database workloads.
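A minimal boto3 sketch of push-button vertical scaling, also enabling storage autoscaling through a
maximum storage threshold; the identifier, instance class, and limit are hypothetical.

```python
import boto3

rds = boto3.client("rds")

rds.modify_db_instance(
    DBInstanceIdentifier="cafe-db",     # hypothetical
    DBInstanceClass="db.r5.large",      # vertical scale: change the instance class
    MaxAllocatedStorage=1000,           # storage autoscaling up to 1,000 GiB
    ApplyImmediately=False,             # apply during the next maintenance window
)
```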
Horizontal scaling with Amazon RDS: Read replicas:
In addition to vertical scaling, Amazon RDS allows you to horizontally scale your database to handle read-
heavy workloads. Here are the key points:
• Read Replicas: You can create up to five read replicas (and up to 15 for Amazon Aurora) from
a source DB instance using built-in replication functionality in engines like MariaDB, MySQL, Oracle,
and PostgreSQL.
• Asynchronous Replication: Updates made to the source DB instance are asynchronously copied
to the read replicas, allowing for reduced load on the source by routing read queries to the replicas.
• Improved Performance: By distributing read traffic across multiple replicas, you can increase
the aggregate read throughput, making it easier to handle high-volume application demands.
• Disaster Recovery: In case of a failure, you can promote a read replica to become a standalone
DB instance, enhancing your database's availability.
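A sketch of creating a read replica and later promoting it for disaster recovery; the instance
identifiers are hypothetical.

```python
import boto3

rds = boto3.client("rds")

# Horizontal scaling: add a read replica that receives asynchronous updates.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="cafe-db-replica-1",   # hypothetical replica name
    SourceDBInstanceIdentifier="cafe-db",       # hypothetical source instance
)

# Disaster recovery: promote the replica to a standalone DB instance.
rds.promote_read_replica(DBInstanceIdentifier="cafe-db-replica-1")
```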
Scaling with Amazon Aurora:
Amazon Aurora enhances the benefits of read replicas with a purpose-built, SSD-backed virtual storage
layer designed for database workloads. It is a cloud-native, MySQL and PostgreSQL-compatible relational
database engine managed by Amazon RDS, which handles tasks like provisioning, patching, backups,
recovery, failure detection, and repair.
Key features of Amazon Aurora include:
• Scalable Storage: The storage system can automatically scale up to 64 TB per database
instance, ensuring high performance and availability.
• Fault-Tolerant Architecture: Aurora has a distributed, self-healing storage system that
provides continuous backup to Amazon S3 and replication across three Availability Zones for
improved reliability.
• DB Cluster Structure: An Aurora DB cluster consists of one or more DB instances and a cluster
volume that manages data for those instances.
The two types of instances in an Aurora cluster are:
1. Primary DB Instance: Handles both read and write operations and performs all data
modifications. Each cluster has one primary instance.
2. Aurora Replicas: Connect to the same storage volume as the primary instance and support only
read operations. You can have up to 15 replicas per cluster to increase read throughput.
You can choose the instance class size and adjust the number of Aurora replicas based on your workload.
This flexibility is particularly effective for predictable workloads, allowing you to manually scale capacity
as needed.
Amazon Aurora Serverless:
In environments with unpredictable workloads, such as retail websites during sales events or
development and testing environments, it can be challenging to configure the right database capacity.
This often leads to paying for unused capacity, especially during long periods of low activity.
Aurora Serverless offers a solution for these scenarios. It is an on-demand, automatically scaling
configuration for Amazon Aurora that allows your database to start, stop, and adjust capacity based on
your application’s needs without requiring you to manage database instances.
Key features of Aurora Serverless include:
• Automatic Scaling: You create a database endpoint without specifying the instance class size.
Instead, you define Aurora Capacity Units (ACUs), which combine processing and memory
capacity. The database can automatically scale from 10 GiB to 64 TiB of storage.
• Proxy Fleet: The database endpoint connects to a proxy fleet that routes workloads to a pool of
resources, scaling automatically within your specified minimum and maximum capacity limits.
• Cost Efficiency: You pay on a per-second basis for the capacity used while the database is active,
allowing you to save on costs during idle times. You can also easily migrate between standard and
serverless configurations.
This makes Aurora Serverless ideal for infrequent, intermittent, or unpredictable workloads, providing
flexibility and cost savings without the need for constant capacity management.
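A sketch of creating an Aurora Serverless (v1-style) cluster where capacity is expressed in ACUs
instead of an instance class; the identifier, engine choice, capacity range, and credentials are
hypothetical.

```python
import boto3

rds = boto3.client("rds")

rds.create_db_cluster(
    DBClusterIdentifier="cafe-serverless-db",   # hypothetical
    Engine="aurora-mysql",
    EngineMode="serverless",
    MasterUsername="admin",
    MasterUserPassword="example-password-123",  # use AWS Secrets Manager in practice
    ScalingConfiguration={
        "MinCapacity": 1,                  # Aurora Capacity Units (ACUs)
        "MaxCapacity": 8,
        "AutoPause": True,                 # pause after idle periods to save cost
        "SecondsUntilAutoPause": 300,
    },
)
```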
Horizontal scaling: Database sharding:
Sharding, or horizontal partitioning, is a widely used technique to enhance write performance in relational
databases. Here’s a simplified overview:
• What is Sharding? Sharding involves dividing data into smaller subsets called shards, which are
distributed across multiple database servers. Each shard operates independently, typically using
the same hardware, database engine, and data structure for consistent performance.
• Example: Without sharding, all data (like employee IDs) is stored in one database. With sharding,
you might place even-numbered employee IDs in one shard and odd-numbered IDs in another.
• Benefits: Sharding improves scalability and fault tolerance. If one shard encounters hardware
issues, it does not affect the others, isolating potential failures and performance slowdowns.
• Considerations: While sharding enhances performance, it requires careful engineering for data
mapping and routing logic. Queries that need to read or join data from multiple shards can
experience higher latency compared to a non-sharded setup.
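Because shard routing lives in the application, no AWS API is involved. A minimal sketch of the
even/odd employee-ID example above, with hypothetical shard endpoints:

```python
# Hypothetical shard map: even employee IDs go to shard 0, odd IDs to shard 1.
SHARDS = {
    0: "db-shard-even.example.internal",   # hypothetical endpoint
    1: "db-shard-odd.example.internal",    # hypothetical endpoint
}

def shard_for_employee(employee_id: int) -> str:
    """Route a query to the shard that owns this employee ID."""
    return SHARDS[employee_id % 2]

print(shard_for_employee(1024))  # db-shard-even.example.internal
print(shard_for_employee(7))     # db-shard-odd.example.internal
```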
Scaling with Amazon DynamoDB: On-Demand:
For unpredictable workloads needing a nonrelational database, Amazon DynamoDB On-Demand
offers a flexible billing option. Here’s a simplified overview:
• Flexible Pricing: Unlike the provisioned pricing model, DynamoDB On-Demand uses a pay-per-
request pricing structure, allowing you to serve thousands of requests per second without needing
to plan for capacity.
• Automatic Scaling: DynamoDB can quickly adapt to spikes in traffic. If your workload
experiences sudden increases, it scales up rapidly to handle the demand, making it ideal for
unpredictable usage patterns.
• Capacity Changes: You can switch a table from provisioned capacity to on-demand once per
day. However, you can change from on-demand to provisioned capacity as often as you need.
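Switching an existing table to on-demand is a single call. A sketch with a hypothetical table name:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Pay-per-request billing: no capacity planning required.
dynamodb.update_table(
    TableName="CafeOrders",          # hypothetical
    BillingMode="PAY_PER_REQUEST",
)
```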
Scaling with Amazon DynamoDB: Auto scaling:
Amazon DynamoDB includes an auto scaling feature that is enabled by default. Here’s a simplified
overview:
• Automatic Capacity Adjustment: DynamoDB auto scaling automatically adjusts read and write
throughput based on changing request volumes, ensuring zero downtime. You set your desired
utilization target along with minimum and maximum limits, and DynamoDB handles the scaling for
you.
• Monitoring: The auto scaling feature works with Amazon CloudWatch to continuously monitor
actual throughput. If usage deviates from your target, DynamoDB automatically scales the
capacity up or down.
• No Extra Costs: There are no additional charges for using DynamoDB auto scaling; you only pay
for your regular DynamoDB usage and any CloudWatch alarms.
This makes managing throughput capacity easy and efficient for your applications.
How to implement DynamoDB auto scaling:
Amazon DynamoDB auto scaling utilizes the Application Auto Scaling service to automatically adjust
provisioned throughput based on actual traffic patterns. Here’s a simplified overview of how it works:
• Dynamic Adjustment: The service allows a table or global secondary index (GSI) to increase its
read and write capacity to handle sudden traffic spikes without throttling. When traffic decreases,
it reduces capacity, so you only pay for what you use.
Steps to Implement DynamoDB Auto Scaling:
1. Create a Scaling Policy: Define whether to scale read, write, or both capacities, and set the
minimum and maximum provisioned capacity limits for your table or index.
2. Monitor Capacity: DynamoDB sends consumed capacity metrics to Amazon CloudWatch.
3. Trigger Alarms: If the consumed capacity exceeds or falls below your target utilization for a set
period, CloudWatch triggers an alarm. You can set up Amazon SNS to receive notifications.
4. Evaluate Policy: The CloudWatch alarm activates Application Auto Scaling to check your scaling
policy.
5. Adjust Throughput: Application Auto Scaling sends an UpdateTable request to DynamoDB to
adjust your table's provisioned throughput accordingly.
6. Dynamic Changes: DynamoDB processes the request, increasing or decreasing the throughput
to meet your target utilization.
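Steps 1 and 5 of this workflow can be sketched against the Application Auto Scaling API; the table
name, capacity limits, and 70 percent target are hypothetical.

```python
import boto3

aas = boto3.client("application-autoscaling")

# Step 1: register the table's read capacity as a scalable target.
aas.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/CafeOrders",                         # hypothetical table
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    MinCapacity=5,
    MaxCapacity=500,
)

# Attach a target tracking policy; Application Auto Scaling then manages
# the CloudWatch alarms and the UpdateTable calls on your behalf.
aas.put_scaling_policy(
    PolicyName="orders-read-scaling",
    ServiceNamespace="dynamodb",
    ResourceId="table/CafeOrders",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,   # hypothetical target utilization (%)
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization",
        },
    },
)
```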
Scaling throughput capacity: DynamoDB adaptive capacity:
DynamoDB Adaptive Capacity helps manage uneven access patterns by allowing continued reading
and writing to hot partitions without throttling. Here’s how it works:
• Hot Partitions: When data access is imbalanced, certain partitions may receive significantly more
read and write traffic, leading to what's known as a hot partition. If a single partition exceeds
3,000 read capacity units (RCUs) or 1,000 write capacity units (WCUs), throttling can occur.
• Automatic Adjustment: Adaptive capacity automatically increases the throughput for partitions
experiencing higher traffic, ensuring your application can keep functioning smoothly.
• Capacity Limits: While it helps prevent throttling, the total traffic still cannot exceed the table's
overall provisioned capacity or the maximum capacity for that partition.
• Default Feature: Adaptive capacity is enabled by default for all DynamoDB tables, so you don’t
need to manually turn it on or off.
This feature allows for better handling of unpredictable workloads without compromising performance.
Adaptive capacity example:
The following example shows how DynamoDB's adaptive capacity functions:
• Provisioned Capacity: The example table has 400 write capacity units (WCUs) evenly distributed
across four partitions, allowing each partition to handle up to 100 WCUs per second.
• Traffic Distribution: Partitions 1, 2, and 3 receive 50 WCUs per second each, while Partition 4
experiences higher traffic at 150 WCUs per second. Initially, Partition 4 can manage this load due
to unused burst capacity but will eventually throttle if the traffic exceeds 100 WCUs.
• Adaptive Response: DynamoDB's adaptive capacity automatically increases the capacity of
Partition 4, enabling it to handle the higher workload of 150 WCUs per second without throttling.
DESIGNING AN ENVIRONMENT THAT’S HIGHLY AVAILABLE
Highly available systems:
To ensure a system is highly available, it should meet the following criteria:
• Resilience: It can endure some degradation while still being accessible.
• Minimal Downtime: Downtime is kept to a minimum.
• Low Human Intervention: It requires little manual oversight.
• Effective Recovery: It can recover from failures or switch to a backup source within an
acceptable timeframe, even if performance is slightly degraded.
Best practices in architecture design emphasize avoiding single points of failure. A highly available system
is designed to maintain service continuity despite challenges like increased load, attacks, or component
failures, ensuring a robust and resilient experience.
Elastic Load Balancing:
Elastic Load Balancing (ELB) is a managed service that distributes incoming application traffic across
multiple targets, such as EC2 instances, containers, IP addresses, and Lambda functions. It can be set
up as either an external-facing load balancer for public traffic or an internal-facing one for private traffic.
Key features of ELB include:
• Automatic Traffic Distribution: ELB evenly distributes traffic across targets in one or multiple
Availability Zones, adapting to varying loads.
• Health Monitoring: It performs health checks on registered targets by sending pings or requests
to ensure they are functioning. Only healthy targets receive traffic.
• DNS Integration: Each load balancer is assigned a default Domain Name System (DNS) name
for easy access.
Using ELB is essential for building a highly available architecture, as it ensures that your application
remains responsive and resilient to failures.
Types of load balancers:
Elastic Load Balancing (ELB) offers three types of load balancers, all designed to ensure high availability,
automatic scaling, and robust security for fault-tolerant applications:
1. Application Load Balancer (ALB):
o Operates at layer 7 of the OSI model (application level).
o Routes traffic based on the content of requests, ideal for HTTP and HTTPS traffic.
o Supports advanced features like request routing for microservices and container-based
applications.
o Ensures up-to-date SSL/TLS security protocols.
2. Network Load Balancer (NLB):
o Operates at layer 4 of the OSI model (network transport level).
o Routes TCP and UDP traffic and can handle millions of requests per second with low latency.
o Optimized for sudden, unpredictable traffic patterns.
3. Classic Load Balancer:
o Provides basic load balancing for HTTP, HTTPS, TCP, and SSL traffic.
o Operates at both the application and network levels.
o Considered an older option; AWS recommends using ALB or NLB when possible.
Additionally, you can access internal load balancers from another VPC using VPC peering, which supports
both intra-Region and inter-Region connectivity.
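A sketch of provisioning an internet-facing Application Load Balancer with a target group and
listener; every name and ID is hypothetical.

```python
import boto3

elbv2 = boto3.client("elbv2")

lb = elbv2.create_load_balancer(
    Name="cafe-alb",                                   # hypothetical
    Subnets=["subnet-0aaa1111", "subnet-0bbb2222"],    # two Availability Zones
    SecurityGroups=["sg-0ccc3333"],                    # hypothetical security group
    Scheme="internet-facing",
    Type="application",                                # layer 7 (ALB)
)
lb_arn = lb["LoadBalancers"][0]["LoadBalancerArn"]

tg = elbv2.create_target_group(
    Name="cafe-web-targets",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0ddd4444",                              # hypothetical
    TargetType="instance",
    HealthCheckPath="/",       # only healthy targets receive traffic
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Forward incoming HTTP traffic to the target group.
elbv2.create_listener(
    LoadBalancerArn=lb_arn,
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)
```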
Implementing high availability:
To create a highly available application, it's best to deploy resources across multiple Availability Zones
and use a load balancer to manage traffic. This approach ensures greater availability in case of a data
center failure.
In a basic setup, two web servers run on EC2 instances in different Availability Zones. These instances
are connected to an Elastic Load Balancer, which distributes incoming traffic between them. If one server
goes down, the load balancer stops sending traffic to the unhealthy instance and directs it to the healthy
one, keeping the application available even during a failure.
Most applications can effectively utilize two Availability Zones within an AWS Region. However, if your
data sources only support primary/secondary failover, additional Availability Zones may not provide
significant benefits. Since Availability Zones are physically separated, having resources in three or more
zones might not yield much advantage unless you're using services like Amazon DynamoDB or heavily
relying on EC2 Spot Instances.
Example of a highly available architecture:
In this architecture, EC2 instances are placed behind a load balancer that distributes incoming public
traffic among them. If one server becomes unavailable, the load balancer automatically stops sending
traffic to the unhealthy instance and redirects it to the healthy ones.
You can include a second load balancer in your architecture to route inbound traffic from the instances
in the public subnets to the instances in the private subnets.
If you have resources in multiple Availability Zones and they share one NAT gateway—and if the NAT
gateway’s Availability Zone is down—resources in the other Availability Zones lose internet access. It’s a
best practice to have NAT gateways in both Availability Zones to ensure high availability.
Amazon Route 53:
Amazon Route 53 is a scalable and highly available cloud DNS service that connects user requests to
applications both within and outside of AWS. It translates domain names, like example.com, into IP
addresses (e.g., 192.0.2.1) that computers use to communicate.
With Route 53, you can configure DNS health checks to ensure traffic is routed to healthy endpoints and
monitor your application's health. The service also provides domain name registration, allowing you to
purchase and manage domain names while automatically setting up DNS for them.
Route 53 offers various routing options, which can be combined with DNS failover to create low-latency,
fault-tolerant architectures.
Amazon Route 53 supported routing:
Amazon Route 53 supports various routing policies to manage how it responds to DNS queries:
1. Simple Routing: Distributes requests evenly across all servers.
2. Weighted Round Robin Routing: Assigns weights to different servers to control the frequency
of responses. For example, if one server has a weight of 3 and another a weight of 1, the first will
receive 75% of the traffic, while the second gets 25%. This is useful for A/B testing.
3. Latency-Based Routing (LBR): Routes users to the AWS region with the lowest latency,
improving response times by directing traffic to the fastest endpoint.
4. Geolocation Routing: Routes traffic based on users' geographic locations. This allows for
localized content delivery and can ensure compliance with distribution rights.
5. Geoproximity Routing: Routes traffic based on the physical distance between users and
resources, allowing you to adjust traffic distribution by specifying a bias.
6. Failover Routing: Provides active-passive failover, redirecting users to alternate locations if a
primary site goes down. Route 53 monitors the health of endpoints to ensure availability.
7. Multivalue Answer Routing: Routes traffic approximately randomly to multiple resources. You
can associate health checks with each record, allowing Route 53 to respond with healthy server
IPs when resolving DNS queries.
These routing options help enhance availability, performance, and user experience for your applications.
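As an illustration of weighted round robin routing (weights 3 and 1 giving the 75/25 split
described above), here is a boto3 sketch; the hosted zone ID, record name, and IP addresses are
hypothetical.

```python
import boto3

route53 = boto3.client("route53")

def weighted_record(identifier: str, weight: int, ip: str) -> dict:
    """Build one weighted A record for www.example.com."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "www.example.com",
            "Type": "A",
            "SetIdentifier": identifier,   # distinguishes records that share a name
            "Weight": weight,
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        },
    }

route53.change_resource_record_sets(
    HostedZoneId="Z0123456789ABCDEFGHIJ",   # hypothetical
    ChangeBatch={
        "Changes": [
            weighted_record("server-a", 3, "192.0.2.1"),   # ~75% of traffic
            weighted_record("server-b", 1, "192.0.2.2"),   # ~25% of traffic
        ]
    },
)
```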
Multi-Region high availability and DNS:
With DNS failover routing, Amazon Route 53 can detect when your website is down and automatically
redirect users to alternate locations where your application is running properly. By enabling this feature,
Route 53 uses health-checking agents to monitor the availability of each endpoint. This helps improve
the availability of your customer-facing applications.
MONITORING
Monitoring usage, operations, and performance:
Monitoring is crucial for a reactive architecture. It helps you:
• Track the performance and operation of your resources.
• Monitor resource usage and application performance to ensure your infrastructure meets demand.
• Determine the right permissions for your AWS resources to achieve your security objectives.
Monitoring your costs:
To build a more flexible and cost-effective architecture, it's important to track your spending. Monitoring
can help you manage your AWS infrastructure costs. AWS offers several tools for this:
• AWS Cost Explorer: Visualizes and manages your AWS costs and usage with daily or monthly
views, helping you identify spending patterns over the past 13 months.
• AWS Budgets: Allows you to set custom budgets and sends alerts if your costs or usage exceed
your budgeted limits.
• AWS Cost and Usage Report: Provides detailed data on your AWS costs and usage, along with
metadata about services, pricing, and reservations.
• Cost Optimization Monitor: Automatically processes billing reports to give you granular metrics
in a customizable dashboard, helping you analyze service usage and costs by period, account,
resource, or tags.
Amazon CloudWatch:
Amazon CloudWatch is a monitoring and observability service designed for DevOps engineers,
developers, and IT managers. It provides valuable insights to help you monitor your applications, respond
to performance changes, optimize resource use, and maintain a unified view of operational health.
With CloudWatch, you can:
• Collect and track metrics for your resources and applications.
• Visualize and analyze these metrics and logs.
• Set up alarms to detect unusual behavior and receive notifications.
• Automatically adjust resources when specific thresholds are breached.
For instance, you can monitor the CPU usage and disk activity of your EC2 instances. This information
can help you decide whether to launch more instances to manage increased demand or shut down
underused ones to save costs.
CloudWatch also allows you to track custom metrics, giving you comprehensive visibility into resource
utilization, application performance, and overall operational health.
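Publishing a custom metric is a single call. A sketch with a hypothetical namespace and metric
name:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish a hypothetical application metric alongside the built-in ones.
cloudwatch.put_metric_data(
    Namespace="Cafe/WebApp",          # hypothetical custom namespace
    MetricData=[{
        "MetricName": "ActiveSessions",
        "Value": 42,
        "Unit": "Count",
    }],
)
```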
How CloudWatch responds:
You can use several components of Amazon CloudWatch to monitor your resources and applications and
respond to events.
CloudWatch Metrics
CloudWatch metrics provide data about system performance. By default, many AWS services offer metrics
for resources such as EC2 instances, Amazon EBS volumes, and Amazon RDS DB instances. Detailed
monitoring can be enabled for some resources, including EC2 instances, and custom application metrics
can be published. All metrics in an account, including AWS resource metrics and custom metrics, can be
accessed for searching, graphing, and setting alarms. Metric data is retained for 15 months, allowing
both real-time and historical analysis.
Amazon CloudWatch Logs
Amazon CloudWatch Logs allows for the monitoring, storage, and access of log files from various sources,
including EC2 instances, AWS CloudTrail, and Route 53. Log data can be utilized for application and
system monitoring, such as tracking errors in application logs and sending notifications when error rates
exceed a specified threshold. Additionally, CloudWatch Logs Insights provides quick analysis of logs with
interactive queries and visualizations, which can be displayed in line or stacked area charts on CloudWatch
Dashboards.
CloudWatch Alarms
Alarms can be set up to automatically initiate actions based on specified metrics. Each alarm monitors a
single metric over a designated time period and performs actions based on the metric's value relative to
a threshold. Actions may include sending notifications to an Amazon SNS topic or executing an Auto
Scaling policy. Alarms trigger actions only for sustained state changes, not merely for entering a specific
state. For example, an alarm might be activated if the CPU utilization of an EC2 instance exceeds 50
percent for 5 minutes, which could prompt an Auto Scaling action or a notification to the development
team.
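The 50 percent CPU alarm described above could be sketched as follows; the instance ID and SNS
topic ARN are hypothetical.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="web-cpu-high",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # hypothetical
    Statistic="Average",
    Period=300,              # evaluate over 5-minute periods
    EvaluationPeriods=1,     # sustained for one full period
    Threshold=50.0,
    ComparisonOperator="GreaterThanThreshold",
    # Notify the team via a hypothetical SNS topic when the alarm fires.
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:dev-team-alerts"],
)
```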
Amazon EventBridge Events
Amazon EventBridge (formerly Amazon CloudWatch Events) captures a stream of real-time data from
applications, SaaS applications, and AWS services, routing that data to targets such as AWS Lambda. An
event signifies a change in an environment, which could include AWS resources, SaaS services, or custom
applications. For instance, an event is generated when an EC2 instance transitions from pending to
running, or when API calls are made via AWS CloudTrail. Scheduled events can also be created to occur
periodically. Existing CloudWatch Events users can access their default bus, rules, and events in the new
EventBridge console.
Amazon EventBridge Rules
Routing rules can be established to determine how data is sent, enabling real-time reactions to various
data sources. Each rule matches incoming events and routes them to multiple targets for parallel
processing. Rules are not processed in any specific order, allowing different organizational parts to focus
on events of interest. Rules can customize the JSON data sent to targets by selecting specific parts or
replacing it with constants.
Amazon EventBridge Targets
Targets process the events received. These can include EC2 instances, Lambda functions, Amazon Kinesis
streams, Amazon ECS tasks, AWS Step Functions, SNS topics, and SQS queues. Events are received in
JSON format. When creating a rule, it is associated with a specific event bus, and the rule only matches
events that are received by that bus.
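A sketch of a rule on the default event bus that matches EC2 instances entering the running state
and routes the events to a hypothetical SNS topic:

```python
import boto3
import json

events = boto3.client("events")

# Match EC2 instances entering the "running" state on the default event bus.
events.put_rule(
    Name="ec2-running-rule",
    EventPattern=json.dumps({
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {"state": ["running"]},
    }),
)

# Route matched events to a target; here, a hypothetical SNS topic.
events.put_targets(
    Rule="ec2-running-rule",
    Targets=[{
        "Id": "notify-ops",
        "Arn": "arn:aws:sns:us-east-1:111122223333:ops-events",   # hypothetical
    }],
)
```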
How CloudWatch and EventBridge work:
The high-level interaction between CloudWatch and EventBridge works as follows.
CloudWatch serves as a repository for metrics. AWS services, such as Amazon EC2, send metrics to this
repository, allowing for the retrieval of statistics based on those metrics. Custom metrics can also be
added, enabling their statistics to be accessed.
These metrics can be used to calculate statistics, which can then be graphically represented in the
CloudWatch console.
In EventBridge, rules can be created to match incoming events and route them to targets for processing.
Alarm actions can be configured to stop, start, or terminate EC2 instances based on specific criteria.
Additionally, alarms can trigger Amazon EC2 Auto Scaling and Amazon SNS actions automatically.
REASONS TO AUTOMATE
Without automation:
Building a large-scale computing environment requires significant time and effort.
Many organizations begin their AWS journey by manually creating an Amazon Simple Storage Service
(Amazon S3) bucket or launching an Amazon Elastic Compute Cloud (Amazon EC2) instance to run a web
server. As business needs grow, more resources are added manually. However, managing and
maintaining these resources can quickly become challenging.
Key questions to consider include:
• Should efforts focus on design or implementation?
• What risks are associated with manual setups?
• What is the ideal way to update production servers?
• What is the plan for rolling out deployments across different geographic regions?
• When issues arise, how will rollbacks to a stable version be managed?
• What strategies will be used for debugging deployments?
• How will dependencies among various systems be managed?
• Is it feasible to handle all these tasks through manual configurations?
Risks from manual processes:
Manual resource creation and management have several limitations, especially when scaling.
Challenges include:
• Lack of Repeatability: Deployments cannot be easily replicated across multiple Regions.
• No Version Control: Rolling back to a previous version of the production environment is not
feasible.
• Absence of Audit Trails: Ensuring compliance and tracking changes to configuration details at
the resource level becomes difficult.
• Inconsistent Data Management: Maintaining consistent configurations across multiple
Amazon EC2 instances can be problematic.
Manually managing a large corporate application may not be sustainable due to resource constraints.
Building architecture and applications from scratch lacks inherent version control, making it challenging
to revert to a stable state during emergencies.
An audit trail is crucial for compliance and security. Allowing unrestricted manual control of environments
poses risks. Consistency is vital for minimizing these risks, and automation provides a solution to maintain
that consistency.
Complying with AWS Well-Architected Framework principles:
Operational Excellence Principles:
1. Perform Operations as Code:
Operations can be treated like application code, allowing the entire workload—applications,
infrastructure, and resources—to be defined and updated through code. Automating operational
procedures by scripting them and triggering actions based on events reduces human error and
ensures consistent responses.
2. Make Frequent, Small, Reversible Changes:
Workloads should be designed for regular updates to components, enabling the introduction of
beneficial changes. Changes should be implemented in small increments that can be easily
reversed if necessary, minimizing impact on customers during updates.
Reliability Principle:
• Manage Change in Automation:
Infrastructure changes should be executed through automation. Managing changes to the
automation itself becomes crucial, as altering production systems presents significant risks.
Automation should be used for tasks such as testing and deploying changes, adjusting capacity,
and migrating data.
AUTOMATING YOUR INFRASTRUCTURE
AWS CloudFormation simplifies the modeling, creation, and management of AWS resources through a
collection known as a CloudFormation stack. There is no extra charge for using CloudFormation; users
only pay for the resources created.
Key Features:
• Create, Update, and Delete Stacks:
Stacks can be provisioned in an orderly and predictable manner.
• Repeatable Provisioning:
Resources can be built and rebuilt without manual actions or custom scripts.
• Infrastructure as Code (IaC):
A document describing the desired infrastructure is authored, which can be treated like code. This
document serves as a model for creating resources in the account.
• Version Control:
The CloudFormation document can be edited in any code editor and stored in version control
systems like GitHub or AWS CodeCommit. This allows for collaboration and review before
deployment. If necessary, a stack can be deleted, an older version of the document checked out,
and a new stack created from it, enabling essential rollback capabilities.
AWS CloudFormation overview:
The AWS CloudFormation process proceeds as follows.
1. Define Resources: First, the desired AWS resources are specified. In this example, resources
such as EC2 instances, a load balancer, an Auto Scaling group, and an Amazon Route 53 hosted
zone are included. These resources are defined in an AWS CloudFormation template, which can
be created from scratch or based on a pre-built template. Various sample templates are also
available.
2. Upload the Template: The template can be uploaded directly to AWS CloudFormation or stored
on Amazon S3, with AWS CloudFormation directed to the template's location.
3. Create Stack: The create stack action is initiated. AWS CloudFormation reads the template and
provisions the specified resources in the AWS account. A single stack can manage resources across
multiple AWS services within a single Region.
4. Monitor Progress: The progress of the stack creation can be monitored. Once the stack creation
is successfully completed, the AWS resources are available in the account. The stack object
remains as a reference for all the created resources, enabling future actions such as updates (to
add or modify resources) or deletions (to remove all resources associated with the stack).
Infrastructure as code (IaC):
Infrastructure as Code (IaC) refers to the process of managing and provisioning cloud resources through
a template file that is both human-readable and machine-consumable.
Key benefits of IaC include:
• Replicability: Infrastructure can be easily replicated, re-deployed, or re-purposed.
• Rollback Capability: In the event of a failure, IaC allows for a rollback to the last known good
state.
The growing popularity of IaC is attributed to its ability to address challenges related to infrastructure
management in a simple, reliable, and consistent manner. One of the standout features of AWS
CloudFormation is its transactional nature, which ensures that any failures result in a rollback to the
previous stable state, enhancing reliability for users.
AWS CloudFormation template syntax:
AWS CloudFormation templates can be written in either JavaScript Object Notation (JSON) or YAML Ain't
Markup Language (YAML).
Advantages of YAML:
• Readability: YAML is less verbose, eliminating the need for braces ({}) and many quotation
marks (“”).
• Comments: It supports embedded comments, making it easier to document the template.
Advantages of JSON:
• Compatibility: JSON is widely used across various computer systems, such as APIs, allowing for
seamless integration without transformation.
• Ease of Use: It is generally easier to generate and parse JSON compared to YAML.
Best Practices:
Templates should be treated as source code and stored in a code repository for version control.
The AWS Management Console includes the AWS CloudFormation Designer, a graphical tool for creating
and viewing templates. It allows users to convert between JSON and YAML formats and features a drag-
and-drop interface for easier template authoring.
Simple template: Create an EC2 instance:
This example illustrates an AWS CloudFormation template that creates an EC2 instance. It highlights key
sections commonly found in templates: Parameters, Resources, and Outputs.
• Parameters: This optional section allows values to be passed to the template at runtime during
stack creation or updates. Parameter names and descriptions are displayed in the Specify
Parameters page when launching the Create Stack wizard in the console.
• Resources: This required section specifies the AWS resources to be created, along with their
properties. In this example, a resource of type AWS::EC2::Instance is defined, which creates an
EC2 instance. The resource includes both static properties (like ImageId and InstanceType) and a
reference to the KeyPair parameter.
• Outputs: This section describes the values returned when viewing the stack's properties. In the
example, an InstanceId output is declared. After the stack is created, this value can be viewed in
the AWS CloudFormation console, retrieved using the aws cloudformation describe-stacks
command in the AWS CLI, or accessed through AWS SDKs.
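The template referred to in this section is not reproduced in these notes, so the following is a
hedged reconstruction: a minimal YAML template with the three sections, embedded as a string and
launched with boto3. The AMI ID, key pair, and stack name are hypothetical.

```python
import boto3

# Minimal template with the three sections described above.
TEMPLATE = """
Parameters:
  KeyPair:
    Type: AWS::EC2::KeyPair::KeyName
    Description: EC2 key pair for SSH access
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0123456789abcdef0   # hypothetical AMI ID
      InstanceType: t3.micro
      KeyName: !Ref KeyPair            # references the runtime parameter
Outputs:
  InstanceId:
    Value: !Ref WebServer              # returned when viewing stack properties
"""

cloudformation = boto3.client("cloudformation")
cloudformation.create_stack(
    StackName="cafe-web-stack",        # hypothetical
    TemplateBody=TEMPLATE,
    Parameters=[{"ParameterKey": "KeyPair", "ParameterValue": "my-key"}],
)
```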
Consistency Across Environments with AWS CloudFormation:
The same AWS CloudFormation template can be utilized to create both production and development
environments, ensuring consistency in application binaries, Java versions, and database versions. This
approach helps guarantee that the application performs in production as it does in development.
In the example provided, both environments are generated from the same template. However, the
production environment is set to operate across two Availability Zones, while the development
environment runs in a single Availability Zone. Such deployment-specific differences can be managed
using Conditions in the AWS CloudFormation template, allowing for identical configurations in terms of
settings while varying size and scope.
Multiple testing environments may also be required for functional testing, user acceptance testing, and
load testing. Creating these environments manually can introduce risks, but using AWS CloudFormation
ensures consistency and repeatability.
AWS CloudFormation change sets:
To update a stack and its AWS resources, the AWS CloudFormation template used for creation can be
modified, followed by running the Update Stack option. However, to gain insight into the specific changes
that will be made before executing the update, change sets can be utilized.
Change sets allow for a preview of changes, ensuring they meet expectations before approval. Here’s a
basic workflow for using change sets:
1. Create a Change Set: Submit the changes for the stack that requires updating.
2. View the Change Set: Examine which settings and resources will be affected. If further
adjustments are needed, additional change sets can be created for consideration.
3. Run the Change Set: Execute the change set to update the stack with the approved changes.
When using change sets, it's advisable to set a deletion policy on certain resources. The DeletionPolicy
attribute can preserve or back up resources when a stack is deleted or updated. If this attribute is not
specified, AWS CloudFormation will delete the resource.
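The three-step workflow maps directly onto three API calls. A sketch with hypothetical names; the
edited template is read from a file:

```python
import boto3

cloudformation = boto3.client("cloudformation")
modified_template = open("template.yaml").read()   # the edited template (hypothetical file)

# 1. Create a change set from the modified template.
cloudformation.create_change_set(
    StackName="cafe-web-stack",            # hypothetical
    ChangeSetName="add-second-instance",
    ChangeSetType="UPDATE",
    TemplateBody=modified_template,
)

# 2. View which resources and settings would be affected.
preview = cloudformation.describe_change_set(
    StackName="cafe-web-stack",
    ChangeSetName="add-second-instance",
)
for change in preview["Changes"]:
    print(change["ResourceChange"]["Action"],
          change["ResourceChange"]["LogicalResourceId"])

# 3. Run the change set once the preview meets expectations.
cloudformation.execute_change_set(
    StackName="cafe-web-stack",
    ChangeSetName="add-second-instance",
)
```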
Drift detection:
In a scenario where an AWS CloudFormation stack is used to create an application environment, manual
modifications made outside of AWS CloudFormation—such as adding a new inbound rule to a security
group—can lead to discrepancies between the deployed environment and the defined template.
To identify which resources have been modified and no longer align with the specifications in the stack,
drift detection can be utilized. Drift detection allows you to check for any changes made outside of AWS
CloudFormation.
Here’s how to perform drift detection:
1. Run Drift Detection: Select "Detect Drift" from the Stack actions menu in the AWS Management
Console.
2. Review Drift Status: The drift detection process will indicate whether the stack has deviated
from its expected configuration. Detailed information about the drift status of each resource that
supports drift detection will be provided.
It's important to note that when deleting a stack with drift, AWS CloudFormation does not automatically
resolve these discrepancies during the cleanup process. If there are unresolved resource dependencies,
the delete action may fail, requiring manual intervention to resolve the issues.
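Drift detection can also be triggered from code. A sketch with a hypothetical stack name:

```python
import boto3
import time

cloudformation = boto3.client("cloudformation")

# 1. Start drift detection on the whole stack.
detection_id = cloudformation.detect_stack_drift(
    StackName="cafe-web-stack"             # hypothetical
)["StackDriftDetectionId"]

# 2. Poll until detection finishes, then review each resource's drift status.
while True:
    status = cloudformation.describe_stack_drift_detection_status(
        StackDriftDetectionId=detection_id
    )
    if status["DetectionStatus"] != "DETECTION_IN_PROGRESS":
        break
    time.sleep(5)

drifts = cloudformation.describe_stack_resource_drifts(StackName="cafe-web-stack")
for drift in drifts["StackResourceDrifts"]:
    print(drift["LogicalResourceId"], drift["StackResourceDriftStatus"])
```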
Scoping and organizing templates:
As the use of AWS CloudFormation templates increases within an organization, establishing a strategy
for managing these templates becomes essential. This strategy should outline the scope of each template
and the criteria for defining AWS infrastructure across multiple templates.
A structured approach can help maintain templates and facilitate their use in a coherent manner. Here
are some suggestions for organizing your templates:
1. Group by Functionality: Organize resource definitions in templates based on tightly connected
components, similar to how a large enterprise application is divided into distinct parts.
2. Define Areas of Focus: Consider creating templates for specific areas, such as:
o Frontend Services
o Backend Services
o Shared Services
o Network
o Security
Each template can be tailored to the needs of a single application or department.
3. Version Control: Treat templates as code. Store them in a version control system to track
changes and maintain consistency.
By implementing a clear strategy for your AWS CloudFormation templates, management and
collaboration can be significantly improved.
AWS Quick Starts:
AWS Quick Starts are pre-built AWS CloudFormation templates designed to simplify the deployment of
popular solutions on AWS. They are developed by AWS solutions architects and partners, ensuring
alignment with AWS best practices for security and high availability.
Key Features:
• Gold-Standard Deployments: Based on best practices for secure and highly available
architectures.
• Rapid Deployment: Complete architectures can be set up in under an hour, often in just minutes.
• Flexible Use: Quick Starts can serve as a foundation for production environments or as a means
for experimentation with new deployment strategies.
AUTOMATING DEPLOYMENTS
AWS Systems Manager:
While using Infrastructure as Code (IaC) tools like AWS CloudFormation is essential for creating and
maintaining AWS resources, additional tools can support ongoing configuration management needs. AWS
Systems Manager is one such service designed for automation and management of both cloud and on-
premises systems.
Key Features:
• Management Focus: AWS Systems Manager allows you to identify instances for management
and define specific tasks to perform on them.
• Cost: The service is available at no extra charge, making it accessible for managing Amazon EC2
instances and on-premises resources.
Tasks You Can Accomplish:
• Collect software inventory
• Apply operating system (OS) patches
• Create system images
• Configure both Microsoft Windows and Linux operating systems
These features help maintain system configurations, prevent drift, and ensure software compliance across
your AWS and on-premises environments.
Benefits:
• Automates Operational Tasks: For instance, you can apply OS patches and software upgrades
across multiple EC2 instances with ease.
• Simplifies Management: Easily manage software inventory and view detailed configurations for
your resource fleet.
• Cross-Environment Management: Manage servers both on-premises and in the cloud
effectively.
Systems Manager capabilities:
AWS Systems Manager simplifies the update, management, and configuration of EC2 instances. By
installing the AWS Systems Manager Agent (SSM Agent) on an EC2 instance, on-premises server, or
virtual machine (VM), you can enable Systems Manager to perform various management tasks.
How It Works:
• Once the SSM Agent is installed, it processes requests from Systems Manager, executes the
specified tasks, and sends status updates and relevant information back to Systems Manager.
• The SSM Agent comes preinstalled on most Microsoft Windows Server AMIs, all Amazon Linux and
Amazon Linux 2 AMIs, and some Ubuntu AMIs. For other Linux AMIs, manual installation is
required. Refer to the AWS documentation for detailed instructions.
Key Tools Provided by AWS Systems Manager:
• Run Command: Remotely manage configurations without needing SSH or RDP access, reducing
the need for a bastion host. Supports running scripts in Bash, PowerShell, Salt, or Ansible (see
the sketch after this list).
• Maintenance Windows: Schedule potentially disruptive tasks like OS patching, driver updates,
or software installations.
• Parameter Store: Securely store configuration data and secrets such as passwords and database
connection strings.
• Patch Manager: Automate the patching process for both security and non-security updates.
• State Manager: Ensure that your EC2 and hybrid infrastructure remains in a desired state.
• Automation: Create workflows to automate instance and resource management.
• Session Manager: Interactively manage EC2 instances via a browser-based shell.
• Inventory: Gain visibility into your EC2 and on-premises environments by collecting metadata
from managed instances.
• Documents: Define actions for Systems Manager using pre-configured documents or create
custom documents in JSON or YAML.
These tools help streamline management tasks and ensure that your instances remain updated and
compliant.
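As an illustration of Run Command, here is a minimal boto3 sketch that runs a shell command on a managed instance; the instance ID is a placeholder, and AWS-RunShellScript is one of the pre-configured Systems Manager documents.

import boto3

ssm = boto3.client("ssm")

# Run a shell script on an SSM-managed instance (no SSH required)
response = ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],        # placeholder instance ID
    DocumentName="AWS-RunShellScript",          # pre-configured SSM document
    Parameters={"commands": ["yum -y update"]})

command_id = response["Command"]["CommandId"]

# Check the per-instance status (the invocation may take a moment to appear)
result = ssm.get_command_invocation(
    CommandId=command_id,
    InstanceId="i-0123456789abcdef0")
print(result["Status"])  # e.g., Pending, InProgress, Success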
AWS CloudFormation and Systems Manager complement each other:
AWS CloudFormation and AWS Systems Manager are two powerful tools that work together effectively.
• AWS CloudFormation: Ideal for defining and managing AWS resources at the cloud level. It
allows you to create and delete entire stacks using a single template.
• AWS Systems Manager: Focuses on automating tasks within the guest operating system of
instances. It helps with ongoing management, such as applying OS patches and aggregating logs
to Amazon CloudWatch.
How They Work Together:
1. Resource Definition: Use AWS CloudFormation to set up your AWS resources, such as EC2
instances.
2. OS Configuration: After the resources are created, leverage AWS Systems Manager to manage
and configure the operating systems of those instances.
By using both services, you maintain streamlined resource management with CloudFormation while
ensuring continuous operational updates and monitoring with Systems Manager.
AWS OpsWorks:
AWS OpsWorks is a configuration management service designed to automate the configuration,
deployment, and management of EC2 instances. It supports popular automation platforms, Chef and
Puppet, and is available in three versions:
1. AWS OpsWorks for Chef Automate:
o Provides a fully managed Chef Automate server.
o Enables workflow automation for continuous deployment and automated testing for
compliance and security.
o Manages operational tasks such as software configurations, package installations, and
database setups through Chef recipes.
2. AWS OpsWorks for Puppet Enterprise:
o Offers a managed Puppet Enterprise server with automation tools for orchestration and
provisioning.
o Allows you to define and version server configurations in the same way that you manage
application source code.
o Dynamically configures nodes based on the state of other nodes in the environment.
3. AWS OpsWorks Stacks:
o A service that helps configure and operate applications using Chef.
o Allows you to define the application architecture and specify each component, including
software configurations and resource requirements.
With AWS OpsWorks, teams can efficiently manage their infrastructure while ensuring consistency and
compliance across their applications.
AWS OpsWorks Stacks:
AWS OpsWorks Stacks helps manage applications by organizing them into stacks and layers. Here's how
a basic application might be structured:
1. Stacks: The primary unit of creation in OpsWorks is the stack. A stack serves as the container for
your application.
2. Layers: Within a stack, multiple layers can be added to define different functionalities. For
example:
o An Application Server Layer runs your application servers.
o A Load Balancing Layer includes an Elastic Load Balancer to distribute traffic.
o An RDS Layer hosts a backend Amazon RDS database.
3. Chef Recipes: Each layer uses Chef recipes to perform tasks like installing packages, deploying
applications, and executing scripts. Custom Chef cookbooks must be stored in an online repository,
such as a ZIP file or a source control system like Git.
4. Lifecycle Events: OpsWorks Stacks features lifecycle events (Setup, Configure, Deploy,
Undeploy, Shutdown) that automatically trigger specified recipes at the right time on each
instance. Each layer can have recipes assigned to these events, allowing for streamlined
management of tasks.
This structure allows for organized management of applications, ensuring that all components work
together seamlessly.
OpsWorks Stacks complements AWS CloudFormation:
AWS OpsWorks Stacks can be created using AWS CloudFormation, allowing the two technologies to work
together effectively. Here's how this integration can function:
1. Infrastructure Setup: An AWS CloudFormation template can be used to create the foundational
infrastructure for your environment, such as a Virtual Private Cloud (VPC).
2. Application Management: A separate AWS CloudFormation template can then be used to create
the OpsWorks stack that will be deployed within that VPC.
Once both CloudFormation stacks are set up in your account, the OpsWorks stack can be used to manage
the application efficiently, ensuring a cohesive and organized deployment.
AWS ELASTIC BEANSTALK
Overview:
AWS Elastic Beanstalk is a Platform as a Service (PaaS) that simplifies the deployment, scaling, and
management of web applications. It provides an easy way to get applications up and running quickly by
automatically handling:
• Infrastructure provisioning and configuration
• Deployment
• Load balancing
• Automatic scaling
• Health monitoring
• Analysis and debugging
• Logging
With Elastic Beanstalk, you maintain control of your code while AWS manages the underlying
infrastructure. The setup process is straightforward: a simple wizard in the AWS Management Console
guides you through selecting instance types, database options, and scaling settings. You can also enable
Secure HTTP (HTTPS) on the load balancer and access server log files.
Once your code is uploaded, Elastic Beanstalk takes care of the deployment, including all necessary
resources. There is no additional charge for the service itself; you pay only for the AWS resources your
application uses, such as EC2 instances and S3 buckets.
AWS Elastic Beanstalk:
AWS Elastic Beanstalk supports web applications built on popular platforms, including Java, .NET, PHP,
Node.js, Python, Ruby, Go, and Docker.
You simply upload your code, and Elastic Beanstalk automatically manages the deployment for you. It
configures each EC2 instance with the necessary components to run your application, eliminating the
need for manual setup.
Elastic Beanstalk deploys your code on:
• Apache Tomcat for Java applications
• Apache HTTP Server for PHP and Python applications
• NGINX or Apache HTTP Server for Node.js applications
• Passenger or Puma for Ruby applications
• Microsoft Internet Information Services (IIS) for .NET applications
With Elastic Beanstalk, the focus is on your code, making the deployment process quick and easy.
AWS Elastic Beanstalk deployments:
To deploy your application:
1. Upload Your Code: Simply upload your code to Elastic Beanstalk.
2. Automatic Deployment: Elastic Beanstalk automatically handles the deployment process,
configuring the necessary components on each EC2 instance for the selected platform.
3. No Manual Configuration Needed: There's no need to log in to instances to set up your
application stack.
Elastic Beanstalk streamlines the deployment process, allowing you to focus solely on your code, and
deploys applications onto the platform servers listed in the previous section, making it easy and quick
to get your application up and running.
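For reference, a minimal sketch of the same workflow with boto3, assuming the application bundle has already been uploaded to an S3 bucket; the application name, bucket, key, and solution stack name below are placeholders (available stack names change over time).

import boto3

eb = boto3.client("elasticbeanstalk")

# Register the application and a version that points at the uploaded bundle
eb.create_application(ApplicationName="cafe-app")
eb.create_application_version(
    ApplicationName="cafe-app",
    VersionLabel="v1",
    SourceBundle={"S3Bucket": "my-deploy-bucket", "S3Key": "cafe-app-v1.zip"})

# Launch an environment; Elastic Beanstalk provisions the EC2 instances,
# load balancer, and Auto Scaling group on your behalf
eb.create_environment(
    ApplicationName="cafe-app",
    EnvironmentName="cafe-app-prod",
    VersionLabel="v1",
    SolutionStackName="64bit Amazon Linux 2 v3.5.8 running Python 3.8")  # example stack name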
Elastic Beanstalk application environment:
When using AWS Elastic Beanstalk, you can choose between two environment types:
1. Single-Instance Environment: Launches a single EC2 instance without load balancing or
automatic scaling.
2. Load Balanced Environment: Launches multiple EC2 instances, includes load balancing, and
supports automatic scaling. An optional managed database layer can also be added.
After creating an Elastic Beanstalk application, you can view the AWS resources it manages in your
account. For instance, two EC2 instances running Amazon Linux with Java 8 and Apache Tomcat 8.5 are
created to host your application.
Automatic scaling is configured to respond to increased loads. If resource utilization (like CPU usage)
exceeds a certain threshold for over five minutes, additional instances will be launched automatically. An
RDS database instance can also be included, allowing you to store application data and interact with it
using SQL.
Elastic Beanstalk manages the scaling and connectivity between your EC2 instances and the database.
Additionally, it assigns a unique domain name to your application, which follows the format <your-
application>.elasticbeanstalk.com. You can also set up a custom domain using Amazon Route 53.
Choosing the right automation solution:
Among several AWS services for application management, deciding which service to use depends on the
level of convenience and control you need:
1. AWS Elastic Beanstalk: This is an easy-to-use platform for deploying web applications in
languages like Java, PHP, Node.js, Python, Ruby, and Docker. If you prefer to simply upload your
code without needing to customize the environment, Elastic Beanstalk is a suitable option.
2. AWS OpsWorks: This service allows you to define an application's architecture and specify each
component, including package installations and software configurations. OpsWorks provides
templates for common technologies or allows you to create your own.
Both Elastic Beanstalk and OpsWorks offer higher-level management compared to manually maintaining
AWS CloudFormation templates or directly managing EC2 instances. The best choice will depend on your
specific needs, so consider your requirements carefully. As an architect, it's essential to determine which
service—or combination of services—will work best for your use case.
OVERVIEW OF CACHING
Caching: Trading capacity for speed:
Speed is crucial for applications, whether they deliver news, maintain leaderboards, showcase product
catalogs, or sell tickets. The speed of content delivery directly impacts your application's success. Caching
can significantly enhance this speed.
What is Caching?
A cache is a high-speed storage layer that temporarily holds a subset of data. Unlike a traditional database
that stores data in a complete and durable form, a cache focuses on performance by reducing the need
to access slower storage.
Benefits of Caching:
• Faster Data Retrieval: Cached data can be accessed much quicker than data stored in primary
storage.
• Efficiency: Caching allows for the reuse of previously retrieved or computed data, improving
overall efficiency.
Data in a cache is typically stored in fast-access hardware like random access memory (RAM), making it
ideal for speeding up data delivery.
What should you cache?
When deciding what data to cache, keep these key factors in mind:
1. Speed and Expense
• Bottlenecks: Complex or time-consuming database queries can slow down applications. If a
query is slow and costly—like those involving joins across multiple tables—it's a strong candidate
for caching.
• Simple Queries: Even relatively quick queries might be worth caching based on usage patterns.
2. Data and Access Patterns
• Static vs. Dynamic: Cache data that is relatively static and frequently accessed, like user profiles
on social media. Avoid caching highly dynamic data, such as search results, which change
frequently and offer little benefit from caching.
3. Staleness
• Understanding Staleness: Cached data is inherently stale. Assess how your application can
handle this. For instance, a website showing stock prices can tolerate some delay if it indicates
that prices may be outdated. However, for applications that require real-time data—like trading
platforms—up-to-the-minute accuracy is crucial.
By considering these factors, you can make informed decisions about which data to cache for optimal
performance.
Benefits of caching:
A cache offers high throughput and low-latency access to frequently accessed application data by storing
it in memory.
Benefits of Caching:
• Improves Speed: Enhances overall application performance.
• Reduces Latency: Minimizes response time for users.
• Optimizes Processing: Lowers application processing and database access times, especially for
read-heavy workloads like social networking, gaming, media sharing, and Q&A platforms.
While write-heavy applications may not benefit as much, they often still have a higher read-to-write ratio,
making read caching valuable.
Caching throughout the data journey:
This simple web application architecture illustrates how data flows to and from the user, highlighting
opportunities for caching at each layer to enhance performance and usability. The layers include
operating systems, networking components like content delivery networks (CDNs) and Domain Name
Systems (DNS), web applications, and databases.
In this architecture, caching can:
• Speed up information retrieval from websites.
• Store mappings of domain names to IP addresses.
• Accelerate access to web content from servers.
• Enhance application performance and data access.
• Reduce latency associated with database queries.
EDGE CACHING
Network latency:
When someone visits your website or uses your application, their request travels through various
networks to reach your origin server. The origin server stores the original versions of your objects, such
as web pages, images, and media files. The number of network hops and the distance the request travels
can significantly impact your website's performance and responsiveness.
Network latency can also vary based on the origin server's geographic location. If your web traffic is
spread across different regions, replicating your entire infrastructure globally may not be feasible or cost-
effective. In such cases, a content delivery network (CDN) can be very helpful.
Content delivery network (CDN):
A content delivery network (CDN) is a globally distributed system of caching servers that stores copies
of frequently requested files, such as HTML, CSS, JavaScript, images, and videos. When a user requests
content, the CDN delivers a local copy from the nearest cache edge or Point of Presence (PoP), ensuring
faster access. This setup improves application performance and scalability by reducing load times.
Amazon CloudFront:
Amazon CloudFront is a global content delivery network (CDN) that enhances content delivery for users.
It supports both static and dynamic content, including media files and streaming video, and operates on
a pay-per-use basis without long-term commitments.
CloudFront features a multi-tier cache that improves latency by utilizing regional edge caches, reducing
the load on origin servers. It offers various streaming options, ensuring high throughput for 4K delivery
to global audiences.
For security, CloudFront includes built-in protections like AWS Shield Standard and allows the
management of custom SSL certificates through AWS Certificate Manager (ACM) at no extra cost. It
supports SSL/TLS protocols for secure data transmission.
Additionally, CloudFront enables real-time, bidirectional communication via the WebSocket protocol,
making it ideal for applications like chat, collaboration, and gaming. It also supports various HTTP
methods (such as GET, POST, and PUT), optimizing the performance of dynamic websites with user
interaction features. This allows a single domain name to efficiently deliver both download and upload
capabilities for your entire site.
What type of content can you cache in an edge cache?:
An example of an Amazon.com webpage illustrates how static and dynamic content combine to create a
dynamic web application. This application is delivered via HTTPS, ensuring that user requests and
responses are encrypted.
Static content, such as HTML documents, CSS stylesheets, JavaScript files, images, and videos, can be
cached using a CDN or edge cache. However, dynamically generated content or user-generated data
cannot be cached. Instead, CloudFront can be configured to deliver this information from a custom origin,
like an EC2 instance or a web server.
Furthermore, CloudFront can be set to require HTTPS for viewer requests, ensuring that all connections
are encrypted. It can also use HTTPS to retrieve objects from the origin, maintaining secure
communication throughout.
How caching works in Amazon CloudFront:
Amazon CloudFront delivers content to users through a global network of data centers known as edge
locations.
When a user requests content served by CloudFront, DNS directs the request to the nearest edge location
for optimal performance. CloudFront first checks its cache for the requested content. If the content is
available, it is delivered immediately to the user. If not, CloudFront forwards the request to the origin
server, retrieves the content, and then sends it to the user while also adding it to the cache for future
requests.
As certain objects become less popular, edge locations may remove them to make space for more
frequently accessed content. To manage this, CloudFront uses regional edge caches, which are larger
caches located between the origin server and global edge locations. These caches store less popular
content longer, reducing the need to access the origin server and enhancing overall performance for
viewers.
How to configure a CloudFront distribution:
When you want to use CloudFront to distribute your content, you create a distribution.
1. You specify the origin server that hosts your files. Your origin server can be an S3 bucket, an AWS
Elemental MediaPackage channel, an AWS Elemental MediaStore container, or a custom origin.
For example, a custom origin might be an EC2 instance or your own web server.
2. You then specify details about how to track and manage content delivery. For example, you can
specify whether you want your files to be available to everyone or only to certain users. You can
also specify whether you want CloudFront to perform the following functions: create access logs
that show user activity, forward cookies or query strings to your origin, or require users to use
HTTPS to access your content.
3. CloudFront assigns a domain name to your new distribution.
4. CloudFront sends your distribution's configuration—but not the content—to all edge locations.
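A distribution can also be created programmatically. Below is a minimal boto3 sketch for an S3 origin; the bucket and origin ID are placeholders, and the CachePolicyId shown is the ID of the AWS managed CachingOptimized policy.

import time
import boto3

cf = boto3.client("cloudfront")

response = cf.create_distribution(
    DistributionConfig={
        "CallerReference": str(time.time()),  # any unique string
        "Comment": "Static content for the cafe site",
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": "cafe-s3-origin",
                "DomainName": "cafe-assets.s3.amazonaws.com",  # placeholder bucket
                "S3OriginConfig": {"OriginAccessIdentity": ""},
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": "cafe-s3-origin",
            "ViewerProtocolPolicy": "redirect-to-https",
            "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",  # managed CachingOptimized
        },
    })

# Step 3 above: CloudFront assigns the new distribution a domain name
print(response["Distribution"]["DomainName"])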
How to expire content:
You can expire cached content in three ways:
1. Time to Live (TTL): This method sets a fixed expiration period for how long files stay in the
CloudFront cache before a request is forwarded to your origin. A shorter TTL is suitable for dynamic
content, while a longer TTL enhances performance by serving more files directly from the cache,
reducing the load on your origin. If you set TTL to 0, CloudFront still caches the content and uses
an If-Modified-Since header in a GET request to check if the cached content is still valid.
2. Change Object Name: This method requires renaming the object (e.g., changing Header-v1.jpg
to Header-v2.jpg) for an immediate refresh. While you can update existing objects with the same
name, it's not recommended, as CloudFront only distributes objects when they are requested, not
when they're updated in your origin.
3. Invalidate Object: This is the least efficient method, as it forces interaction with all edge
locations. Use this sparingly and only for specific objects that need to be removed immediately.
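For the third method, a short boto3 sketch; the distribution ID and object path are placeholders.

import time
import boto3

cf = boto3.client("cloudfront")

# Force all edge locations to drop one cached object
cf.create_invalidation(
    DistributionId="E1ABCDEF2GHIJ",  # placeholder distribution ID
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/images/Header-v1.jpg"]},
        "CallerReference": str(time.time()),  # any unique string
    })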
Example: Video on demand streaming:
CloudFront can be used to deliver both on-demand and live streaming video.
For on-demand video streaming, video content must be formatted and packaged using an encoder,
such as AWS Elemental MediaConvert or Amazon Elastic Transcoder. This process creates segments—
static files containing audio, video, and captions—along with manifest files that outline the segments'
playback order. Supported package formats include Dynamic Adaptive Streaming over HTTP (DASH),
Apple HTTP Live Streaming (HLS), Microsoft Smooth Streaming, and Common Media Application Format
(CMAF).
Once the video is converted into the necessary formats, it is hosted in an S3 bucket, which serves as
your origin server. CloudFront then delivers these segment files to users globally.
Example: Dynamically generated content:
Typically, only static content is cached, but some content may appear static due to its URL, even though
it's generated dynamically the first time it's requested. This approach is useful for reusable content that
is costly to create and changes infrequently.
A classic example is map tiles. Instead of generating every possible tile—which would be wasteful—only
commonly viewed areas, like major cities, are cached. Each tile’s URL can include parameters needed for
its generation. If the tile is already cached at a CloudFront edge location, it is served directly. If not, the
tile is generated, returned to the edge location, and cached for future requests.
Example: DDoS mitigation:
CloudFront can enhance the resilience of your AWS applications against distributed denial of service
(DDoS) attacks, which aim to make your website or application unavailable by overwhelming it with
traffic from multiple sources, such as infected computers and IoT devices.
A resilient architecture can mitigate these attacks by using a DNS service like Amazon Route 53 to connect
user requests to a CloudFront distribution. CloudFront proxies requests for dynamic content to your
application’s infrastructure. Both Route 53 DNS requests and application traffic through CloudFront are
monitored for anomalies and attacks.
Common DDoS attacks include SYN floods, UDP floods, and reflection attacks. When a SYN flood exceeds
a threshold, SYN cookies are activated to maintain connections with legitimate clients. Deterministic
packet filtering drops malformed packets, while heuristics-based anomaly detection evaluates traffic
attributes to score and filter out suspicious requests, reducing false positives and maintaining application
availability.
Route 53 is also designed to withstand DNS query floods, utilizing techniques like shuffle sharding and
anycast striping to distribute traffic across edge locations.
Additionally, AWS WAF (Web Application Firewall) allows you to monitor and control HTTP and HTTPS
requests to services like API Gateway, CloudFront, or Application Load Balancer. You can set conditions
based on IP addresses or query string values, allowing you to block or permit requests accordingly.
CloudFront can also return custom error pages for blocked requests.
CACHING WEB SESSIONS
Session management: Sticky sessions:
When users interact with a web application, they send HTTP requests, and the application responds,
forming a session. Each request is independent, but sessions help manage user authentication and store
user data, so users don’t need to send their credentials with every request.
User sessions can be managed in various ways. By default, a load balancer routes requests independently
to the server with the lightest load. Sticky sessions, however, route requests from the same user to the
same server, which requires client support for cookies.
Sticky sessions have advantages, such as being cost-effective and reducing network latency since session
data is stored on the web servers. However, they also have disadvantages:
• Loss of Sessions: If the server handling the session fails, the session data is lost.
• Limited Scalability: Sticky sessions can prevent the load balancer from evenly distributing traffic.
If one server becomes overloaded with requests, it can lead to slower response times for users.
In summary, while sticky sessions speed up retrieval and are easy to implement, they can affect
application scalability and reliability.
Instead of sticky sessions: Persist sessions inside a distributed cache:
Instead of using sticky sessions, you can designate a layer in your architecture that can store sessions in
a scalable and robust manner outside the instance. One option is to persist session data in a distributed
cache, as this architecture diagram shows. It makes sense to implement this architecture in a dynamic
environment when the number of web servers changes to accommodate load, and you don’t want to risk
losing sessions.
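A minimal sketch of this pattern with the redis-py client, assuming an ElastiCache for Redis cluster; the endpoint, session ID, and TTL are placeholders.

import json
import redis

# Connect to the ElastiCache for Redis endpoint (placeholder hostname)
r = redis.Redis(host="my-sessions.abc123.0001.use1.cache.amazonaws.com", port=6379)

# Persist a user's session with a 30-minute expiry
session_id = "sess-42"
r.setex("session:" + session_id, 1800,
        json.dumps({"user": "martha", "cart": ["latte"]}))

# Any web server behind the load balancer can retrieve the same session
data = r.get("session:" + session_id)
session = json.loads(data) if data else None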
Instead of sticky sessions: Persist sessions inside a DynamoDB table:
Another option is to persist session data in an Amazon DynamoDB database, as this diagram shows. With
Amazon DynamoDB, you can set your scaling policy and have DynamoDB scale capacity up and down as
necessary.
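A comparable sketch with boto3: each session becomes a DynamoDB item, and enabling TTL on an attribute such as expires_at (a hypothetical attribute name) lets DynamoDB delete expired sessions automatically.

import time
import boto3

table = boto3.resource("dynamodb").Table("Sessions")  # hypothetical table name

# Write the session; expires_at drives DynamoDB's TTL cleanup
table.put_item(Item={
    "session_id": "sess-42",
    "user": "martha",
    "expires_at": int(time.time()) + 1800,  # 30 minutes from now
})

# Read it back from any web server
item = table.get_item(Key={"session_id": "sess-42"}).get("Item")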
Example: Storing session states for an online gaming application:
This architecture is designed for an online gaming application where fast session retrieval is essential.
Game data is constantly updated as players collect items, defeat enemies, earn gold, unlock levels, and
achieve milestones. Each event must be recorded in the database to avoid data loss. For this purpose,
game developers use DynamoDB to store session history and other time-sensitive data, allowing quick
lookups by player, date, and time.
Each DynamoDB table has a specified throughput capacity (in this example, 1,000 writes per second),
which can scale automatically in the background. As player demand changes, capacity can be adjusted easily,
enabling support for sudden surges from a few thousand to millions of players. Similarly, scaling back
down is straightforward.
DynamoDB ensures predictable, low-latency performance at any scale, which is crucial for
accommodating millions of latency-sensitive gamers. No time is required for performance tuning with
DynamoDB.
CACHING DATABASES
When should you cache your database?
Slow and complex database queries can lead to application bottlenecks. Caching your database can be
beneficial in several scenarios:
1. Response Time Concerns: If you have latency-sensitive workloads, caching can enhance
throughput and reduce data retrieval delays, thereby improving application performance.
2. High Request Volume: For applications experiencing heavy traffic that overwhelms your
database, adding a caching layer can increase throughput and enhance performance.
3. Cost Reduction: Scaling databases for high read volumes can be expensive, especially with disk-
based NoSQL databases or relational databases that require multiple read replicas. A single in-
memory cache node can handle requests more efficiently and cost-effectively.
A database cache complements your primary database by alleviating the pressure on it, particularly for
frequently accessed read data. Caches can be integrated into your database or application, or they can
operate as a standalone layer.
Using DynamoDB for state information:
In a simplified architecture for an online gaming application, session state information is stored in
DynamoDB. However, if millisecond-scale response times are insufficient for your needs, database
caching can address this issue by speeding up data retrieval.
Amazon DynamoDB Accelerator:
Amazon DynamoDB Accelerator (DAX) is a fully managed, in-memory cache designed to enhance
DynamoDB's performance. It offers:
• Extreme Performance: DAX achieves response times in microseconds, improving performance
by up to 10 times, even under heavy loads.
• High Scalability: Start with a three-node DAX cluster and easily scale up to ten nodes as needed.
• Fully Managed: DAX automates management tasks like setup, configuration, and software
patching, handling failure detection and recovery.
• Integration with DynamoDB: DAX is API-compatible with DynamoDB, requiring no changes to
your application code. Simply provision a DAX cluster and use the DAX SDK.
• Flexibility: Provision one DAX cluster for multiple DynamoDB tables or multiple clusters for a
single table, as needed.
• Security: DAX integrates with AWS services for enhanced security, including IAM for user access
control and CloudWatch for monitoring. It also supports VPC for secure access.
By caching data, DAX reduces the read load on DynamoDB tables, potentially lowering operational costs
and required read capacity.
Using DynamoDB with DAX to accelerate response time:
In the gaming scenario, adding Amazon DynamoDB Accelerator (DAX) enhances performance without
requiring significant changes to your game code. To integrate DAX, you only need to reinitialize your
DynamoDB client to point to the DAX endpoint—no other code modifications are necessary. DAX
automatically manages cache invalidation and data population, streamlining the deployment process.
This cache is especially beneficial during high-traffic events, such as seasonal downloadable content
(DLC) releases or new patch updates, as it improves responsiveness and handles spikes in player
activity.
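A hedged sketch of that single change, using the amazon-dax-client package for Python; the cluster endpoint and table name are placeholders.

from amazondax import AmazonDaxClient  # pip install amazon-dax-client

# Before DAX: dynamodb = boto3.resource("dynamodb")
# After DAX: point the same resource-style interface at the cluster endpoint
dax = AmazonDaxClient.resource(
    endpoint_url="daxs://my-cluster.abc123.dax-clusters.us-east-1.amazonaws.com")

# Table calls are unchanged; reads are now served from the in-memory cache
table = dax.Table("GameSessions")  # placeholder table name
item = table.get_item(Key={"player_id": "p-42"}).get("Item")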
Remote or side caches:
DynamoDB Accelerator (DAX) is a transparent cache, but another approach to caching is using a side
cache, which operates alongside the database rather than being directly connected to it. Side caches are
often built on key-value NoSQL stores like Redis or Memcached and can handle hundreds of thousands
to a million requests per second per node.
Side caches are ideal for read-heavy workloads and work as follows:
1. The application first attempts to read data from the cache.
2. If the data is found (a cache hit), the value is returned.
3. If the data is not found (a cache miss), the application retrieves it from the underlying database.
4. To ensure future availability, the fetched data is then written back to the cache.
This approach helps improve response times and reduces the load on the database.
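A minimal Python sketch of these four steps, assuming a Redis side cache; get_from_database is a stand-in for the real database query.

import json
import redis

cache = redis.Redis(host="my-cache.abc123.0001.use1.cache.amazonaws.com")  # placeholder

def get_product(product_id, get_from_database):
    key = "product:" + product_id
    cached = cache.get(key)                  # 1. try the cache first
    if cached is not None:
        return json.loads(cached)            # 2. cache hit: return the value
    record = get_from_database(product_id)   # 3. cache miss: query the database
    cache.set(key, json.dumps(record))       # 4. write it back for next time
    return record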
Amazon ElastiCache:
Amazon ElastiCache is a fully managed, in-memory data store that enhances web applications with high
performance and low latency. It acts as a side cache, enabling sub-millisecond response times for
demanding applications. Key features include:
• High Performance: Delivers rapid data access.
• Fully Managed: No need for tasks like hardware provisioning, software patching, or backups—
ElastiCache handles monitoring and recovery.
• Scalable: Easily scales out, in, or up to meet changing application demands. Supports write and
memory scaling through sharding, and read scaling via replicas.
• Supports Redis and Memcached: Provides flexibility with two popular open-source in-memory
databases.
With ElastiCache, developers can focus on building applications while it manages the underlying
infrastructure.
Redis and Memcached:
Amazon ElastiCache for Memcached can scale up to 20 nodes per cluster, while ElastiCache for Redis
supports up to 250 nodes, enhancing data access performance.
Key features include:
• Amazon VPC Support: Allows you to isolate your cluster within chosen IP ranges for added
security.
• Reliable Infrastructure: Operates on the same dependable infrastructure as other AWS
services.
• High Availability: ElastiCache for Redis offers Multi-AZ deployments with automatic failover for
resilience.
• Data Partitioning: In Memcached, data is distributed across all nodes in the cluster, allowing
for better scalability as demand increases.
This flexibility ensures that ElastiCache can effectively manage varying workloads and support growth.
Memcached versus Redis comparison:
Memcached and Redis are both effective caching engines that help reduce database load, each with
unique advantages. Here’s a comparison of their key features:
• Sub-Millisecond Latency: Both engines deliver sub-millisecond response times by storing data
in memory, enabling faster data retrieval than disk-based databases.
• Horizontal Scalability: Memcached allows you to easily scale out and in by adding or removing
nodes based on demand.
• Multi-Threaded Performance: Memcached's multi-threaded architecture utilizes multiple
processing cores, enhancing its ability to handle more operations as compute capacity increases.
• Advanced Data Structures: Redis supports complex data types such as strings, hashes, lists,
sets, sorted sets, and bitmaps, offering greater versatility.
• Sorting and Ranking: Redis can sort and rank datasets in memory, making it ideal for
applications like game leaderboards that need to display rankings.
• Publish/Subscribe Messaging: Redis features publish/subscribe messaging with pattern
matching, suitable for high-performance applications like chat rooms, real-time comment streams,
and social media feeds.
• High Availability: ElastiCache for Redis ensures high availability with Multi-AZ deployments and
automatic failover in case the primary node fails.
• Persistence: Redis allows for data persistence, meaning data can be saved even if a node fails
or is replaced. In contrast, Memcached does not support persistence, leading to potential data loss
when nodes are terminated or scaled down.
These features make both Memcached and Redis valuable tools for enhancing application performance,
depending on specific needs.
Feature | Memcached | Redis
Sub-millisecond latency | Yes | Yes
Ability to scale horizontally for writes and storage | Yes | No
Multi-threaded performance | Yes | No
Advanced data structures | No | Yes
Sorting and ranking datasets | No | Yes
Publish/subscribe messaging | No | Yes
Multi-AZ deployments with automatic failover | No | Yes
Persistence | No | Yes
ElastiCache components:
A cache node is the fundamental unit of an ElastiCache deployment, consisting of a fixed amount of
secure, network-attached RAM. Each node operates the selected engine from the time the cluster or
replication group was created or last modified. Each node has its own DNS name and port, and it can
function independently or as part of a larger group, known as a cluster.
Caching strategies: Lazy loading:
ElastiCache supports two caching strategies.
The first, lazy loading, loads data into the cache only when it's needed. When your application requests
data, it checks the ElastiCache cache first. If the data is present and up-to-date (a cache hit), ElastiCache
returns it directly. If not (a cache miss), the application retrieves the data from the primary data store,
then writes that data to the cache for future use.
Lazy loading is useful for data that is frequently read but rarely changed. For example, a user’s profile
may be accessed many times a day but only updated a few times a year.
The advantage of lazy loading is that it only caches data that is requested, preventing unnecessary data
from filling the cache. However, a cache miss can lead to delays, as it involves multiple trips to retrieve
the data. Additionally, because data is only written to the cache on a miss, it can become outdated if
changes occur in the database.
Caching strategies: Write-through:
The second caching strategy is write-through, which updates the cache whenever data is written to
the database. This strategy is useful for data that needs real-time updates, allowing you to avoid
unnecessary cache misses for frequently accessed information. Examples include leaderboards, popular
news stories, or recommendations, as these types of data are often updated through specific application
processes.
The main advantage of write-through caching is that it increases the chances of finding the required data
in the cache when requested. However, the downside is that it may lead to caching unnecessary data,
potentially increasing costs.
In practice, both lazy loading and write-through strategies are often used together, making it important
to understand how frequently data changes and to set appropriate time-to-live (TTL) values.
Adding TTL:
Lazy loading can lead to stale data, while write-through caching ensures data is always fresh but may fill
the cache with unnecessary information.
By using a time-to-live (TTL) value for each entry, you can combine the benefits of both strategies
while preventing cache clutter. TTL is a specified duration (in seconds or milliseconds) that determines
how long a key remains valid. When an application tries to access an expired key, it behaves as if the
data is not in the cache, prompting a query to the database and updating the cache. This approach keeps
the data from becoming too stale and ensures that cached values are refreshed periodically from the
database.
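Combining the strategies, a hedged sketch: writes go through to both the database and the cache (write-through), and every cached entry carries a TTL so values are refreshed periodically; save_to_database is a stand-in for the real write path.

import json
import redis

cache = redis.Redis(host="my-cache.abc123.0001.use1.cache.amazonaws.com")  # placeholder

TTL_SECONDS = 300  # cached values expire after five minutes

def update_leaderboard(player_id, score, save_to_database):
    record = {"player": player_id, "score": score}
    save_to_database(record)           # write-through: update the database first...
    cache.setex("score:" + player_id,  # ...then refresh the cached copy with a TTL
                TTL_SECONDS, json.dumps(record))

When a key expires, the next read behaves like lazy loading: it misses, queries the database, and repopulates the cache.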
Three-tier web hosting architecture:
One scenario where you might want to use Amazon ElastiCache is in a traditional, three-tier web hosting
architecture. In this scenario, you want to run a public web application while you still maintain private
backend servers in a private subnet. You can create a public subnet for your web servers that have access
to the internet. At the same time, you can place your backend infrastructure in a private subnet with no
internet access. The database tier of your backend infrastructure might include Amazon Relational
Database Service (Amazon RDS) DB instances and an ElastiCache cluster that provides the in-memory
layer.
In this web hosting architecture diagram:
• Amazon Route 53 enables you to map your zone apex (such as example.com) DNS name to your
load balancer DNS name.
• Amazon CloudFront provides edge caching for high-volume content.
• A load balancer spreads traffic across web servers in Auto Scaling groups in the presentation layer.
• Another load balancer spreads traffic across backend application servers in the Auto Scaling groups
that are in the application layer.
• Amazon ElastiCache provides an in-memory data cache for the application, which removes load
from the database tier.
10. ASSIGNMENT
Assignment 1: Basic EC2 Scaling Setup
Difficulty Level: Beginner
Topic: Scaling Your Compute Resources
Task: Set up a basic Auto Scaling group with EC2 instances.
Instructions:
1. Create a VPC with a suitable CIDR range.
2. Launch an EC2 instance in the VPC.
3. Create an Auto Scaling group with a minimum of 1 instance and a maximum of 3 instances.
4. Configure a scaling policy based on CPU utilization.
5. Document each step with screenshots and explanations.
Expected Output: A guide detailing the creation of the Auto Scaling group, scaling policy, and instance
management.
Assignment 2: Database Scaling Strategies
Difficulty Level: Intermediate
Topic: Scaling Your Databases
Task: Implement horizontal scaling with Amazon RDS Read Replicas.
Instructions:
1. Create an Amazon RDS instance with a specific database engine.
2. Enable Read Replicas for the RDS instance.
3. Test read performance by directing read queries to the Read Replica.
4. Document the steps taken to set up the Read Replicas, including any configurations made.
Expected Output: A comprehensive report with screenshots showing the RDS instance setup, Read
Replica configuration, and performance testing results.
Assignment 3: Highly Available Architecture Design
Difficulty Level: Expert
Topic: Designing an Environment That’s Highly Available
Task: Design and implement a highly available architecture using multiple Availability Zones.
Instructions:
1. Set up a multi-tier application architecture with an Elastic Load Balancer, EC2 instances, and a
database (RDS or DynamoDB).
2. Configure Auto Scaling across multiple Availability Zones.
3. Implement Route 53 for DNS management with health checks.
4. Test the failover mechanism by simulating instance failures.
5. Document the architecture, configurations, and results of the failover tests.
Expected Output: An architecture diagram, configuration documentation, and testing results showing
high availability in action.
Assignment 4: Monitoring and Cost Management
Difficulty Level: Intermediate
Topic: Monitoring
Task: Set up a comprehensive monitoring system using Amazon CloudWatch.
Instructions:
1. Create an EC2 instance and enable CloudWatch monitoring.
2. Set up custom CloudWatch metrics for application performance.
3. Create alarms for high CPU utilization and low disk space.
4. Analyze cost reports in the AWS Cost Management console.
5. Document the CloudWatch setup, metrics created, and any insights gained from cost analysis.
Expected Output: A report with CloudWatch dashboard screenshots, metric configurations, and cost
management findings.
Assignment 5: Infrastructure Automation with CloudFormation
Difficulty Level: Advanced
Topic: Automating Your Infrastructure
Task: Create a CloudFormation template to automate the deployment of a multi-tier application.
Instructions:
1. Write a CloudFormation template to provision an EC2 instance, RDS instance, and an Elastic Load
Balancer.
2. Use parameters to make the template flexible for different environments.
3. Deploy the application using the CloudFormation template.
4. Implement drift detection to monitor changes to the stack.
5. Document the template structure, deployment process, and drift detection results.
Expected Output: A well-structured CloudFormation template with comments, deployment steps, and
drift detection report.
11. PART A QUESTIONS AND ANSWERS
1. What is elasticity in the context of AWS compute resources? K2
Elasticity refers to the ability of a cloud infrastructure to automatically adjust its
resources based on current demand, allowing for scaling up or down to match workload
fluctuations.
2. Explain the difference between vertical and horizontal scaling of databases. K2
Vertical scaling involves increasing the resources (CPU, RAM) of a single database
instance, while horizontal scaling involves adding more instances to distribute the load.
3. How would you configure Auto Scaling for an application experiencing fluctuating traffic? K3
Auto Scaling can be configured by creating a launch configuration, defining the desired
number of instances, setting minimum and maximum instance counts, and establishing
scaling policies based on CloudWatch metrics like CPU utilization.
4. Describe how to implement read replicas in Amazon RDS to enhance database performance. K3
Read replicas can be created in Amazon RDS by selecting an existing RDS instance and
choosing the option to create a read replica. This will offload read queries from the
primary database, improving overall performance.
5. Compare the benefits of using Elastic Load Balancing with a single-instance deployment. K4
Elastic Load Balancing improves application availability and fault tolerance by
distributing incoming traffic across multiple instances, whereas a single-instance
deployment is vulnerable to failures and can lead to downtime.
6. Analyze how CloudWatch can be utilized for monitoring AWS resources effectively. K4
CloudWatch collects and tracks metrics, sets alarms based on specified thresholds, and
provides dashboards for visualizing resource performance, allowing for proactive
management and troubleshooting of AWS resources.
7. Evaluate the effectiveness of horizontal scaling for a web application that experiences sudden traffic spikes. K5
Horizontal scaling is effective in handling sudden traffic spikes because it allows for
adding multiple instances quickly to distribute the load, enhancing availability and
ensuring performance without downtime.
8. Assess the implications of scaling a database vertically versus horizontally in terms of cost and performance. K5
Vertical scaling can lead to higher costs due to the need for more powerful instances
and potential downtime during upgrades, while horizontal scaling can provide better
cost efficiency and flexibility but may require more complex management.
9. Create a monitoring strategy using CloudWatch for an application deployed across multiple AWS regions. K6
A monitoring strategy could involve setting up CloudWatch Alarms for key metrics
across regions, creating a centralized dashboard for visualizing performance, and
implementing automated notifications for anomaly detection to ensure operational
health.
10. Design a highly available architecture using Auto Scaling and Elastic Load Balancing for a web application. K6
The architecture would consist of multiple EC2 instances spread across different
Availability Zones behind an Elastic Load Balancer. Auto Scaling would dynamically
adjust the number of instances based on traffic patterns, ensuring high availability and
performance.
11. What are the primary benefits of automating infrastructure management? K2
Automating infrastructure management reduces manual errors, increases efficiency,
and enables faster deployment of resources, leading to more consistent and reliable
environments.
12. Describe the role of AWS Elastic Beanstalk in application deployment. K2
AWS Elastic Beanstalk simplifies application deployment by automatically handling the
provisioning of infrastructure, load balancing, scaling, and monitoring, allowing
developers to focus on writing code.
13. How can you use AWS CloudFormation to automate infrastructure deployment? K3
AWS CloudFormation allows users to define infrastructure as code using templates,
enabling the automated creation and management of AWS resources by simply
updating the template and deploying it.
14. Describe the process of deploying an application using AWS Elastic Beanstalk. K3
To deploy an application using Elastic Beanstalk, you upload your application code,
configure the environment settings, and Elastic Beanstalk automatically provisions the
required resources and deploys the application.
15. Analyze the potential drawbacks of manual deployment processes compared to automated deployment. K4
Manual deployment processes are prone to human error, can be time-consuming, and
often lead to inconsistencies across environments, whereas automated deployments
enhance repeatability and speed while reducing errors.
16. Compare the advantages of using AWS Elastic Beanstalk versus deploying applications on EC2 directly. K4
Elastic Beanstalk provides built-in management features such as automatic scaling and
monitoring, reducing operational overhead, while deploying on EC2 directly requires
more manual setup and management of infrastructure.
17. Evaluate the impact of automation on the speed of application delivery in a DevOps environment. K5
Automation significantly enhances the speed of application delivery in a DevOps
environment by enabling continuous integration and continuous deployment (CI/CD)
practices, which streamline the development and release processes.
18. Assess the effectiveness of using AWS Elastic Beanstalk for small vs. large applications. K5
For small applications, Elastic Beanstalk offers a quick and easy deployment process,
while for large applications, it can simplify management and scaling, though more
complex configurations may be required to optimize performance.
19. Assess the reasons why an organization should adopt automation in its infrastructure management. K5
Organizations should adopt automation to enhance operational efficiency, improve
consistency across deployments, reduce the risk of human error, and allow teams to
focus on strategic initiatives rather than routine tasks.
20. Evaluate the trade-offs between using a fully managed service like Elastic Beanstalk and managing infrastructure manually. K5
Using Elastic Beanstalk reduces the operational burden and simplifies application
management, but may offer less control over the underlying infrastructure compared
to manual management, which allows for fine-tuned customizations but requires more
effort and expertise.
21. What is caching, and why is it used in web applications? K2
Caching is the process of storing frequently accessed data in a temporary storage
location to improve retrieval speed. It is used in web applications to enhance
performance and reduce latency.
22. Describe the purpose of edge caching. K2
Edge caching involves storing cached content closer to the end user, typically at CDN
edge locations, to minimize latency and improve load times for users by reducing the
distance data must travel.
23. How can you implement caching for web sessions in a web application? K3
Caching for web sessions can be implemented by using a distributed cache (e.g., Redis
or Memcached) to store session data, allowing for quick retrieval across multiple servers
and ensuring session consistency.
24. What steps would you take to configure edge caching with Amazon CloudFront? K3
To configure edge caching with CloudFront, create a CloudFront distribution, specify
the origin server, configure caching behavior settings (such as TTL), and deploy the
distribution to begin caching content at edge locations.
25. Analyze the benefits and drawbacks of using caching in database applications. K4
The benefits of caching in database applications include improved response times and
reduced database load. However, drawbacks can include stale data issues and the
complexity of cache invalidation strategies.
26. Compare the effectiveness of using in-memory caching versus disk-based caching for web applications. K4
In-memory caching provides faster access speeds and lower latency, making it more
effective for real-time applications. Disk-based caching, while slower, allows for larger
data sets to be stored, which can be useful for less frequently accessed data.
27. Evaluate the impact of caching on user experience in web applications. K5
Caching significantly enhances user experience by reducing page load times, decreasing
latency, and allowing for smoother interactions, which can lead to higher user
satisfaction and engagement.
28. Assess the challenges of implementing caching strategies for session management in a distributed system. K5
Challenges include ensuring data consistency across cache nodes, managing session
expiration effectively, and dealing with the potential for cache misses that can lead to
increased latency and user disruption.
29. Evaluate the trade-offs between using local caching versus centralized caching solutions. K5
Local caching reduces latency and allows for faster access but can lead to data
inconsistency. Centralized caching ensures consistency and easier management but
may introduce network latency and a single point of failure.
30. Assess the role of caching in optimizing database performance. K5
Caching optimizes database performance by offloading read requests from the
database, reducing load and response times. However, it requires careful management
to prevent stale data and ensure that cache invalidation is properly handled.
12. PART B QUESTIONS
1. Describe the key features and benefits of Amazon EC2 Auto Scaling, including how it adjusts the number of instances based on varying workload demands. K2
2. Explain the differences between vertical scaling and horizontal scaling in the context of Amazon RDS, and provide examples of scenarios where each method would be most effective. K2
3. Outline the components required to create a highly available architecture using Elastic Load Balancing and Auto Scaling in AWS, and discuss how these components work together to ensure application uptime. K3
4. Discuss the role of Amazon CloudWatch in monitoring AWS resources, including the types of metrics it can track and how it can be configured to trigger alarms based on specific thresholds. K3
5. Identify the main reasons organizations should consider automating their infrastructure management processes, and discuss the potential benefits and challenges associated with automation. K2
6. Describe how AWS CloudFormation enables infrastructure as code (IaC) and detail the steps involved in creating and managing a CloudFormation template for deploying a multi-tier application. K3
7. Explain how AWS Elastic Beanstalk simplifies the deployment and management of applications, including the steps involved in deploying an application using this service. K2
8. Define caching in the context of cloud computing and discuss its advantages for web applications, including the types of data that are typically cached to improve performance. K2
9. Explain the concept of edge caching and how services like Amazon CloudFront utilize edge locations to reduce latency and improve the delivery speed of content to end users. K2
10. Discuss the various strategies for caching databases, including when to implement caching and the potential impacts of caching on database performance and user experience. K3
13. ONLINE CERTIFICATIONS
1. AWS Certified Cloud Practitioner
https://aws.amazon.com/certification/certified-cloud-practitioner/
2. AWS Certified Solutions Architect – Associate
https://aws.amazon.com/certification/certified-solutions-architect-associate/
14. REAL TIME APPLICATIONS
Reduce SaaS Deployment Costs and Time to Market with Amazon FSx for NetApp ONTAP
Software-as-a-service (SaaS) has become the most effective software delivery model in use today, and
cloud’s on-demand resources have made it possible for SaaS providers to increase the pace of build and
deployment. It’s key to use the right cloud architecture to ensure high availability and resiliency while
managing efficiency and service-level agreements (SLAs).
The SaaS build and delivery pace is dependent to a large extent on the selection of data layer
services. Amazon FSx for NetApp ONTAP is fully managed shared storage built on NetApp’s popular
ONTAP file system. It offers enterprise-grade storage on Amazon Web Services (AWS) with built-in data
management capabilities that can meet the stringent performance requirements of SaaS delivery in the
cloud.
In this post, we will look at how Amazon FSx for NetApp ONTAP can ease the complexity of managing,
deploying, and operating the data layer of your SaaS service, while abstracting complex processes to
support data availability, data protection, data security, compliance, and cost efficiency and
sustainability.
NetApp is an AWS Specialization Partner and AWS Marketplace Seller that delivers cloud storage and data
services for protection, security, classification, and observability.
Reduce SaaS Deployment Costs and Time to Market with Amazon FSx for NetApp ONTAP | AWS Partner
Network (APN) Blog
15. ASSESSMENT SCHEDULE
Tentative schedule for the Assessment During 2024-2025 Odd semester
S.No | Name of the Assessment | Start Date | End Date | Portion
1 | IAT 1 | 22.08.2024 | 30.08.2024 | UNIT 1 & 2
2 | IAT 2 | 30.09.2024 | 08.10.2024 | UNIT 3 & 4
3 | REVISION | - | - | UNIT 5, 1 & 2
4 | MODEL | 26.10.2024 | 08.09.2024 | ALL 5 UNITS
16. PRESCRIBED TEXT BOOKS AND REFERENCES
REFERENCES:
1. AWS Certified Solutions Architect Official Study Guide by Joe Baron, Hisham Baz, Tim Bixler
2. Architecting the Cloud by Michael Kavis.
3. AWS Documentation (amazon.com) - https://docs.aws.amazon.com/
4. AWS Skill Builder -
https://explore.skillbuilder.aws/learn/public/learning_plan/view/82/cloud-foundations-
learning-plan?la=sec&sec=lp
5. AWS Academy Cloud Architecting Course -
https://www.awsacademy.com/vforcesite/LMS_Login
17. MINI PROJECT
Mini-Project 1: Basic EC2 Auto Scaling Configuration
• Objective: Set up a simple Auto Scaling group to manage EC2 instances based on traffic demand.
• Tasks:
1. Create an EC2 instance and configure a launch template.
2. Set up an Auto Scaling group with a minimum and maximum number of instances.
3. Configure a scaling policy based on CPU utilization metrics from CloudWatch.
4. Verify that instances scale in and out based on defined policies.
• Difficulty Level: Beginner
Mini-Project 2: Database Scaling with Amazon RDS Read Replicas
• Objective: Enhance the performance of a database by implementing read replicas in Amazon RDS.
• Tasks:
1. Create a primary Amazon RDS database instance.
2. Set up one or more read replicas of the primary database.
3. Modify your application to distribute read requests between the primary instance and the
read replicas.
4. Monitor the performance improvement using Amazon CloudWatch metrics.
• Difficulty Level: Intermediate
Mini-Project 3: Designing a Highly Available Web Application
• Objective: Create a highly available architecture for a web application using multiple AWS services.
• Tasks:
1. Design an architecture that includes EC2 instances, Elastic Load Balancer (ELB), and Auto
Scaling.
2. Deploy the application across multiple Availability Zones.
3. Configure health checks and routing with the ELB to ensure high availability.
4. Document the architecture and provide screenshots of the setup.
• Difficulty Level: Intermediate
Mini-Project 4: Automating Infrastructure Deployment with CloudFormation
• Objective: Use AWS CloudFormation to automate the deployment of a multi-tier application.
• Tasks:
1. Create a CloudFormation template to define resources (VPC, EC2 instances, RDS, etc.).
2. Deploy the stack using the template.
3. Test the application to ensure all components work together.
4. Modify the template to add or update resources and redeploy the stack.
• Difficulty Level: Advanced
Mini-Project 5: Implementing Caching Strategies with ElastiCache
• Objective: Improve application performance by integrating caching using Amazon ElastiCache.
• Tasks:
1. Create an ElastiCache cluster using Redis or Memcached.
2. Implement caching strategies (lazy loading or write-through) in your application.
3. Measure the performance impact of caching on response times and database load.
4. Document the caching strategy and performance metrics.
• Difficulty Level: Advanced
Thank you