Here’s a clean, practical layered cloud architecture you can use as a reference design.
It’s vendor-neutral first, then I map it to AWS/Azure/GCP.
Layered Cloud Architecture (Reference Model)
┌────────────────────────────────────────────────────────────┐
│ 7. Business / SaaS (Apps, Portals, Mobile, APIs to clients)│
├────────────────────────────────────────────────────────────┤
│ 6. Application Services (Microservices, API Gateway,       │
│    Service Mesh, Functions, Containers)                    │
├────────────────────────────────────────────────────────────┤
│ 5. Data & Analytics (DBaaS, Data Lake, Streams, Caches,    │
│    ML pipelines)                                           │
├────────────────────────────────────────────────────────────┤
│ 4. Platform & DevX (CI/CD, IaC, Secrets, Artifact Repos,   │
│    Observability SDKs)                                     │
├────────────────────────────────────────────────────────────┤
│ 3. Cloud Infrastructure (VPC/VNet, Subnets, Load Balancers,│
│    Gateways, CDN, IAM primitives, KMS)                     │
├────────────────────────────────────────────────────────────┤
│ 2. Virtualization/Runtime (Managed K8s, Serverless, VMs,   │
│    Containers, Runtimes, OS)                               │
├────────────────────────────────────────────────────────────┤
│ 1. Physical / Facilities (Regions, AZs, Power, Cooling,    │
│    Physical security – owned by cloud provider)            │
└────────────────────────────────────────────────────────────┘
⇧ Cross-cutting: Security, Compliance, Observability, FinOps
What each layer does (and key design choices)
1. Physical/Facilities
Region and Availability Zone selection, data residency, latency targets.
2. Virtualization/Runtime
Choose VMs for legacy/stateful workloads, containers (Kubernetes) for portable microservices, and serverless for event-driven or bursty workloads.
Standardize base images and runtimes (hardened OS, minimal images).
3. Cloud Infrastructure (Networking & IAM)
VPC/VNet design with public/private subnets, NAT, transit/peering.
Zero-trust: least-privilege IAM roles, identity federation (SSO).
Perimeter: WAF + DDoS protection; Edge/CDN for static/offload.
Encryption: KMS-managed keys, TLS everywhere.
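As a rough illustration of the public/private subnet layout in layer 3, the sketch below carves one public and one private subnet per AZ out of a VPC range using Python's standard ipaddress module. The 10.0.0.0/16 range, the /20 subnet size, and the AZ names are assumptions chosen for the example, not part of the reference design.

```python
import ipaddress

def plan_subnets(vpc_cidr, azs, new_prefix=20):
    """Carve one public and one private subnet per AZ out of a VPC range."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # equal-sized, non-overlapping blocks
    plan = {}
    for az in azs:
        # Blocks are handed out in address order: public first, then private.
        plan[az] = {"public": next(blocks), "private": next(blocks)}
    return plan

# Hypothetical VPC range and AZ names.
plan = plan_subnets("10.0.0.0/16", ["az-a", "az-b", "az-c"])
for az, subnets in plan.items():
    print(az, "public:", subnets["public"], "private:", subnets["private"])
```

A /16 split into /20 blocks yields 16 subnets, so three AZs leave ten blocks spare for data tiers or later growth.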
4. Platform & Developer Experience
IaC (Terraform/Bicep/CloudFormation) for everything.
CI/CD pipelines with policy checks, unit/integration tests.
Secrets in a managed vault; image/artifact registries.
Observability: centralized logs, metrics, traces, alerts, SLOs.
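To make the SLO item concrete, here is a small worked calculation of an error budget. It assumes an availability-style SLO measured over a rolling 30-day window; both the window and the 99.9% target are illustrative.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Minutes of allowed unavailability for an availability SLO over a window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)

def budget_remaining(slo_target, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative means the SLO is breached)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget

# A 99.9% target over 30 days allows about 43.2 minutes of downtime.
print(round(error_budget_minutes(0.999), 1))
# After 21.6 minutes of downtime, about half the budget is left.
print(round(budget_remaining(0.999, 21.6), 2))
```

Alerting on how fast the budget is being spent, rather than on every individual error, is one common way to tie alerts to SLOs.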
5. Data & Analytics
Right store for the job: relational (OLTP), NoSQL (horizontal scale, high read throughput), time-series, search, object storage (data lake).
Streaming (pub/sub, Kafka-style) for events; ETL/ELT to lakehouse.
Backup/restore, PITR, DR replication, data catalog & governance.
6. Application Services
Microservice decomposition, API gateway, service mesh (mTLS, retries).
Async patterns: queues & event buses for decoupling.
Caching at multiple tiers (edge, app, DB read replicas).
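A service mesh usually applies retries outside the application, but the behaviour is easy to show in application code. The sketch below retries a call with exponential backoff and jitter; the function name, the default delays, and the choice of which exceptions count as transient are all assumptions for illustration.

```python
import random
import time

def call_with_retries(operation, max_attempts=4, base_delay=0.1, max_delay=2.0,
                      retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Retry a callable on transient errors with exponential backoff and full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The delay cap doubles each attempt (0.1s, 0.2s, 0.4s, ...) up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))  # jitter avoids synchronized retry storms
```

Retries are only safe for idempotent operations; for anything else, a queue with deduplication is usually the better decoupling tool.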
7. Business/SaaS
Frontends (web/mobile), external partner APIs, reporting.
Feature flags, A/B testing, tenant isolation (if multi-tenant).
Cross-cutting
Security & Compliance: threat modeling, CIS benchmarks, runtime security, continuous compliance (e.g., SOC2, ISO 27001).
Resilience: multi-AZ default; consider multi-region for RTO/RPO.
FinOps: budgets, cost allocation tags, autoscaling, demand-based rightsizing.
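The "multi-AZ default" guidance under Resilience rests on simple availability arithmetic, sketched below. It assumes AZ failures are independent and that any single healthy replica can serve traffic; the 99.5% and 99.9% figures are made-up inputs, not provider SLAs.

```python
def parallel_availability(single, replicas):
    """Availability of N independent replicas where any one is enough."""
    return 1.0 - (1.0 - single) ** replicas

def serial_availability(*components):
    """Availability of a chain where every component must be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result

one_az = 0.995                               # assumed single-AZ availability
two_az = parallel_availability(one_az, 2)    # 1 - 0.005**2 = 0.999975
chain = serial_availability(0.999, two_az)   # e.g. gateway in front of a two-AZ service
print(f"{two_az:.6f} {chain:.6f}")
```

Real failures are often correlated (bad deploys, control-plane outages), so figures like these are an upper bound rather than a prediction.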
Reference Request Flow (typical web workload)
1. Client → CDN/WAF → API Gateway / Ingress
2. Gateway → Service (container/serverless) via Service Mesh
3. Service ↔ Cache (Redis) and DB (managed RDBMS/NoSQL)
4. Service → Queue/Event Bus for async tasks
5. Background workers/functions process jobs, write to object storage or search
6. Observability emits logs/metrics/traces; alerts fire on SLO breaches
7. CI/CD deploys via IaC to K8s/Functions/VMs with canary/blue-green
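Step 3 of the flow (service, cache, database) is commonly implemented as the cache-aside pattern. The sketch below is a minimal in-memory version: a plain dict stands in for Redis and a callable stands in for the database query, so the class and parameter names are hypothetical.

```python
import time

class CacheAside:
    """Cache-aside helper: check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db   # callable(key) -> value; stands in for the DB query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}          # key -> (value, expires_at); stands in for Redis

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                      # cache hit
        value = self._load(key)                  # miss or expired: read from the database
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Call after a write so the next read reloads fresh data."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a cached value can get; explicit invalidation on writes tightens that further at the cost of extra coordination.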
Example Technology Mapping
AWS
Net/IAM: VPC, ALB/NLB, CloudFront, Route 53, WAF, Shield, IAM, KMS
Runtime: EKS/ECS Fargate, Lambda, EC2
Data: RDS/Aurora, DynamoDB, ElastiCache, S3, MSK/Kinesis, OpenSearch
DevX/Obs: CodePipeline/GitHub Actions, CloudWatch, X-Ray, ECR, Secrets Manager, Terraform
Azure
Net/IAM: VNet, Application Gateway/Load Balancer, Front Door, Azure DNS, WAF, Entra ID, Key Vault
Runtime: AKS, Container Apps, Functions, VM Scale Sets
Data: SQL DB/Cosmos DB, Cache for Redis, Event Hubs, Data Lake Storage, AI Search (formerly Cognitive Search)
DevX/Obs: Azure DevOps/GitHub Actions, Monitor, Log Analytics, Container Registry, Bicep/Terraform
GCP
Net/IAM: VPC, Cloud Load Balancing, Cloud CDN, Cloud Armor, IAM, KMS
Runtime: GKE, Cloud Run, Cloud Functions, Compute Engine
Data: Cloud SQL/Spanner/Bigtable/Firestore, Memorystore, Pub/Sub, BigQuery, GCS, Elastic or another Lucene-based search as an alternative
DevX/Obs: Cloud Build/GitHub Actions, Cloud Monitoring/Logging/Trace, Artifact Registry, Terraform
Minimal “starter” blueprint (copy/paste checklist)
Regions/AZs chosen; data residency documented
VPC/VNet with public (edge) + private (app/data) subnets, NAT, route tables
WAF + CDN + DNS; DDoS baseline protections
EKS/AKS/GKE (or serverless) cluster with autoscaling; hardened node images
API Gateway, service mesh (mTLS), sidecar or ambient mesh
Managed DB (PITR on), Redis cache, object storage bucket policies
Queue + event bus for async; scheduled workers
CI/CD with IaC; OPA/policy checks; artifact signing & SBOM
Centralized logs/metrics/traces; SLOs with alert policies
Secrets in vault; KMS CMKs; TLS cert automation
Backups, DR runbooks, RTO/RPO targets tested quarterly
Cost tags/labels, budgets, anomaly detection
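Checklist items such as cost tags are easiest to keep honest when a pipeline checks them automatically. The sketch below flags resources that lack required tags; the tag names and the resource dictionary shape are assumptions, and a real check would read from the provider's inventory API or the IaC plan output.

```python
REQUIRED_TAGS = {"owner", "cost-center", "environment"}  # assumed tagging policy

def missing_tags(resources, required=REQUIRED_TAGS):
    """Map resource id -> sorted list of required tags that are absent or empty."""
    report = {}
    for resource in resources:
        present = {key for key, value in resource.get("tags", {}).items() if value}
        gaps = sorted(required - present)
        if gaps:
            report[resource["id"]] = gaps
    return report

# Hypothetical inventory.
inventory = [
    {"id": "vm-1", "tags": {"owner": "team-a", "cost-center": "42", "environment": "prod"}},
    {"id": "bucket-7", "tags": {"owner": "team-b", "environment": ""}},
    {"id": "db-3"},
]
print(missing_tags(inventory))
```

Failing the pipeline when the report is non-empty keeps untagged spend from reaching production.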
Multi-Region & DR patterns (when needed)
Active/Active: stateless services behind global load balancing; database with multi-region writes (e.g., Spanner, Cosmos DB) or async replication with a conflict-resolution strategy.
Active/Passive: warm standby for stateful tiers; DNS or traffic-manager failover; replicate S3/GCS/ADLS buckets.
Regular chaos/failover drills and data recovery tests.
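The failover and replication points above reduce to two checks that a drill can automate: which region should be active, and whether replication lag still fits the RPO. The sketch below uses hypothetical region names and takes health and lag as plain dictionaries; in practice those values would come from health probes and replication metrics.

```python
def pick_active_region(health, priority):
    """Return the first healthy region in priority order, or None if all are down."""
    for region in priority:
        if health.get(region, False):
            return region
    return None

def rpo_violations(lag_seconds, rpo_seconds):
    """Regions whose replication lag exceeds the RPO target."""
    return sorted(region for region, lag in lag_seconds.items() if lag > rpo_seconds)

# Hypothetical drill: the primary is down, the warm standby takes over.
priority = ["region-primary", "region-standby"]
health = {"region-primary": False, "region-standby": True}
print(pick_active_region(health, priority))
print(rpo_violations({"region-standby": 95.0}, rpo_seconds=60))
```

A standby that is healthy but behind on replication still fails the drill, which is why the two checks are kept separate.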
Security-by-Design (per layer quick hits)
L1–3: Private subnets, least-open SG/NSG rules, egress control, posture management (e.g., AWS Security Hub, Microsoft Defender for Cloud).
L4: Signed artifacts, dependency scanning, IaC drift detection.
L5: Row/column-level security, data masking, CMEK, audit trails.
L6–7: OAuth/OIDC, short-lived tokens, rate limiting, schema validation.
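Rate limiting at L6–7 is often implemented as a token bucket. The sketch below is a single-process version with an injectable clock; the class name and parameters are illustrative, and a gateway would normally keep this state per client in a shared store.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` and a sustained rate of `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self._clock = clock
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self._last) * self.rate)
        self._last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                    # caller should reject, e.g. with HTTP 429
```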
1. What is it?
The NIST Cloud Computing Reference Architecture (by the U.S. National Institute of
Standards and Technology, Special Publication 500-292) is a conceptual model that:
Defines roles in a cloud ecosystem
Shows how they interact
Gives a vendor-neutral blueprint for cloud design
It’s not a technical implementation guide — it’s a role & responsibility framework.
2. Core Components
NIST defines five major actors in the cloud ecosystem:
Actor | Role in the Architecture
Cloud Consumer | Uses cloud services (e.g., an enterprise using SaaS, a dev team deploying to PaaS, or an ops team spinning up VMs in IaaS).
Cloud Provider | Makes cloud services available (responsible for infrastructure, platform, or software delivery).
Cloud Broker | Manages, negotiates, and optimizes the use of cloud services on behalf of consumers (can aggregate, arbitrage, or customize services).
Cloud Carrier | Provides connectivity and transport between the consumer and provider (e.g., ISPs, dedicated links, VPN providers).
Cloud Auditor | Independent party assessing security, performance, and compliance of the cloud services.
3. Service Models
The architecture covers the three NIST-defined service models:
SaaS (Software as a Service) – provider manages everything (consumer only configures and uses).
PaaS (Platform as a Service) – provider manages infrastructure and runtime; consumer builds/deploys apps.
IaaS (Infrastructure as a Service) – provider offers compute, storage, network; consumer manages OS, apps, etc.
4. Deployment Models
Private Cloud – exclusive to one organization
Community Cloud – shared by organizations with similar requirements
Public Cloud – available to the general public
Hybrid Cloud – combination of two or more clouds with orchestration
5. How It Fits Together
High-Level Diagram (NIST SP 500-292 style)
                 ┌─────────────────────┐
                 │   Cloud Consumer    │
                 └──────────┬──────────┘
                            │ Service Use
                 ┌──────────▼──────────┐
                 │    Cloud Broker     │
                 └──────────┬──────────┘
                            │ Mediation
         ┌──────────────────┼──────────────────┐
         │                  │                  │
 ┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
 │ Cloud Auditor │  │ Cloud Carrier │  │ Cloud Provider│
 └───────────────┘  └───────────────┘  └───────┬───────┘
                                               │
                                     ┌─────────▼─────────┐
                                     │  Cloud Services   │
                                     │ SaaS / PaaS / IaaS│
                                     └───────────────────┘
6. Key Interactions
Consumer ↔ Provider – Direct service consumption (APIs, portals)
Consumer ↔ Broker – Broker helps select, combine, or optimize provider services
Provider ↔ Carrier – Network transport & QoS
Provider ↔ Auditor – Security/compliance verification
Broker ↔ Multiple Providers – Aggregation and interoperability
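For tooling that needs to reason about these roles (for example, tagging contract clauses by actor), the five actors and the interactions listed above can be captured as a small lookup table. This is only an illustrative encoding of the list above, not something NIST SP 500-292 prescribes; the identifiers are hypothetical.

```python
from enum import Enum

class Actor(Enum):
    CONSUMER = "Cloud Consumer"
    PROVIDER = "Cloud Provider"
    BROKER = "Cloud Broker"
    CARRIER = "Cloud Carrier"
    AUDITOR = "Cloud Auditor"

# Interactions are symmetric, so each pair is stored as an unordered frozenset.
INTERACTIONS = {
    frozenset({Actor.CONSUMER, Actor.PROVIDER}): "Direct service consumption (APIs, portals)",
    frozenset({Actor.CONSUMER, Actor.BROKER}): "Broker helps select, combine, or optimize provider services",
    frozenset({Actor.PROVIDER, Actor.CARRIER}): "Network transport & QoS",
    frozenset({Actor.PROVIDER, Actor.AUDITOR}): "Security/compliance verification",
    frozenset({Actor.BROKER, Actor.PROVIDER}): "Aggregation and interoperability",
}

def describe(a, b):
    """Return the interaction between two actors, or None if none is listed."""
    return INTERACTIONS.get(frozenset({a, b}))

print(describe(Actor.PROVIDER, Actor.CONSUMER))
```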
7. Why It Matters
Common language for architects, vendors, and regulators
Helps assign clear responsibilities in contracts and SLAs
Serves as a basis for compliance (FedRAMP, ISO 27017, etc.)
Helps reduce vendor lock-in by using standard role definitions
🌩️ Public Cloud
📌 Definition:
A Public Cloud is a cloud environment owned and managed by a third-party provider (like
AWS, Microsoft Azure, Google Cloud). The infrastructure is shared among multiple
organizations (multi-tenant), and users access services via the internet on a pay-per-use
model.
🔑 Features:
Shared infrastructure (multi-tenant model).
On-demand scalability and elasticity.
Accessible via internet from anywhere.
Pay-as-you-go pricing model.
Managed by Cloud Service Providers (CSPs).
✅ Advantages:
Cost-effective: No need for capital investment in hardware/software.
Scalable: Instantly scale resources up or down.
Accessible globally.
High reliability with distributed data centers.
No maintenance overhead for users.
❌ Disadvantages:
Less control over infrastructure and security.
Data privacy concerns since resources are shared.
Internet dependency (downtime if connection fails).
Compliance issues for sensitive industries.
💼 Use Cases:
Startups and small businesses (low cost).
Web hosting and application development.
Storage, backup, and disaster recovery.
AI/ML and data analytics workloads.
🏢 Private Cloud
📌 Definition:
A Private Cloud is a cloud infrastructure dedicated to a single organization. It can be hosted
on-premises (within company data centers) or by third-party providers but is used exclusively
by one entity.
🔑 Features:
Single-tenant model (exclusive use).
Greater control over infrastructure.
Customizable security and compliance.
May be hosted on-premises or externally.
✅ Advantages:
High security and privacy since resources are not shared.
Better control & customization for applications.
Meets regulatory compliance (banking, healthcare, govt.).
Performance consistency due to dedicated resources.
❌ Disadvantages:
High cost (hardware, software, IT staff).
Limited scalability compared to public cloud.
Requires in-house expertise for maintenance.
💼 Use Cases:
Large enterprises with strict compliance needs.
Banking, finance, healthcare, government.
Companies handling sensitive or mission-critical data.
Industries requiring low-latency internal operations.
🔄 Hybrid Cloud
📌 Definition:
A Hybrid Cloud is a combination of Public and Private Clouds. Organizations use private
cloud for sensitive operations and public cloud for scalable workloads, connecting both via
secure networking.
🔑 Features:
Mix of public and private cloud environments.
Data and applications move between clouds.
Provides flexibility, scalability, and security.
Often integrated with orchestration tools.
✅ Advantages:
Flexibility: Place sensitive workloads in private cloud, others in public.
Scalability: Use public cloud for peak demand.
Cost optimization: Balance investment and pay-per-use.
Business continuity with disaster recovery options.
Improved performance with workload distribution.
❌ Disadvantages:
Complex setup & management (integration required).
Security challenges with data moving between clouds.
Higher cost than pure public cloud.
Requires skilled IT teams for orchestration.
💼 Use Cases:
Businesses with fluctuating workloads.
E-commerce (secure payments in private, website hosting in public).
Healthcare (patient data in private, analytics in public).
Enterprise applications requiring both compliance and scalability.
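The placement logic behind hybrid cloud (sensitive workloads stay private, the rest burst to public cloud when private capacity runs out) can be sketched as a simple policy function. The workload names, the CPU-unit capacity model, and the policy itself are simplifying assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    sensitive: bool
    cpu_units: int

def place(workloads, private_capacity):
    """Assign each workload to 'private' or 'public' under a simple bursting policy."""
    placement = {}
    remaining = private_capacity
    # Sensitive workloads are placed first because they may only run privately.
    for w in sorted(workloads, key=lambda w: not w.sensitive):
        if w.sensitive and w.cpu_units > remaining:
            raise ValueError(f"no private capacity left for sensitive workload {w.name}")
        if w.sensitive or w.cpu_units <= remaining:
            placement[w.name] = "private"
            remaining -= w.cpu_units
        else:
            placement[w.name] = "public"  # burst to the public cloud
    return placement

# Hypothetical workloads against 8 CPU units of private capacity.
demo = place(
    [Workload("batch-analytics", False, 10),
     Workload("patient-records", True, 4),
     Workload("intranet-wiki", False, 2)],
    private_capacity=8,
)
print(demo)
```

Real placement decisions also weigh data gravity, egress cost, and latency, which this sketch deliberately leaves out.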
📊 Quick Comparison Table
Feature | Public Cloud 🌐 | Private Cloud 🏢 | Hybrid Cloud 🔄
Ownership | Third-party CSP | Single org (in-house or hosted) | Mix of public & private
Cost | Low (OPEX, pay-as-you-go) | High (CAPEX & OPEX) | Moderate (CAPEX+OPEX)
Scalability | Very high | Limited | High
Security | Lower (shared infra) | High | Balanced
Control | Limited | Full | Partial
Best For | Startups, dev, storage | Enterprises, regulated industries | Enterprises needing both flexibility & security
☁️ Cloud Service Models
1. Infrastructure as a Service (IaaS)
📌 Definition:
IaaS provides virtualized computing resources (like servers, storage, networking, firewalls,
load balancers) over the internet.
Users rent IT infrastructure from a cloud provider instead of buying and maintaining physical
servers.
🔑 Features:
Virtual machines, storage, networks, load balancers.
Self-service provisioning and scaling.
Pay-as-you-go pricing.
High availability with multiple data centers.
✅ Advantages:
Cost saving (no physical hardware).
Scalability & flexibility for computing resources.
Disaster recovery & backup options.
Full control over the OS and software stack running on the virtual infrastructure.
❌ Disadvantages:
Requires skilled IT staff to manage.
Security responsibilities shared with provider.
Can become costly with high usage.
💼 Examples:
AWS EC2 (Elastic Compute Cloud)
Google Compute Engine
Microsoft Azure Virtual Machines
DigitalOcean, Linode
🎯 Use Cases:
Hosting websites & apps.
Testing & development environments.
Storage & backup solutions.
High-performance computing (HPC).
2. Platform as a Service (PaaS)
📌 Definition:
PaaS provides a ready-to-use platform for application development and deployment.
It includes infrastructure + runtime environment + tools (databases, middleware, APIs)
so developers can focus only on coding, not managing servers.
🔑 Features:
Pre-configured development environments.
Middleware, databases, runtime, APIs included.
Auto-scaling for apps.
Integrated development & deployment tools.
✅ Advantages:
Faster development (no need to manage infra).
Reduces complexity of deployment.
Supports collaboration among dev teams.
Scalable & cost-efficient for apps.
❌ Disadvantages:
Limited control over underlying infrastructure.
Vendor lock-in (hard to migrate).
Customization limitations.
💼 Examples:
Google App Engine
Microsoft Azure App Service
AWS Elastic Beanstalk
Heroku
🎯 Use Cases:
Application development & deployment.
APIs & microservices.
Streamlined software lifecycle (DevOps).
Mobile & web apps.
3. Software as a Service (SaaS)
📌 Definition:
SaaS delivers fully functional software applications over the internet on a subscription
basis.
Users don’t manage infrastructure or platforms — they just use the app.
🔑 Features:
Ready-to-use applications.
Accessible via web browsers or apps.
Subscription or freemium pricing.
Multi-tenant architecture.
✅ Advantages:
No installation/maintenance required.
Accessible anywhere (internet-based).
Automatic updates & patches.
Cost-effective for businesses.
❌ Disadvantages:
Less customization possible.
Data security & privacy concerns.
Dependent on internet connectivity.
Vendor lock-in.
💼 Examples:
Google Workspace (Gmail, Docs, Drive)
Microsoft 365
Zoom, Slack, Dropbox
Salesforce CRM
🎯 Use Cases:
Collaboration & productivity tools.
Customer Relationship Management (CRM).
Email & communication.
Online accounting, ERP, HR software.
📊 Comparison Table: IaaS vs PaaS vs SaaS
Aspect | IaaS 🖥️ | PaaS ⚙️ | SaaS 💻
What it provides | Infrastructure (VMs, storage, networking) | Platform for development (tools, runtime, DBs) | Ready-made software apps
User controls | OS, applications, data | Applications, data | Only user settings/data
Cost model | Pay-as-you-use | Subscription/pay-as-you-use | Subscription
Target users | IT admins, sysadmins | Developers | End-users
Scalability | High | High | High
Examples | AWS EC2, Azure VM, GCP Compute Engine | Google App Engine, Azure App Services, Heroku | Gmail, Zoom, Salesforce
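The "User controls" row is the core of the comparison, and it can be expressed as a small responsibility lookup. The layer names below are an assumed simplification of the stack, and "runtime" is placed on the consumer side for IaaS because the provider only takes it over in PaaS.

```python
LAYERS = ["physical", "virtualization", "os", "runtime", "applications", "data", "user settings"]

# What the consumer manages in each service model; everything else is the provider's job.
CONSUMER_MANAGED = {
    "IaaS": {"os", "runtime", "applications", "data", "user settings"},
    "PaaS": {"applications", "data", "user settings"},
    "SaaS": {"data", "user settings"},
}

def who_manages(model, layer):
    """Return 'consumer' or 'provider' for a layer under a given service model."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "consumer" if layer in CONSUMER_MANAGED[model] else "provider"

for model in ("IaaS", "PaaS", "SaaS"):
    print(model, {layer: who_manages(model, layer) for layer in LAYERS})
```

Reading the output top to bottom shows the provider's share growing from IaaS to SaaS, which is the same progression the analogy describes.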
💡 Simple Analogy:
IaaS → Renting a plot of land 🏞️ (you build your own house).
PaaS → Renting a furnished apartment 🏢 (you just decorate & live).
SaaS → Booking a hotel room 🏨 (everything ready, just use).