Tiered Compute Architecture

Compute Server Config:

● CPU: 5th Gen Intel® Xeon® Silver 4514Y (16 Cores, 32 Threads @ 2.00 GHz)
● RAM: 128 GB DDR5 4800 MHz ECC Registered
● Storage: 2 x 4 TB Samsung 990 Pro M.2 NVMe Gen 4 SSDs (8 TB total)

Other Key Features

● Motherboard: ASRock Rack SPC741D8-2L2T/BCM with PCIe 5.0 support
● Networking: Dual 10GbE and dual 1GbE LAN ports
●​ Management: Integrated IPMI for remote management
●​ Chassis: 1U Rack Mount with 4 hot-swap bays
●​ Power Supply: Dual 650W (80+ Platinum) redundant power supplies
●​ Warranty Options: 3-Year Return-to-Base (RTB) on all components / 3 Years Onsite

NAS Server Config:

● CPU: 5th Gen Intel® Xeon® Silver 4509Y (8 Cores, 16 Threads @ 2.60 GHz)
● RAM: 32 GB DDR5 4800 MHz ECC Registered
● Storage:
  ○ OS Drive: 1 x 500 GB Crucial P3 Plus M.2 NVMe SSD
  ○ Data Drives: 6 x 16 TB NL-SAS 7.2K RPM Enterprise HDDs

Other Key Features

● Motherboard: ASRock Rack SPC741D8-2L2T/BCM with PCIe 5.0 support
● Networking: Dual 10GbE and dual 1GbE LAN ports
●​ Management: Integrated IPMI for remote management
●​ Chassis: Chenbro RM23808-800R 2U with 8 hot-swap drive bays
●​ Power Supply: Dual 800W (80+ Platinum) redundant power supplies
●​ Cooling: Hot-swap cooling fans
●​ Warranty: 3-Year Return-to-Base (RTB)
Core Architecture: Clustered & Resilient

System Architecture Plan: 5 Compute Nodes & 4 Storage Nodes

The goal is to move beyond individual servers and create two robust clusters: a
Compute Cluster for running virtual machines and a Distributed Storage
Cluster for providing a single, unified storage pool. This design ensures high
availability, scalability, and simplified management.

Tiered Compute Architecture

We will logically divide our five compute servers into two separate groups, or
"clusters," which will share the same underlying network and storage
infrastructure.

● Compute Cluster A: Mission-Critical HA Cluster (2 Nodes)
● Compute Cluster B: General Purpose & Dev/Test Cluster (3 Nodes)

Mission-Critical HA Cluster (2 Nodes)

This cluster will be engineered for maximum uptime to host our most essential
applications, where downtime is not an option.

● Configuration: The two compute servers will be configured in a cluster
using a hypervisor like Proxmox VE or VMware ESXi, with the High
Availability feature enabled.
●​ How it Works: The VMs for our critical apps will run on these two nodes.
The cluster will constantly monitor the health of both servers.
●​ Failure Scenario: If one of the two servers fails, the HA system will
automatically and immediately restart its virtual machines on the
second, healthy server. This ensures service continuity with minimal
disruption.
● Resource Planning: To guarantee a successful failover, we should run
this cluster at less than 50% of its total resource capacity. This ensures
that if one node goes down, the surviving node has enough free CPU and
RAM to run the critical VMs from both servers combined (a quick sizing
check is sketched below).
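
To make the 50% rule concrete, here is a minimal Python sizing check. The
per-node figures come from the compute spec above; the VM inventory is a
purely illustrative assumption, not an actual workload list.

```python
# Minimal sketch of the HA sizing rule: the combined mission-critical VM load
# must fit on ONE node (16 cores / 128 GB RAM per the compute spec above).
NODE_CORES = 16
NODE_RAM_GB = 128

# Hypothetical inventory of the critical VMs spread across the two HA nodes.
critical_vms = [
    {"name": "erp-db",      "vcpus": 8, "ram_gb": 48},
    {"name": "file-server", "vcpus": 4, "ram_gb": 16},
    {"name": "auth",        "vcpus": 2, "ram_gb": 8},
]

total_vcpus = sum(vm["vcpus"] for vm in critical_vms)
total_ram_gb = sum(vm["ram_gb"] for vm in critical_vms)

print(f"Combined critical load: {total_vcpus} vCPUs, {total_ram_gb} GB RAM")
if total_vcpus <= NODE_CORES and total_ram_gb <= NODE_RAM_GB:
    print("OK: a single surviving node can absorb a full failover.")
else:
    print("Over-committed: trim the critical VM set or add capacity.")
```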

General Purpose Cluster (3 Nodes)

The remaining three compute servers will form a larger, separate resource pool
for all other tasks.

●​ Purpose: This cluster is ideal for workloads that are not business-critical.
This includes:
○​ Development and testing environments.
○​ Staging servers.
○​ Continuous Integration/Continuous Deployment (CI/CD) runners.
○​ Applications that can tolerate brief periods of downtime.
●​ Benefit of Isolation: This structure is key to our stability. A risky software
test or a resource-intensive development task running on this
general-purpose cluster cannot crash or impact the performance of our
mission-critical applications running on the separate HA cluster.
● Flexibility: With three nodes, we have a larger pool of resources to
experiment and work with, without putting our core services at risk. We can
still cluster them for easier management, but we may not require the same
aggressive HA settings.

Storage Cluster Technology Options (4 Nodes)

Our four storage servers, each equipped with six 16 TB drives, create a total raw
capacity of 384 TB. The technology we choose will determine the final usable
space and management overhead. Below are the three options for consideration;
a short calculation after them reproduces each option's usable-capacity figure.

Option 1: Ceph Cluster

This is a true distributed system where all 24 drives across all four nodes
form a single, self-healing storage pool.

● Management Complexity: Very High. Ceph is incredibly powerful but has
a steep learning curve, requiring significant command-line expertise.
●​ Usable Storage Calculation: Using the standard 3x replication for
resiliency.
○​ 384 TB Raw / 3 = ~128 TB Usable
●​ Pros: Extreme resiliency (can survive an entire server failure), unified
storage (block, file, and object), and massive scalability.
●​ Cons: Highest management overhead and lowest storage efficiency.

Option 2: TrueNAS SCALE (4 Separate Pools)

In this model, we would run an independent TrueNAS instance on each node,
managing its own RAID-Z2 array.

● Management Complexity: Medium. TrueNAS has a polished web interface
and is far simpler than Ceph, offering a great balance of power and
usability.
●​ Usable Storage Calculation: Using RAID-Z2 (two-drive parity) on each
node.
○​ Per Node: (6 drives - 2 parity) x 16 TB = 64 TB
○​ Total Usable: 64 TB/node x 4 nodes = 256 TB Usable
●​ Pros: Excellent data integrity (ZFS), great storage efficiency, and a robust,
mature platform.
●​ Cons: Results in four separate storage pools to manage, not a single
unified namespace.

Option 3: OpenMediaVault (OMV) (4 Separate Pools)

Similar to the TrueNAS approach, this option prioritises simplicity above
all else.

● Management Complexity: Very Easy. OpenMediaVault is the easiest of the
three to manage, with a clean and straightforward web interface.
●​ Usable Storage Calculation: Using standard Linux RAID 6 on each node.
○​ Per Node: (6 drives - 2 parity) x 16 TB = 64 TB
○​ Total Usable: 64 TB/node x 4 nodes = 256 TB Usable
●​ Pros: Extremely simple to set up, lightweight, and perfect for basic, reliable
network storage.
●​ Cons: Lacks the advanced features and data integrity checksumming of
ZFS found in TrueNAS.
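
The usable-capacity figures quoted for the three options come from
straightforward arithmetic, reproduced in the short Python sketch below. It
uses the nominal drive sizes from the spec and ignores filesystem overhead and
TB/TiB conversion, so the outputs are the same rough numbers used above.

```python
# Minimal sketch reproducing the usable-capacity figures quoted above for the
# 4-node, 6 x 16 TB layout (nominal TB, overheads ignored).
NODES, DRIVES_PER_NODE, DRIVE_TB = 4, 6, 16
raw_tb = NODES * DRIVES_PER_NODE * DRIVE_TB               # 384 TB raw

ceph_usable   = raw_tb / 3                                 # Option 1: 3x replication
raidz2_usable = NODES * (DRIVES_PER_NODE - 2) * DRIVE_TB   # Option 2: RAID-Z2 per node
raid6_usable  = NODES * (DRIVES_PER_NODE - 2) * DRIVE_TB   # Option 3: RAID 6 per node

print(f"Raw capacity:       {raw_tb} TB")
print(f"Ceph (3x replicas): {ceph_usable:.0f} TB usable, single pool")
print(f"TrueNAS RAID-Z2:    {raidz2_usable} TB usable across 4 separate pools")
print(f"OMV RAID 6:         {raid6_usable} TB usable across 4 separate pools")
```
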
Operational Usage Model & Data Strategy

Our architecture is built on a powerful tiered-storage principle. We will
leverage the distinct advantages of both the local NVMe SSDs in our compute
nodes for speed and the central storage cluster for capacity and data
resilience. This strategy ensures optimal performance for active workloads
while providing a robust backend for bulk data and protection.

High-Performance Tier: Local Compute Storage

Each of our five compute nodes is equipped with 8 TB of extremely fast Gen 4
NVMe storage. This tier is dedicated to workloads where speed and low latency
are critical (a simple latency probe contrasting the two storage tiers is
sketched after the list below).

●​ VM Operating Systems: The primary disk for every virtual machine (the
C:\ drive or / root partition) will be hosted on this local NVMe storage. This
ensures VMs boot quickly and feel responsive.
●​ High-I/O Applications: This is the ideal location for workloads that
perform intensive read/write operations, such as:
○​ Databases (SQL, PostgreSQL, etc.)
○​ Application Caches and temporary files.
○​ CI/CD build directories for fast code compilation.
●​ Mission-Critical VMs: For the 2-node HA cluster, running the OS disks of
our critical applications locally guarantees the highest possible
performance.
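
As a rough way to see the latency gap that motivates this placement, the
sketch below times synchronous 4 KiB writes against two directories. Both
paths are placeholders for wherever the local NVMe pool and the central
cluster eventually get mounted; they are assumptions, not paths defined in
this plan.

```python
# Minimal sketch: compare fsync latency of a local NVMe path vs. a mount
# backed by the central storage cluster. The paths below are placeholders.
import os
import time

def fsync_latency_ms(path, block=4096, iterations=200):
    """Write and fsync small blocks, returning average latency in milliseconds."""
    buf = os.urandom(block)
    test_file = os.path.join(path, "latency_probe.tmp")
    with open(test_file, "wb") as f:
        start = time.perf_counter()
        for _ in range(iterations):
            f.write(buf)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    os.remove(test_file)
    return elapsed / iterations * 1000

if __name__ == "__main__":
    targets = [
        ("local NVMe",      "/var/lib/vz"),      # assumed local NVMe-backed path
        ("central cluster", "/mnt/pve/central"),  # assumed NFS mount of the cluster
    ]
    for label, path in targets:
        print(f"{label:16s} {fsync_latency_ms(path):7.2f} ms per 4 KiB fsync")
```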

Capacity Tier: Central Storage Cluster

Our 4-node storage cluster provides a massive 256 TB usable pool. This tier
serves as the workhorse for bulk data, backups, and centralised resources. It will
be connected to all compute nodes over our redundant 10GbE network (a sketch
of how this pool could be registered on the compute nodes follows the list
below).

●​ Additional VM Data Disks: While the VM's OS runs on fast local storage,
any large data volumes (like a D:\ drive for files or a /data mount) will be
provisioned from the central cluster. This gives our VMs access to huge
amounts of storage without consuming the premium local NVMe space.
●​ Centralised Backups: The storage cluster is the primary target for all our
data protection tasks. We will configure backup software (like Proxmox
Backup Server or Veeam) to store nightly snapshots of every VM in the
entire lab. The high resilience of the storage nodes makes it a perfect
repository for our critical backups.
●​ ISO Library and VM Templates: All operating system installation files
(ISOs) and master VM templates will be stored here. This allows us to
deploy new VMs to any compute node instantly from a single, centralised
source.
●​ General Office File Shares: We can create standard SMB/NFS shares on
the storage cluster for company-wide use, such as storing project files,
archives, and other shared documents.
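
If the compute nodes run Proxmox VE, the central pool could be exported over
NFS on the storage VLAN and registered once as cluster-wide shared storage.
The sketch below uses the proxmoxer Python client as one possible way to do
that; the hostname, credentials, server address, export path, and storage ID
are illustrative assumptions rather than values fixed by this document.

```python
# Hedged sketch, assuming Proxmox VE on the compute nodes and the proxmoxer
# client library; hostname, credentials, server IP and export path are
# placeholders, not values taken from this document.
from proxmoxer import ProxmoxAPI

pve = ProxmoxAPI("pve-compute-01.lab.local", user="root@pam",
                 password="changeme", verify_ssl=False)

# Register the storage cluster's NFS export once; because storage definitions
# are cluster-wide in Proxmox, every compute node then sees the same pool.
pve.storage.post(
    storage="central-nfs",               # storage ID shown in the Proxmox UI
    type="nfs",
    server="192.168.30.10",              # storage-VLAN address of the NAS head
    export="/mnt/pool0/pve",             # NFS export on the storage cluster
    content="images,iso,vztmpl,backup",  # data disks, ISOs, templates, backups
)
```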

A Practical Workflow Example

Here’s how we would deploy a new critical file server VM (a scripted version
of these steps is sketched after the list):

1. Deployment: We deploy the VM from a master template stored on the central
storage cluster onto one of the mission-critical compute nodes.
2.​ Disk Placement:
○​ The VM's OS disk (100 GB) is created on the compute node's local
NVMe storage for speed.
○​ A data disk (20 TB) for the file share is created on the central
storage cluster and attached to the VM.
3.​ Operation: The VM runs with a snappy, responsive OS while serving a
huge amount of data from the resilient storage backend.
4.​ Backup: A nightly backup job takes a snapshot of the entire VM and
stores it safely on the central storage cluster.
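
Assuming Proxmox VE as the hypervisor, the same steps could be scripted with
the proxmoxer client roughly as follows. Node names, VM IDs, and storage IDs
are placeholders, and the nightly backup job itself would be configured
separately (for example in Proxmox Backup Server), so it appears here only as
a comment.

```python
# Hedged sketch of the workflow above using the proxmoxer client, assuming
# Proxmox VE; node names, VM IDs and storage IDs are placeholders.
from proxmoxer import ProxmoxAPI

pve = ProxmoxAPI("pve-compute-01.lab.local", user="root@pam",
                 password="changeme", verify_ssl=False)

node, template_id, new_id = "pve-compute-01", 9000, 101

# 1. Deployment: full-clone the master template onto an HA compute node,
#    placing the cloned OS disk on the node's local NVMe-backed storage.
pve.nodes(node).qemu(template_id).clone.post(
    newid=new_id, name="fileserver-01", full=1, storage="local-nvme")

# 2. Disk placement: attach a data disk allocated from the central storage
#    cluster (size in the storage:size syntax is in GiB; 20480 GiB is roughly
#    the 20 TB share in the example above).
pve.nodes(node).qemu(new_id).config.post(scsi1="central-nfs:20480")

# 3. Operation: start the VM.
pve.nodes(node).qemu(new_id).status.start.post()

# 4. Backup: the nightly vzdump / Proxmox Backup Server job targeting the
#    central cluster is configured separately in the datacenter backup settings.
```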

Network Architecture & Configuration

To support our 9-node cluster, we will implement a fast, secure, and resilient
network foundation. The design prioritises performance through 10GbE
connectivity and security through network segmentation using VLANs.

Core Switching Hardware

We will build the network around a central, high-performance switching fabric.

● Recommendation: A Layer 3 Lite or full Layer 3 switch is recommended.
This allows the switch to handle routing between our internal VLANs
directly, which is more efficient than sending that traffic to the
office's core router.
●​ Port Requirements: The switch (or switches) must have at least 18 x
10GbE SFP+ ports to connect all nine servers with redundant links. A
24-port model would be a good choice, providing room for future
expansion.
●​ Vendor Options: We can source this from a variety of reliable enterprise
vendors such as Cisco, Aruba, MikroTik, or Ubiquiti.
●​ Uplink: The switch fabric will connect to the main office core router via a
10GbE SFP+ or faster link to ensure a high-speed connection for the
services that require it.

VLAN & IP Address Strategy

We will segment the network into separate Virtual LANs (VLANs) to enhance
security and organise traffic. The plan is summarised in the table and notes
below, and a short script at the end of this section sanity-checks the
addressing plan.

| VLAN ID | Name | Purpose | Example Subnet | Internet Access |
|---------|------|---------|----------------|-----------------|
| 10 | MGMT_VLAN | For accessing server IPMI and the Proxmox/hypervisor management interfaces. | 192.168.10.0/24 | No |
| 20 | VM_TRAFFIC_VLAN | The main network for all general purpose Virtual Machines. | 192.168.20.0/24 | Yes |
| 30 | STORAGE_VLAN | (Recommended) A dedicated, isolated network for storage traffic between compute and storage nodes (NFS/iSCSI). | 192.168.30.0/24 | No |

● Management VLAN (10): This network is strictly for internal
administration. Firewall rules will be applied to block all internet
access to and from this VLAN, securing our core infrastructure.
●​ VM Traffic VLAN (20): This will be the default network for our VMs,
allowing them to communicate with each other and, where permitted,
access the internet through the office's core router.
●​ Storage VLAN (30): It is a best practice to create a dedicated VLAN for
storage traffic. This isolates the heavy, performance-sensitive traffic,
preventing it from interfering with management or VM traffic. We can also
enable optimisations like Jumbo Frames on this VLAN to improve
throughput.
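
The VLAN plan can also be captured as data and sanity-checked
programmatically. The sketch below is a minimal example using only the Python
standard library; the subnets mirror the table above and the overlap check is
simply a safety net before the plan is pushed to the switch.

```python
# Minimal sketch of the VLAN plan above as data, using the standard-library
# ipaddress module to sanity-check the subnets and print a summary.
import ipaddress

vlans = [
    {"id": 10, "name": "MGMT_VLAN",       "subnet": "192.168.10.0/24", "internet": False},
    {"id": 20, "name": "VM_TRAFFIC_VLAN", "subnet": "192.168.20.0/24", "internet": True},
    {"id": 30, "name": "STORAGE_VLAN",    "subnet": "192.168.30.0/24", "internet": False},
]

for v in vlans:
    net = ipaddress.ip_network(v["subnet"])
    print(f"VLAN {v['id']:>3} {v['name']:<16} {net}  "
          f"{net.num_addresses - 2} usable hosts  "
          f"internet={'yes' if v['internet'] else 'no'}")

# Quick overlap check between the three subnets.
nets = [ipaddress.ip_network(v["subnet"]) for v in vlans]
assert not any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:]), \
    "VLAN subnets overlap"
```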
