UNIT IV
CLOUD INFRASTRUCTURE AND VIRTUALIZATION
Data Center Infrastructure and Equipment – Virtual Machines –
Containers – Virtual Networks - Virtual Storage.
Explain Data Center Infrastructure and Equipment in detail.
What is a Data Center?
A data center is a facility that houses computer systems, servers, networking
equipment, and storage to manage and process large amounts of data.
Key components of a cloud data center infrastructure:
Servers:
The core processing units, responsible for running applications and handling
data processing.
Storage Systems:
Devices like hard drives, SSDs, and storage area networks (SANs) to store large
volumes of data.
Networking Equipment:
Routers, switches, firewalls that facilitate data communication between servers
and external networks.
Power Supply:
Uninterruptible Power Supplies (UPS) and backup generators to ensure
continuous power even during outages.
Cooling Systems:
HVAC systems to maintain optimal temperature and humidity levels to prevent
overheating.
Security Measures:
Physical security like access controls, surveillance, and cyber security solutions
to protect data.
PoD
PoD (Point of Delivery) is a basic unit of organization in a data center. It
includes a specific set of equipment, such as servers, storage, and networking
components that work together as a functional group.
Size of a Pod
The size of a Pod (Point of Delivery) in a data center can vary depending on
the design and purpose of the facility, but a typical pod is designed to fit a
standard set of racks and equipment while optimizing power, cooling, and
network connectivity. Here are some general guidelines:
1. Rack Capacity:
o A pod usually contains 10 to 20 racks.
o Each rack is a standardized unit for holding servers, storage devices,
and networking equipment.
2. Space Requirement:
o Pods can occupy anywhere from 200 to 1,000 square feet, depending on
the facility's design and the number of racks.
3. Power Allocation:
o Each pod may have a dedicated power supply, typically in the range of
50–250 kilowatts, depending on the load requirements (see the worked
example after this list).
4. Cooling Zones:
o Pods are often designed to fit within specific cooling zones, making
airflow and temperature management efficient.
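A quick worked example of how these figures relate (the numbers are only
illustrative): a pod with 20 racks and a 200 kW power budget averages
200 kW / 20 racks = 10 kW per rack, which in turn limits how many servers
each rack can hold.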
Power for a Pod
1. Power Distribution Units (PDUs):
o Each pod has dedicated PDUs to distribute electricity to individual
racks.
o PDUs can handle power loads ranging from 50 kW to 250 kW.
2. Redundancy:
o Redundant power supplies ensure reliability. If one power source
fails, another takes over to prevent downtime.
3. Uninterruptible Power Supply (UPS):
o A UPS provides temporary power in case of a power outage,
ensuring continuous operation until backup generators kick in.
4. Power Monitoring:
o Sensors and software track energy consumption to optimize power
usage and prevent overloads.
Power Efficiency Techniques
1. Energy-Efficient Hardware:
o Using servers and networking devices designed for low power
consumption.
2. Renewable Energy:
o Incorporating solar, wind, or other renewable energy sources to
power operations.
3. Dynamic Power Management:
o Adjusting power delivery based on workload demands to optimize
efficiency.
4. Backup Power Systems:
o Redundant power supplies, batteries, and generators ensure
continuous operation during outages.
Cooling Techniques in Data Centers
Data centers use these methods to manage heat:
1. Raised Floor Cooling:
o Cool air flows through raised floors to racks.
o Hot air is vented away.
2. Hot/Cold Aisles:
o Hot Aisles: Expel hot air.
o Cold Aisles: Direct cool air to equipment.
o Barriers keep hot and cold air separate.
3. Exhaust Ducts:
o Funnel hot air outside or to cooling systems.
o Reduces air conditioning load.
4. Lights-Out Operations:
o The facility is operated remotely without on-site staff, reducing the heat
and power consumed by lighting.
North-South and East-West Network Traffic in Data Centers
Data center network traffic can be categorized into two primary types based on
the direction of data flow:
1. North-South Traffic
Definition:
o Traffic that flows into or out of the data center.
o Typically involves communication between the data center and
external clients or users.
Examples:
o A user accessing a web application hosted in the data center.
o Data being uploaded or downloaded to/from cloud storage.
2. East-West Traffic
Definition:
o Traffic that flows within the data center.
o Primarily involves server-to-server or rack-to-rack communication.
Examples:
o Communication between application servers and database servers.
o Replication of data across storage systems.
Spine-Leaf Architecture
Components of the Spine-Leaf Architecture
1. Spine Layer:
o Composed of high-speed switches (e.g., Spine 1 through Spine 4).
o Functions as the backbone of the data center network.
o Ensures high-bandwidth connectivity between leaf switches.
2. Leaf Layer:
o Includes switches that connect directly to servers or racks (e.g., Leaf 1
through Leaf 6).
o Each leaf switch connects to every spine switch.
3. Racks:
o Represent groups of servers or storage devices, each connected to a
single leaf switch (e.g., Rack 1 through Rack 6).
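The connectivity rule above can be sketched in a few lines of Python (the
names NUM_SPINES, NUM_LEAVES, and equal_cost_paths are illustrative, not part
of any real product): because every leaf connects to every spine, any two
leaves are exactly two switch hops apart, and east-west traffic can be spread
across all spines.

NUM_SPINES = 4
NUM_LEAVES = 6

# Every leaf switch has one uplink to every spine switch.
uplinks = {f"leaf{l}": [f"spine{s}" for s in range(1, NUM_SPINES + 1)]
           for l in range(1, NUM_LEAVES + 1)}

def equal_cost_paths(src_leaf, dst_leaf):
    # Any leaf-to-leaf path crosses exactly one spine, so every pair of
    # racks is two hops apart and traffic can be balanced over all spines.
    return [(src_leaf, spine, dst_leaf) for spine in uplinks[src_leaf]]

print(equal_cost_paths("leaf1", "leaf6"))
# [('leaf1', 'spine1', 'leaf6'), ('leaf1', 'spine2', 'leaf6'), ...]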
Explain Virtual Machines in detail.
Virtual machine
A virtual machine (VM) is a software-based simulation of a computer that runs
an operating system and applications like a physical machine.
Examples of Virtualization Software:
VMware Workstation
Oracle VirtualBox
Microsoft Hyper-V
KVM (Kernel-based Virtual Machine)
Key Characteristics:
Hardware Virtualization: the VM is presented with simulated hardware
(CPU, memory, disk, and network interfaces).
Operating System Independence: the guest OS inside a VM can differ from
the host machine's OS.
Resource Allocation: each VM is assigned a share of the physical machine's
CPU, memory, and storage.
Isolation: each VM runs separately, so a failure in one VM does not affect
the others.
Approaches to Virtualization
1. Software emulation
Software emulation is a technique that allows one computer system to imitate
another.
Example:
It enables a computer (C1) to run software (P) that was originally designed for
another type of computer (C2). The emulator software translates the instructions
meant for C2 into instructions that C1 can understand and execute.
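A toy Python sketch of this idea (the instruction set below is invented purely
for illustration): the emulator running on C1 fetches each instruction intended
for C2 and carries out an equivalent action itself.

def emulate(program):
    registers = {"R0": 0, "R1": 0}
    for op, *args in program:            # fetch-decode-execute loop on C1
        if op == "LOAD":                 # LOAD Rx, value
            registers[args[0]] = args[1]
        elif op == "ADD":                # ADD Rx, Ry  ->  Rx = Rx + Ry
            registers[args[0]] += registers[args[1]]
        elif op == "PRINT":              # PRINT Rx
            print(registers[args[0]])
    return registers

# Program P, written for the fictional machine C2, runs unchanged on C1:
emulate([("LOAD", "R0", 2), ("LOAD", "R1", 3),
         ("ADD", "R0", "R1"), ("PRINT", "R0")])   # prints 5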
2. Para-virtualization
Para-virtualization allows multiple operating systems to run on a computer at
the same time by using a piece of software known as a hypervisor to control the
operating systems.
Para-virtualization has the advantage of high-speed execution and the
disadvantage that the operating system's code must be altered to replace
privileged instructions before it can run.
Without Para-Virtualization:
The OS would directly access the hardware to perform this operation.
In a virtualized environment, this direct access isn't possible since the
hypervisor controls the hardware.
With Para-Virtualization:
The OS is modified so that instead of directly accessing the hardware, it makes
a call to the hypervisor whenever it needs to perform a privileged operation.
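A minimal sketch of this modification (illustrative Python only; the names
Hypervisor, hypercall_write_block, and ParaVirtGuestOS are hypothetical):

class Hypervisor:
    def __init__(self):
        self.disk = {}                          # stands in for the physical disk

    def hypercall_write_block(self, guest_id, block_no, data):
        # The hypervisor performs the privileged work on the guest's behalf.
        self.disk[(guest_id, block_no)] = data
        return "OK"

class ParaVirtGuestOS:
    def __init__(self, guest_id, hypervisor):
        self.guest_id = guest_id
        self.hv = hypervisor

    def write_block(self, block_no, data):
        # Modified OS code: instead of touching the hardware directly, it
        # asks the hypervisor to do so (a "hypercall").
        return self.hv.hypercall_write_block(self.guest_id, block_no, data)

hv = Hypervisor()
guest = ParaVirtGuestOS("vm-1", hv)
print(guest.write_block(0, b"boot sector"))     # OK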
3. Full virtualization
Full virtualization provides the capability for multiple operating systems to run
simultaneously on a single physical machine, without the need to modify the
operating system's code.
In full virtualization, the guest operating system directly accesses the simulated
hardware provided by the hypervisor; because the hypervisor fully emulates the
physical hardware, the guest OS can run unmodified.
Conceptual Organization of VM System
Software is loaded onto a server to enable the creation of one or more
VMs.
Tenants can boot their own operating systems on these VMs to run
applications.
The hypervisor is the key software that creates and manages VMs.
It controls the underlying hardware and ensures each VM operates
independently.
Each VM runs its own operating system and applications, isolated from
other VMs.
How VMs Run Apps
OS Starts First: The OS initializes when the computer boots up, setting the
foundation for the system's operation.
OS Starts an App: The OS loads an application into memory and starts it,
allowing the app to run.
App Invokes an OS Service: When the application needs to perform tasks
like accessing files, it requests these services from the OS.
Code in Memory: Both the OS code and the application code are loaded
into memory for execution.
Execution in Kernel Mode: The OS runs in kernel mode, giving it full
control over the hardware and system resources. This is necessary for
performing low-level tasks that require direct hardware access.
Execution in User Mode: The application runs in user mode, which restricts
its access to hardware for security reasons. When the app needs to perform
privileged operations, it makes system calls to the OS, temporarily switching
to kernel mode.
Privilege Levels:
1. Kernel Mode: Allows the operating system to execute all instructions for
full system control.
2. User Mode: Restricts applications to basic instructions for security.
How the Hypervisor and VM Work Together
Hypervisor Starts First:
The hypervisor initializes when the physical machine boots up. It's responsible
for managing the virtual machines (VMs).
Hypervisor Creates a VM and Starts an OS:
The hypervisor creates a VM and loads an operating system (OS) into it. This
OS runs within the virtual environment provided by the hypervisor.
OS Loads Code into Memory:
The operating system loads its code into memory, preparing to manage
applications and system processes.
Hypervisor Code Execution:
The hypervisor's code runs in its own mode, managing the VMs and handling
privileged instructions.
OS Starts an App:
The OS within the VM starts an application, loading the app's code into
memory.
Application Invokes an OS Service:
When the application needs to perform tasks like file access, it requests these
services from the OS. The OS in the VM forwards this request to the hypervisor
since it handles the hardware. The hypervisor interacts with the physical
hardware to perform the action. The hypervisor sends the result back to the OS
in the VM. The OS completes the request and provides the result to the
application.
Execution Modes
1. Hypervisor Mode: The hypervisor operates in this mode, managing the
virtual environment and VMs.
2. Kernel Mode: The OS runs in this mode, with full control over the
hardware and system resources.
3. User Mode: The application runs in this mode, with restricted access to
ensure security.
VM Migration and Live Migration
VM Migration
VM migration is the process of moving a virtual machine (VM) or its
workloads from one physical server to another.
Advantages
Load Balancing:
Maintenance and Upgrades
High Availability
Resource Optimization
Disaster Recovery
Live Migration
Allows migration without completely stopping the VM, minimizing downtime.
Phases of Live Migration:
Phase 1: Pre-Copy:
Entire memory is copied to the new server while the VM continues
running.
Changed memory pages are tracked.
Phase 2: Stop-and-Copy:
VM is temporarily suspended, and modified pages are copied again.
Modern servers efficiently identify changed (dirty) pages.
Phase 3: Post-Copy:
Remaining state (e.g., register contents) is sent to the new server.
The VM resumes execution on the new server.
Key Benefit:
Pre-copying minimizes the time required for suspension and resumption,
ensuring minimal disruption.
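A minimal Python sketch of these three phases (illustrative only; the random
"dirty page" step merely stands in for the VM writing to memory while it keeps
running, which real hypervisors detect through the hardware page tables):

import random

def live_migrate(memory_pages, rounds=3):
    copied = {}                                  # pages already on the new server
    dirty = set(memory_pages)                    # initially, every page must be sent

    # Phase 1: pre-copy - copy pages while the VM keeps running; any page the
    # VM writes to in the meantime becomes dirty again and must be re-sent.
    for _ in range(rounds):
        for page in list(dirty):
            copied[page] = memory_pages[page]
        dirty = {p for p in memory_pages if random.random() < 0.1}   # simulated writes

    # Phase 2: stop-and-copy - briefly suspend the VM and send the last dirty pages.
    for page in dirty:
        copied[page] = memory_pages[page]

    # Phase 3: post-copy - remaining state (e.g., register contents) is sent and
    # the VM resumes on the new server.
    return copied

pages = {i: f"page-{i}" for i in range(8)}
assert live_migrate(pages) == pages              # the new server holds identical memory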
Explain Containers in detail.
Docker Container
A Docker container is a lightweight, standalone, and executable package that
contains everything needed to run an application, including:
The application code.
Runtime libraries and dependencies.
Configuration files.
Docker Container terminology
Docker software components
Docker
Docker is a platform that enables developers to build, deploy, and run
applications in containers—lightweight, isolated environments. It simplifies
application management across different systems by ensuring consistency.
Key Components of Docker
Docker Engine
The core of Docker that manages containerization; it is composed of:
Docker Daemon: Runs on the host machine, managing containers, images,
networks, and volumes.
REST API: Allows tools or applications to interact with the Docker Daemon
programmatically.
Docker CLI (docker): A command-line interface for users to execute
commands and interact with the Docker Daemon.
Docker Images
Read-only templates containing everything needed to run an application
(e.g., code, libraries, tools, configurations).
Serve as the blueprint for creating Docker containers.
Docker Containers
Lightweight, isolated environments created from Docker images.
Encapsulate the application and its dependencies, ensuring consistent
performance across environments.
Docker Hub
A public registry for discovering, sharing, and distributing Docker
images.
Users can push their custom images or pull existing ones to deploy
applications quickly.
Command Line Interface
The CLI allows users to execute commands one at a time and receive
immediate responses.
When a user installs Docker, the installation includes an application
named docker, which provides the necessary CLI functionalities.
To send a command to the Docker daemon (dockerd), users can type
commands in a terminal window. The general syntax is:
docker command [arguments...]
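For example, a typical sequence of commands might look like this (the image
name nginx and container name web are just examples):

docker pull nginx                             # download an image from Docker Hub
docker run -d -p 8080:80 --name web nginx    # start a container from the image
docker ps                                     # list running containers
docker stop web                               # stop the container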
Explain Virtual Storage in detail.
Persistent Storage:
Persistent storage refers to storage that retains data even after power is
removed.
It is the backbone of long-term data retention in computers and data centers.
Two Forms of Persistent Storage:
1. Persistent Storage Devices:
o Devices that physically store data persistently.
2. Persistent Storage Abstractions:
o These are software mechanisms provided by the operating system to make it easier
to interact with physical storage devices. Two key abstractions:
Named Files: data is stored in files identified by name rather than by
physical block numbers.
Hierarchical Directories: files are organized into a tree of directories
(folders) for easy navigation.
Disk Interface Abstraction
Block-Oriented Design:
o Disk devices store and retrieve data in fixed-sized blocks.
o Typical block sizes:
Traditional disks: 512 bytes per block.
Newer disks: 4096 bytes per block (to enhance performance).
o Blocks are identified by block numbers starting at 0.
Operations:
o Read: Retrieve a complete block of data from the disk by specifying the block number.
o Write: Replace the contents of a specific block on the disk by providing the block
number and the new block of data.
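A small Python sketch of this interface (the class name BlockDevice and the
block size chosen are assumptions for illustration, not a real device driver):

BLOCK_SIZE = 4096                       # newer disks use 4096-byte blocks

class BlockDevice:
    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, block_no):
        # Read: retrieve one complete block, identified only by its number.
        return self.blocks[block_no]

    def write_block(self, block_no, data):
        # Write: replace the entire contents of the given block.
        assert len(data) == BLOCK_SIZE, "a disk transfers whole blocks only"
        self.blocks[block_no] = data

disk = BlockDevice(num_blocks=1024)
disk.write_block(0, b"A" * BLOCK_SIZE)
print(disk.read_block(0)[:4])           # b'AAAA'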
File Interface Abstraction
An operating system contains a software module known as a file system that users and applications
use to create and manipulate files.
File Operations:
Open: Gain access to a file and move to the first byte.
Close: Stop using a previously opened file.
Read: Fetch data starting from the current position in the file.
Write: Store data starting at the current position in the file.
Seek: Move to a specific position in an open file.
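These five operations map directly onto the file interface that most languages
expose. For example, in Python (the file name example.txt is arbitrary):

f = open("example.txt", "w+b")    # Open: gain access; position starts at byte 0
f.write(b"hello cloud")           # Write: store data at the current position
f.seek(6)                         # Seek: move to byte 6 of the open file
print(f.read())                   # Read: fetch data from the current position -> b'cloud'
f.close()                         # Close: stop using the file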
Local and Remote Storage
Local storage refers to storage devices that are directly attached to a computer, typically
over an I/O bus.
Remote storage refers to storage devices that are not directly connected to a computer but
are accessible over a computer network.
Types of Remote Storage Systems
1. Byte-Oriented Remote File Access
This type of system allows workstations to interact with remote files at the byte level,
as if the files were stored locally.
In the past, sharing files between computers was difficult and involved copying
files manually.
Byte-oriented systems make it easy to share files through a central server.
How It Works:
A central file server stores the files.
Computers send requests to open, read, write, or close files, and the server does the
work.
Users can work with files stored on the server just like they would with files on their
own computer.
Example:
Network File System (NFS):
o Developed by Sun Microsystems, it allows files on the server to appear like
folders on your computer.
o You can open, edit, and save files without knowing they are on a remote
system.
2. Block-Oriented Remote Disk Access
This system enables workstations to access storage at the block level, handling raw
data blocks instead of complete files.
In the 1980s, some computers didn’t have their own storage (diskless workstations).
They needed to use a server for all their storage needs.
How It Works:
A storage server manages the data.
Computers send requests to read or write small chunks of data (blocks).
The server responds with the data or confirms the data is saved.
Example:
iSCSI (Internet Small Computer System Interface):
o Carries block-level SCSI commands over a TCP/IP network, so a remote
disk appears to the client as if it were locally attached.
Network Attached Storage (NAS) technology:
Network Attached Storage (NAS) is a specialized file storage system connected to a network,
allowing multiple devices and users to access data over the network.
Storage servers come in three main implementations:
1. Host-Based Storage Servers
Host-based storage servers rely on direct-attached storage (DAS) connected to the host
system. These servers are tightly coupled with the operating system and applications, and
they provide local storage to individual systems or servers.
Features:
Limited scalability.
Cost-effective for small-scale applications or single-server environments.
High performance due to the direct connection between the storage and the host
system.
2. Server-Based Storage Servers
Server-based storage servers provide a centralized storage solution that relies on one or more
physical servers to manage and control storage resources.
Key Features:
1. Centralized Management:
2. Scalability:
3. Network-Based Access:
4. Data Redundancy:
5. Multiprotocol Support:
3. Specialized Hardware-Based Storage Servers
Specialized hardware-based storage servers are purpose-built systems designed exclusively to
provide high-performance, reliable, and scalable storage services.
Key Features:
1. Dedicated Hardware for Storage:
2. Advanced Data Management:
3. Optimized Cooling and Power Efficiency:
Storage Area Network (SAN) Technology:
A Storage Area Network (SAN) is a high-performance, block-oriented storage solution used
in data centers. It connects servers and storage devices via a specialized, dedicated network
optimized for storage traffic.
Key Features:
1. Block-Oriented Access:
2. Dedicated Network
3. Traffic Optimization:
4. Durability
Virtual Disks in SANs
SAN servers provide virtual disks to clients instead of directly allocating physical disks.
How Virtual Disks Work
1. Request for Disk Creation:
Initiation: A new entity (e.g., a virtual machine) sends a storage request to the
SAN server.
Details Provided: The request includes a unique client ID and the desired disk
size in blocks.
2. Creation of Virtual Disk Map:
Virtual Disk Map: The SAN server creates a virtual disk map for the
requesting entity.
Mapping: Each entry in the map represents a block in the virtual disk.
3. Mapping Physical Blocks:
The SAN server assigns unused physical blocks from its local disks to the
virtual disk.
A virtual disk map entry links the client's block number to a specific physical
disk and block on that disk.
Example of Virtual Disk Mapping
Client's Block Number    Server's Physical Disk    Block on Disk
0                        27                        21043
1                        83                        8833
2                        91                        77046
3                        15                        90023
o For example, when the client requests block 0, the SAN server retrieves data
from block 21043 on disk 27.
4. Read and Write Operations:
o The SAN server translates the client’s block number using the map and
accesses the corresponding physical disk and block.
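The translation step can be sketched in a few lines of Python (illustrative
only; the dictionary simply reuses the example values from the table above):

virtual_disk_map = {
    # client's block number: (server's physical disk, block on that disk)
    0: (27, 21043),
    1: (83, 8833),
    2: (91, 77046),
    3: (15, 90023),
}

def translate(client_block_no):
    # The SAN server looks up the client's block number and returns the
    # physical location it must actually read or write.
    return virtual_disk_map[client_block_no]

print(translate(0))    # (27, 21043): block 0 maps to block 21043 on disk 27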