Unit 1
1. Cloud Computing: Definition and Components
Definition:
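o A model that provides on-demand network access to a shared pool of configurable
computing resources (networks, servers, storage, applications, services) that can be
provisioned and released rapidly with minimal management effort.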
o Delivers computing as a utility (like water or electricity) on a pay-as-you-go basis.
Key Components:
o Hardware Infrastructure: Servers, storage, and networking devices.
o Virtualization Layer: Hypervisors that create virtual machines.
o Management & Orchestration Software: Tools for resource allocation, monitoring,
and automation.
o Service Delivery Models: Frameworks for offering services (SaaS, PaaS, IaaS).
o Data Centers: Facilities housing the physical infrastructure.
Diagram: Cloud Computing Components
+----------------------+
|      End Users       |
+----------+-----------+
           |
    Internet/Network
           |
+----------v-----------+
|    Cloud Frontend    | <-- Portals, APIs, Web Interfaces
+----------+-----------+
           |
+----------v-----------+
|   Cloud Management   | <-- Provisioning, Billing, Monitoring
+----------+-----------+
           |
+----------v-----------+
|    Infrastructure    | <-- Virtualized Servers, Storage, Network
+----------------------+
2. Characteristics, Advantages, and Disadvantages of Cloud Computing
Characteristics:
o On-Demand Self-Service: Users can provision resources themselves, without human
interaction with the provider.
o Broad Network Access: Accessible over the Internet from various devices.
o Resource Pooling: Shared multi-tenant model with dynamic resource allocation.
o Rapid Elasticity: Scalable resources that can quickly expand or contract.
o Measured Service: Resource usage is monitored and billed on a pay-per-use basis.
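Rapid elasticity can be illustrated with a small scaling rule. This is a minimal sketch only: the function name desired_instances, the 60% target, and the instance limits are assumptions chosen for illustration, not any provider's actual autoscaling policy.

```python
def desired_instances(current: int, cpu_percent: int, target_percent: int = 60,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return the instance count that would bring average CPU near the target."""
    if current < 1 or target_percent < 1:
        raise ValueError("current and target_percent must be positive")
    # Total load in "percent-instances", divided by the target, rounded up
    # (integer ceiling division avoids floating-point rounding surprises).
    needed = -(-(current * cpu_percent) // target_percent)
    # Clamp to the allowed range so the pool never grows unbounded or drops to zero.
    return max(min_instances, min(max_instances, needed))
```

With four instances at 90% CPU the rule expands the pool to 6; at 15% CPU it contracts to 1; and eight instances at 95% CPU are capped at the assumed maximum of 10.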
Advantages:
o Cost Efficiency: Reduces capital expenditure (CapEx) by shifting to operational
expense (OpEx).
o Scalability: Easily adjust resources based on demand.
o Accessibility: Remote access to data and applications.
o Maintenance: Provider handles hardware/software maintenance and updates.
Disadvantages:
o Security Concerns: Data privacy and compliance issues.
o Downtime: Availability depends on both internet connectivity and the provider's uptime.
o Limited Control: Organizations may have less control over infrastructure.
o Vendor Lock-In: Dependency on specific providers’ technologies and policies.
3. Cloud Service Models: SaaS, PaaS, IaaS, and Storage
Software as a Service (SaaS):
o Applications delivered over the Internet.
o Examples: Google Workspace, Salesforce.
o Key Points: No need for local installation, automatic updates, subscription-based.
Platform as a Service (PaaS):
o Provides a platform to develop, test, and deploy applications.
o Examples: Google App Engine, Microsoft Azure App Service.
o Key Points: Offers development tools, middleware, and database management.
Infrastructure as a Service (IaaS):
o Virtualized computing resources over the Internet.
o Examples: Amazon EC2, Microsoft Azure Virtual Machines.
o Key Points: Provides storage, servers, and networking; users manage OS and
applications.
Cloud Storage Services:
o Scalable storage solutions accessed over the Internet.
o Examples: Amazon S3, Google Cloud Storage.
o Key Points: Highly durable, redundant, accessible anytime; billed based on usage.
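The difference between the three service models is largely a question of who manages which layer. The sketch below encodes that split. The five layer names and the exact cut-off points are simplifying assumptions; real providers publish their own, more detailed shared-responsibility models.

```python
# Stack layers from top (closest to the user) to bottom (physical hardware).
LAYERS = ["application", "runtime", "operating_system", "virtualization", "hardware"]

# Index of the first layer the provider manages; everything above it is the customer's.
PROVIDER_STARTS_AT = {"IaaS": 3, "PaaS": 1, "SaaS": 0}


def customer_managed(model: str) -> list:
    """Layers the customer is responsible for under the given service model."""
    return LAYERS[:PROVIDER_STARTS_AT[model]]


def provider_managed(model: str) -> list:
    """Layers the cloud provider is responsible for under the given service model."""
    return LAYERS[PROVIDER_STARTS_AT[model]:]
```

Under IaaS the customer keeps the application, runtime, and operating system, which matches the "users manage OS and applications" point above. Under SaaS the customer-managed list is empty.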
4. Cloud Computing Reference Model
Layers of the Reference Model:
o Hardware/Infrastructure Layer: Physical servers, storage, network.
o Virtualization Layer: Abstracts hardware resources.
o Resource Management Layer: Manages provisioning, load balancing, and scaling.
o Service Layer: Offers SaaS, PaaS, and IaaS.
o User/Client Layer: Interfaces for end users (web portals, mobile apps).
Diagram: Cloud Reference Model
+----------------------+
|     User/Client      | <-- End User Applications
+----------+-----------+
           |
+----------v-----------+
|    Service Layer     | <-- SaaS, PaaS, IaaS
+----------+-----------+
           |
+----------v-----------+
| Resource Management  | <-- Provisioning, Monitoring
+----------+-----------+
           |
+----------v-----------+
| Virtualization Layer | <-- Hypervisors, Containers
+----------+-----------+
           |
+----------v-----------+
| Hardware/Infra Layer | <-- Servers, Storage, Networks
+----------------------+
5. Cloud Deployment Models
Public Cloud:
o Description: Services offered over the public internet by third-party providers.
o Advantages: Economies of scale, cost-effective, high scalability.
o Examples: AWS, Microsoft Azure, Google Cloud.
Private Cloud:
o Description: Cloud infrastructure dedicated to a single organization.
o Advantages: Greater control, enhanced security.
o Examples: On-premises data centers or hosted private clouds.
Hybrid Cloud:
o Description: Combination of public and private clouds.
o Advantages: Flexibility, optimized resource usage, balancing security and scalability.
Community Cloud:
o Description: Shared by several organizations with common requirements.
o Advantages: Cost-sharing, tailored security and compliance.
Diagram: Cloud Deployment Models
+----------------------+
|     Public Cloud     |
+----------+-----------+
           |
           v
+----------------------+
|     Hybrid Cloud     | <-- Combines Public & Private
+----------------------+
           ^
           |
+----------+-----------+
|    Private Cloud     |
+----------------------+

+----------------------+
|   Community Cloud    |
+----------------------+
6. Identity Management as a Service (IDaaS)
Definition:
o Cloud-based solutions for identity and access management.
Key Features:
o Single Sign-On (SSO): One login for multiple applications.
o Multi-Factor Authentication (MFA): Additional security layers.
o User Provisioning/De-provisioning: Automated account management.
o Federated Identity: Integration with external identity providers.
Benefits:
o Reduced IT overhead, enhanced security, centralized management.
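One common second factor for MFA is a time-based one-time password (TOTP). The notes above do not name a specific MFA method, so treating TOTP as the factor is an assumption. The sketch follows the RFC 6238 algorithm with HMAC-SHA1 and uses only the Python standard library; the function name totp is illustrative.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1 variant)."""
    # Both sides derive the same counter from the current time window.
    counter = unix_time // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the requested number of decimal digits, left-padded with zeros.
    return str(code % 10 ** digits).zfill(digits)
```

The server and the user's authenticator app share the secret; a login succeeds only if the code the user types matches the code the server computes for the current time window. In practice the secret is usually exchanged as a Base32 string, which this sketch leaves out.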
7. Cloud Storage Providers
Major Providers:
o Amazon S3: Scalable object storage, high durability.
o Google Cloud Storage: Unified object storage, low latency.
o Microsoft Azure Blob Storage: Secure, cost-effective storage for unstructured data.
o IBM Cloud Object Storage: High availability and reliability.
Common Attributes:
o Scalability, data replication, accessibility via APIs, pay-per-use pricing.
8. Cloud System Architecture
Key Architectural Layers:
o Hardware Layer: Physical servers, storage, networking.
o Virtualization Layer: Abstracts physical resources into virtual machines or containers.
o Middleware/Management Layer: Orchestrates resource provisioning, load
balancing, and scaling.
o Application Layer: Provides cloud services (SaaS, PaaS, IaaS) to end users.
Key Characteristics:
o Multi-Tenancy: Multiple customers (tenants) share the same physical resources while staying logically isolated.
o Fault Tolerance: Data replication and failover mechanisms.
o Scalability: Dynamic resource allocation based on demand.
9. Cloud Economics
Cost Efficiency:
o Shifts from capital expenditure (CapEx) to operational expenditure (OpEx).
o Pay-as-You-Go: Only pay for the resources you use.
Return on Investment (ROI):
o Lower upfront costs and flexible scaling can shorten the time to a positive ROI.
Total Cost of Ownership (TCO):
o Reduced IT maintenance and hardware replacement costs.
Economies of Scale:
o Providers benefit from large-scale operations and pass savings to customers.
Challenges:
o Unpredictable costs if not properly managed; potential hidden fees.
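The CapEx versus OpEx trade-off can be made concrete with a break-even calculation. All figures and function names here are hypothetical and amounts are in cents to keep the arithmetic exact. The sketch ignores discounts, data-transfer fees, and staffing costs.

```python
from typing import Optional


def on_prem_cost(months: int, upfront: int, monthly_maintenance: int) -> int:
    """Cumulative on-premises cost: one-off purchase (CapEx) plus upkeep."""
    return upfront + months * monthly_maintenance


def cloud_cost(months: int, hourly_rate: int, hours_per_month: int) -> int:
    """Cumulative pay-as-you-go cost (OpEx): pay only for hours actually used."""
    return months * hourly_rate * hours_per_month


def break_even_month(upfront: int, monthly_maintenance: int, hourly_rate: int,
                     hours_per_month: int, horizon: int = 120) -> Optional[int]:
    """First month in which cumulative cloud spend reaches on-prem spend, if any."""
    for month in range(1, horizon + 1):
        if cloud_cost(month, hourly_rate, hours_per_month) >= on_prem_cost(
                month, upfront, monthly_maintenance):
            return month
    return None
```

For an assumed server costing 1,200,000 cents upfront with 10,000 cents of monthly maintenance, a cloud instance at 50 cents per hour running 720 hours a month reaches break-even in month 47. At 200 hours a month the cloud never becomes the more expensive option within the 120-month horizon, which is why usage patterns matter as much as list prices.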
10. Involvement of Cloud Computing in Organizations
Integration Types:
o Core Business Applications: CRM, ERP, and SCM systems hosted in the cloud.
o Collaboration & Communication: Cloud-based email, file sharing, and conferencing.
o Development & Testing: PaaS/IaaS environments for rapid application development.
o Disaster Recovery & Backup: Cloud storage for data recovery.
Benefits:
o Enhanced agility, cost savings, and improved scalability.
Considerations:
o Security policies, compliance regulations, and vendor lock-in risks.
11. Role of Networking in Cloud Computing
Connectivity:
o Provides the backbone for accessing cloud services over the Internet.
Performance:
o Load Balancing: Distributes incoming traffic across multiple servers so no single server is overloaded.
o Latency Reduction: Optimizes data transfer routes.
Security:
o Implements firewalls, VPNs, and encryption to protect data in transit.
Virtual Networking:
o Software-Defined Networking (SDN) for dynamic resource allocation and network
management.
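Load balancing, mentioned under Performance above, can be sketched with the simplest strategy, round robin. The class name and backend names are illustrative assumptions; production balancers also track server health and current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next backend in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def route(self, request_id) -> str:
        # The request itself is ignored; only the arrival order matters.
        return next(self._rotation)
```

With three backends, six consecutive requests are spread as a, b, c, a, b, c, so each server receives the same share of traffic.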
12. Seven-Step Model of Migration into the Cloud
1. Assessment:
o Evaluate existing infrastructure, applications, and business needs.
2. Planning:
o Define migration strategy, timelines, and risk management.
3. Pilot Testing:
o Migrate a small set of applications to validate the process.
4. Migration:
o Gradually transfer data and applications to the cloud.
5. Integration:
o Ensure seamless connectivity between on-premises and cloud systems.
6. Optimization:
o Fine-tune performance, cost, and security post-migration.
7. Governance & Management:
o Monitor, manage, and continuously improve cloud operations.
Unit 2: Cloud File Systems and Data Storage
1. Cloud File System with Architectures
Definition:
o Distributed file systems designed to store, manage, and access large volumes of data
over clusters.
Key Concepts:
o Metadata Management: File metadata is kept separate from file data and is managed by a central master node (as in GFS/HDFS).
o Fault Tolerance: Data replication and error recovery.
o Scalability: Horizontal scaling to support large data sets.
Example Architectures:
o Google File System (GFS) / HDFS:
Master/NameNode: Manages metadata.
Chunk/Data Nodes: Store actual data blocks.
Diagram: Simplified Cloud File System
+------------------+
|      Client      |
+--------+---------+
         |
+--------v---------+
|     Master/      | <-- Metadata Management
|     NameNode     |
+--------+---------+
         |
+--------v---------+
|    Chunk/Data    | <-- Data Storage Nodes
|      Nodes       |
+------------------+
2. HBase Data Model
Overview:
o A NoSQL, column-family-oriented (wide-column) database built on top of HDFS.
Key Characteristics:
o Tables: Consist of rows and column families.
o Column Families: Group related columns stored together.
o Schema Flexibility: Sparse data storage; columns can vary per row.
o Versioning: Stores multiple versions of a cell’s data.
Data Structure:
o Row Key: Unique identifier for rows.
o Column Qualifier: Specifies the column within a family.
o Timestamp: Used for version control.
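The data model above can be pictured as a nested map: row key, then column family and qualifier, then timestamped versions. The toy class below is an in-memory illustration only. It does not use the HBase client API, and the class name and the max_versions value of 3 are assumptions.

```python
from collections import defaultdict


class ToyTable:
    """In-memory imitation of an HBase-style table (not the real client API)."""

    def __init__(self, column_families, max_versions: int = 3):
        self.column_families = set(column_families)  # fixed when the table is created
        self.max_versions = max_versions
        # row key -> {(family, qualifier): [(timestamp, value), ...] newest first}
        self._rows = defaultdict(dict)

    def put(self, row_key, family, qualifier, value, timestamp: int) -> None:
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        versions = self._rows[row_key].setdefault((family, qualifier), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda cell: cell[0], reverse=True)
        del versions[self.max_versions:]  # discard versions beyond the limit

    def get(self, row_key, family, qualifier, versions: int = 1):
        cells = self._rows.get(row_key, {}).get((family, qualifier), [])
        return cells[:versions]
```

Qualifiers are created on first write, so different rows can hold different columns (sparse storage), and an unwritten cell simply returns an empty list. Writing the same cell twice keeps both values, with the newest timestamp returned first.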
3. Cloud File System GFS/HDFS
Google File System (GFS):
o Architecture: Master-slave design; master handles metadata, slaves (chunk servers)
store data.
o Fault Tolerance: Data replicated across multiple servers.
Hadoop Distributed File System (HDFS):
o Architecture: NameNode (metadata) and DataNodes (data blocks).
o Block Replication: Typically three copies for reliability.
Diagram: GFS/HDFS Architecture
+------------------+
|      Client      |
+--------+---------+
         |
+--------v---------+
| Master/NameNode  |
+--------+---------+
         |
+--------v---------+
| Data/Chunk Nodes |
+------------------+
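The read path implied by this architecture can be sketched as follows: the client asks the master/NameNode where a file's blocks live, then fetches each block from a node that holds a replica. The classes and method names are hypothetical stand-ins for illustration, not the GFS or HDFS APIs.

```python
class NameNode:
    """Holds metadata only: which blocks make up a file and where replicas live."""

    def __init__(self):
        self._block_map = {}  # path -> [(block_id, [datanode names]), ...]

    def register(self, path, blocks):
        self._block_map[path] = list(blocks)

    def locate(self, path):
        return self._block_map[path]


class DataNode:
    """Holds the actual block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes


def read_file(path, namenode, live_datanodes):
    """Reassemble a file, trying each replica until a live one is found."""
    data = b""
    for block_id, replicas in namenode.locate(path):
        for node_name in replicas:
            node = live_datanodes.get(node_name)
            if node is not None and block_id in node.blocks:
                data += node.blocks[block_id]
                break
        else:
            # No replica of this block could be reached.
            raise IOError(f"no live replica for block {block_id}")
    return data
```

Because every block is listed with several replica locations, the read still succeeds when one data node is missing from live_datanodes. It fails only when all replicas of some block are unreachable.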
4. Working of Cloud Data Store
Functionality:
o Provides scalable, high-availability storage for structured/unstructured data.
Mechanisms:
o Replication: Data is copied across nodes.
o Sharding: Data is partitioned to distribute load.
o Consistency Models: Options range from strong to eventual consistency.
Applications:
o Often used for NoSQL databases and large-scale data processing.
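Sharding and replication from the Mechanisms list can be combined in a few lines. This is a minimal sketch assuming hash-based sharding and "next nodes in the list" replica placement; the function names are illustrative, and real data stores often use consistent hashing or range partitioning instead.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard using a stable hash, so every client agrees."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def replica_nodes(shard: int, nodes: list, replication_factor: int) -> list:
    """Place a shard on consecutive nodes, wrapping around the node list."""
    copies = min(replication_factor, len(nodes))
    return [nodes[(shard + i) % len(nodes)] for i in range(copies)]
```

A stable hash such as SHA-256 is used rather than Python's built-in hash(), whose string hashes vary between processes. For four nodes n0 to n3 and a replication factor of 3, shard 3 is stored on n3, n0, and n1.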
5. Differences: Cloud File System vs. Normal File System
6. AFS (Andrew File System) Architecture
Overview:
o A distributed file system that supports large-scale file sharing.
Key Features:
o Client-Server Model: Clients cache files locally.
o Volumes: Logical grouping of files that can be moved between servers.
o Cache Consistency: Servers use callbacks to notify clients when a cached file has changed.
o Security: Built-in authentication and access control.
7. Difference Between SAN and NAS
8. HDFS Architecture in Detail
Components:
o NameNode: Centralized metadata manager.
o DataNodes: Store actual data blocks.
o Secondary NameNode: Assists in checkpointing metadata (not a backup for
NameNode).
Key Features:
o Block Storage: Files are split into blocks (default 128 MB in Hadoop 2.x and later).
o Replication: Each block is replicated (default factor is 3).
o Fault Tolerance: Automatic re-replication of lost blocks.
Diagram: HDFS Architecture
+------------------+
|      Client      |
+--------+---------+
         |
+--------v---------+
|     NameNode     | <-- Metadata
+--------+---------+
         |
+--------v---------+
|    DataNodes     | <-- Data Blocks
+------------------+
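The block size and replication defaults translate into simple arithmetic. The helper names below are illustrative assumptions. The sketch relies on the fact that the final block of a file occupies only as much space as the data it holds.

```python
MIB = 1024 * 1024
DEFAULT_BLOCK_SIZE = 128 * MIB  # default block size from the notes above


def block_sizes(file_size: int, block_size: int = DEFAULT_BLOCK_SIZE) -> list:
    """Sizes of the blocks a file is split into; the last block may be smaller."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


def raw_storage(file_size: int, replication: int = 3) -> int:
    """Total bytes consumed across the cluster once every block is replicated."""
    return file_size * replication
```

A 300 MiB file is stored as three blocks of 128, 128, and 44 MiB. With the default replication factor of 3 it consumes 900 MiB of raw cluster capacity.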
9. Data Storage Types: EDS, DAS, SAN, and NAS
Enterprise Data Storage (EDS):
o Overview: Centralized, enterprise-grade storage solutions.
o Characteristics: High capacity, performance, and reliability.
Direct Attached Storage (DAS):
o Overview: Storage directly attached to a computer or server.
o Characteristics: Fast access but limited scalability and sharing.
Storage Area Network (SAN):
o Overview: Dedicated high-speed network that provides block-level storage.
o Characteristics: High performance, supports mission-critical applications.
Network Attached Storage (NAS):
o Overview: File-level storage accessed over standard networks.
o Characteristics: Easy file sharing, scalable file storage.
10. Data Intensive Technologies for Cloud Computing
MapReduce Framework:
o Purpose: Processes large datasets in parallel.
o Key Elements: Map tasks (processing) and Reduce tasks (aggregation).
Hadoop Ecosystem:
o Includes: HDFS, YARN, HBase, Spark.
o Purpose: Enables scalable data storage and processing.
NoSQL Databases:
o Examples: Cassandra, MongoDB, HBase.
o Purpose: Handle unstructured and semi-structured data with high throughput.
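The Map and Reduce tasks described above can be illustrated with the classic word-count example. This is a single-process sketch of the programming model, not Hadoop code; in a real cluster the map and reduce calls run in parallel on different nodes, and the function names here are illustrative.

```python
from collections import defaultdict


def map_phase(document: str):
    """Map task: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce task: aggregate all values seen for one key."""
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

For the two inputs "cloud storage" and "Cloud compute", the result counts "cloud" twice and the other two words once each.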
11. Distributed Data Storage
Concept:
o Data is stored across multiple nodes/servers.
Advantages:
o Scalability: Easily add more nodes.
o Fault Tolerance: Data remains available despite node failures.
o Performance: Parallel access improves data retrieval speeds.
Techniques:
o Sharding: Partitioning data into smaller chunks.
o Replication: Duplicating data across different nodes.
o Consistency Models: Manage data accuracy across distributed systems.
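One widely used way to tune consistency in a replicated store is a quorum rule: with N replicas, a write acknowledged by W of them and a read that consults R of them must overlap in at least one replica when R + W > N. The notes above do not name this technique, so it is offered here as one example of a consistency mechanism; the function names are assumptions.

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True if every read quorum shares at least one replica with every write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


def read_latest(responses):
    """Pick the newest value among (version, value) pairs returned by R replicas."""
    return max(responses, key=lambda response: response[0])[1]
```

With N = 3, choosing W = 2 and R = 2 guarantees that a read sees the latest acknowledged write. W = 1 and R = 1 is faster but only eventually consistent, because a read may reach a replica that has not yet received the write.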