CLOUD COMPUTING
Mr. SAYYED NAGULMEERA
Assistant Professor of
Computer Science & Engineering
Nimra College of Engineering & Technology
PREREQUISITES
Some of the prerequisites for learning cloud computing are:
• Having basic knowledge of operating systems like Windows OS, Linux etc.
• Basic knowledge of Computers, Internet, Database and Networking concepts.
Cloud Computing/ Unit-1 2
SYLLABUS
UNIT -I: Systems modeling, Clustering and virtualization
Scalable Computing over the Internet-The Age of Internet Computing, Scalable computing over
the internet, Technologies for Network Based Systems, System models for Distributed and Cloud
Computing, Performance, Security and Energy Efficiency
UNIT- II: Virtual Machines and Virtualization of Clusters and Data Centers
Implementation Levels of Virtualization, Virtualization Structures/ Tools and Mechanisms,
Virtualization of CPU, Memory and I/ O Devices, Virtual Clusters and Resource Management,
Virtualization for Data-Center Automation.
UNIT- III: Cloud Platform Architecture
Cloud Computing and Service Models, Public Cloud Platforms, Service Oriented Architecture,
Programming on Amazon AWS and Microsoft Azure
UNIT- IV: Cloud Resource Management and Scheduling
Policies and Mechanisms for Resource Management Applications of Control Theory to Task
Scheduling on a Cloud, Stability of a Two-Level Resource Allocation Architecture, Feedback
Control Based on Dynamic Thresholds. Coordination of Specialized Autonomic Performance
Managers, Resource Bundling, Scheduling Algorithms for Computing Clouds, Fair Queuing, Start
Time Fair Queuing
UNIT- V: Storage Systems
Evolution of storage technology, storage models, file systems and database, distributed file
systems, general parallel file systems. Google file system.
TEXT BOOKS:
1. Distributed and Cloud Computing, Kai Hwang, Geoffry C. Fox, Jack J. Dongarra MK Elsevier.
2. Cloud Computing, Theory and Practice, Dan C Marinescu, MK Elsevier.
What is Cloud?
The term "cloud" refers to a network or the Internet; in other words, the cloud is
something present at a remote location.
The cloud can provide services over public and private networks, i.e., WAN, LAN, or
VPN.
Why Cloud Computing
• Traditionally, an organization needed its own server room containing a database
server, mail server, networking equipment, firewalls, routers, modems, switches,
configurable systems, a high-speed network connection, and maintenance engineers.
• Establishing such IT infrastructure requires spending a lot of money. To
overcome all these problems and to reduce IT infrastructure cost,
cloud computing came into existence.
Cloud Computing Architecture
• It can be divided into two parts: the front end
and the back end.
The front end refers to the client part of
cloud computing; it consists of interfaces and
applications.
Ex: web browsers, thin clients, mobiles,
tablets.
The back end refers to the cloud itself; it consists of
data storage, security mechanisms, virtual
machines, servers, networks, etc.
Types of Cloud/Deployment Models: There are the following three types of cloud that
you can deploy according to an organization's needs -
-Public cloud
-Private cloud
-Hybrid Cloud
Public Cloud: the cloud infrastructure is made available to the general public over the
Internet and is owned by a cloud provider.
-Private cloud: the cloud infrastructure is operated exclusively for a single organization.
• It can be managed by the organization itself or by a third party.
• It may exist on-premise or off-premise.
• Providers: AWS, VMware.
-Hybrid Cloud: it combines the functionality of both public and private clouds.
• Here, sensitive data is stored in the private cloud, and non-sensitive data is
stored in the public cloud.
• Ex: a federal agency might use this type of cloud.
Types of Service models: There are the following three types of cloud service models -
1. Infrastructure as a Service (IaaS)
2. Platform as a Service (PaaS)
3. Software as a Service (SaaS)
Infrastructure as a Service (IaaS): it provides the basic computing infrastructure.
• The services are available on a "pay-for-what-you-use" model.
• Users: IT administrators.
• Providers: AWS, Azure, Google Cloud, etc.
Platform as a Service (PaaS): it provides cloud platforms and a runtime environment for
developing, testing, and managing applications.
Users: software developers.
Providers: Google App Engine, AWS Elastic Beanstalk, Heroku, Windows Azure, etc.
Software as a Service (SaaS): here, cloud providers host and manage the software application on a
pay-as-you-go pricing model.
-Software and hardware are provided and managed by the vendor, so you don't have to maintain anything.
Users: end customers.
Examples: Google Apps, Salesforce, Dropbox, Zendesk, Cisco WebEx, Slack, and GoToMeeting.
Cloud Computing
Cloud computing means on-demand delivery/ availability of computer system
resources over the internet on a pay-as-you-go basis.
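The pay-as-you-go idea can be shown with a small sketch. The instance types and hourly rates below are hypothetical, invented for illustration, not any provider's real pricing:

```python
# Toy pay-as-you-go billing: you pay only for the hours you actually use,
# instead of buying and maintaining the hardware up front.
# Rates and instance names are hypothetical.
RATES_PER_HOUR = {
    "small": 0.05,   # $/hour
    "large": 0.20,   # $/hour
}

def monthly_bill(usage):
    """usage: list of (instance_type, hours_used) tuples."""
    return sum(RATES_PER_HOUR[kind] * hours for kind, hours in usage)

# 100 hours on a small instance plus 10 hours on a large one:
bill = monthly_bill([("small", 100), ("large", 10)])
print(round(bill, 2))  # 7.0
```

If no resources are used in a month, the bill is simply zero; contrast this with the fixed cost of owning a server room.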
DISTRIBUTED COMPUTING
• Distributed computing is a computing concept that, in its most general
sense, refers to multiple computer systems working on a single
problem.
• In distributed computing, a single problem is divided into many parts,
and each part is solved by different computers.
• As long as the computers are networked, they can communicate with
each other to solve the problem. If done properly, the computers
perform like a single entity.
SERIAL COMPUTING
• A problem is broken into a discrete series of instructions
• Instructions are executed sequentially one after another
• Executed on a single processor
• Only one instruction may execute at any moment in time
PARALLEL COMPUTING
• Parallel computing is a type of computation in which many
calculations are carried out simultaneously, operating on the principle that
large problems can often be divided into smaller ones, which are then
solved at the same time.
• In the simplest sense, parallel computing is the simultaneous use of
multiple compute resources to solve a computational problem:
• A problem is broken into discrete parts that can be solved concurrently
• Each part is further broken down to a series of instructions
• Instructions from each part execute simultaneously on different
processors
• An overall control/coordination mechanism is employed
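The steps above can be sketched in Python. The example below breaks one problem (summing a list) into discrete parts, runs the parts concurrently on a pool of workers, and combines the partial results. This is a minimal sketch using a thread pool; for CPU-bound work a process pool would normally be used instead, but the decomposition pattern is the same:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Each "part" of the problem is a series of instructions over one chunk.
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Break the problem into discrete parts that can be solved concurrently.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Instructions from each part execute concurrently on different workers;
    # the executor acts as the overall control/coordination mechanism.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum(list(range(1, 101))))  # 5050
```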
• The main difference between parallel and distributed computing is
that parallel computing allows multiple processors to execute tasks
simultaneously while distributed computing divides a single task
between multiple computers to achieve a common goal.
GRID COMPUTING
• Grid computing is a processor architecture that combines computer resources
from various domains to reach a main objective. In grid computing, the
computers on the network can work on a task together, thus functioning as a
supercomputer.
• Grid computing is the practice of leveraging multiple computers, often
geographically distributed but connected by networks, to work together to
accomplish joint tasks. It is typically run on a “data grid,” a set
of computers that directly interact with each other to coordinate jobs.
History of Cloud Computing
• Client/Server computing
• distributed computing
• Around 1961, John McCarthy suggested in a speech at MIT
that computing could be sold like a utility, just like water or
electricity.
• In 1999, Salesforce.com started delivering applications to users through a
simple website. (SaaS)
• In 2002, Amazon started Amazon Web Services, providing services like
storage, computation and even human intelligence.
• In 2009, Google Apps also started to provide cloud computing enterprise
applications.
• In 2009, Microsoft launched Windows Azure, and companies like Oracle
and HP have all joined the game. This proves that today, cloud computing
has become mainstream.
Scalable Computing over the Internet
• The Age of Internet Computing
–The Platform Evolution
–High-Performance Computing
–High-Throughput Computing
–Three New Computing Paradigms
–Computing Paradigm Distinctions
• Centralized computing
• Parallel computing
• Distributed computing
• Cloud computing
–Distributed System Families
• Scalable Computing Trends and New Paradigms
The Age of Internet Computing
– High-Performance Computing
– High-Throughput Computing
The Platform Evolution
• Computer technology has gone through five generations of development:
• From 1950 to 1970, a handful of mainframes (IBM 360, CDC 6400) served business and government organizations.
• From 1960 to 1980, lower-cost minicomputers (e.g., the DEC VAX) appeared in small businesses and on college campuses.
• From 1970 to 1990, personal computers built with VLSI microprocessors came into widespread use.
• From 1980 to 2000, massive numbers of portable computers appeared (wired and wireless
applications).
• Since 1990, both HPC and HTC systems have been in use (hidden in cluster, grid, and cloud computing).
Figure (Evolutionary trend toward parallel, distributed, and cloud computing with clusters,
Massively parallel processing (MPPs), P2P networks, grids, clouds, web services, and the
Internet of Things)
High-Performance Computing
• HPC systems emphasize raw speed performance.
• The speed of HPC systems has increased from Gflops (gigaflops: one billion, 10^9, floating-point
operations per second) in the early 1990s to Pflops (petaflops: 10^15) in 2010.
• This improvement was driven mainly by demands from the scientific, engineering, and
manufacturing communities.
High-Throughput Computing
• The development of market-oriented high-end computing systems is undergoing a strategic
change from an HPC paradigm to an HTC paradigm.
• This HTC paradigm pays more attention to high-flux computing.
• The main application for high-flux computing is in Internet searches and web services.
• The performance goal thus shifts to measure high throughput, or the number of tasks
completed per unit of time.
Three New Computing Paradigms
• Radio-frequency identification (RFID):
• Global Positioning System (GPS):
• Internet of Things (IoT):
Radio-frequency identification (RFID):Radio-frequency identification uses
electromagnetic fields to automatically identify, and track tags attached to
objects.
Global Positioning System (GPS): is a satellite-based radio navigation system
Internet of Things (IoT): describes the network of physical objects ("things")
that are embedded with sensors, software, and other technologies for the
purpose of connecting and exchanging data with other devices and systems
over the Internet.
Computing Paradigm Distinctions
• Centralized Computing
• All computer resources are centralized in one physical system.
• Parallel Computing
• All processors are either tightly coupled with centralized shared memory or loosely
coupled with distributed memory.
• Distributed Computing
• A distributed system consists of multiple autonomous computers, each with its own
private memory, communicating over a network.
• Cloud Computing
• An Internet cloud of resources that may be either centralized or decentralized. The
cloud applies to parallel computing, distributed computing, or both. Clouds may be built from
physical or virtualized resources.
Distributed System Families
• Both HPC and HTC systems emphasize parallelism and distributed
computing.
• Future HPC and HTC systems must be able to satisfy this huge
demand in computing power in terms of throughput, efficiency,
scalability, and reliability.
Distributed System Families
• System efficiency is decided by speed, programming, and throughput.
Meeting these goals requires satisfying the following design objectives:
• Efficiency: decided by speed, programming, and the achievement of throughput
demands.
• Dependability: measures reliability from the chip to the system at
different levels. The main purpose here is to provide good QoS (Quality of
Service).
• Adaptation: measures the ability to support huge numbers of job
requests over massive data sets and virtualized cloud resources under
different models.
• Flexibility: the ability of distributed systems to run well in both
HPC (science/engineering) and HTC (business) applications.
Degrees of Parallelism(DoP)
• Bit-level parallelism (BLP) converts bit-serial processing to word-level
processing gradually.
• Instruction-level parallelism (ILP), in which the processor executes multiple
instructions simultaneously rather than only one instruction at a time.
• Data-level parallelism (DLP): instructions from a single stream operate
concurrently on several data elements (it focuses on distributing the data across different
nodes, which operate on the data in parallel).
• Task-level parallelism (TLP) Multiple threads or instruction sequences from the
same application can be executed concurrently.
• Job-level parallelism (JLP) is also exploited within a single computer by
treating a job or several jobs as a collection of independent tasks. It is fair to
say that coarse-grain parallelism is built on top of fine-grain parallelism.
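Task-level parallelism, for instance, can be sketched by running two independent tasks from the same application concurrently. This is a minimal Python illustration (real systems exploit TLP at the hardware-thread level; the function names here are invented):

```python
from concurrent.futures import ThreadPoolExecutor

# Two independent tasks from the same application (task-level parallelism):
def count_words(text):
    return len(text.split())

def count_chars(text):
    return len(text)

text = "task level parallelism runs independent tasks concurrently"
with ThreadPoolExecutor(max_workers=2) as pool:
    # Both tasks are submitted at once and execute concurrently
    # on separate threads; neither depends on the other's result.
    words = pool.submit(count_words, text)
    chars = pool.submit(count_chars, text)
    print(words.result(), chars.result())  # 7 58
```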
Innovative Applications
• Both HPC and HTC systems desire transparency in many application aspects.
• For example, data access, resource allocation, process location, concurrency
in execution, job replication, and failure recovery should be made
transparent to both users and system management.
• These applications spread across many important domains in science,
engineering, business, education, health care, traffic control, Internet and
web services, military, and government applications.
The Trend toward Utility Computing
• Utility computing, or The Computer Utility, is a service provisioning model in
which a service provider makes computing resources and infrastructure
management available to the customer as needed, and charges them for
specific usage rather than a flat rate.
• Utility computing focuses on a business model in which customers receive
computing resources from a paid service provider. All cloud platforms are
regarded as utility service providers.
Cloud Computing/ Unit-1 42
• Figure: The vision of computer utilities in modern distributed computing systems.
2. Technologies for Network-Based Systems
Multi-core CPUs and Multithreading Technologies:
• Over the last 30 years the speed of the chips and their capacity to handle
variety of jobs has increased at an exceptional rate.
• This is crucial to both HPC and HTC system development.
• The processor speed is measured in MIPS (millions of instructions per second)
and the utilized network bandwidth is measured in Mbps or Gbps.
Advances in CPU Processors
33 year Improvement in Processor and Network Technologies
Figure: Schematic of a modern multicore CPU chip using a hierarchy of caches, where the L1 cache is private to each core,
the on-chip L2 cache is shared, and the L3 cache or DRAM is off-chip.
Multi-core CPU: A multi-core processor is a single computing component with
two or more independent actual processing units (called "cores"), which are units
that read and execute program instructions.
(Ex: add, move data, and branch).
The multiple cores can run multiple instructions at the same time, increasing
overall speed for programs open to parallel computing.
Multicore CPU and Many-Core GPU Architectures
• Graphics processing units (GPUs) that adopt a many-core architecture
with hundreds to thousands of simple cores.
Many-core GPU: (Graphics Processing Unit) Many-core processors are
specialist multi-core processors designed for a high degree of parallel
processing, containing a large number of simpler, independent processor
cores.
• Many-core processors are used extensively in embedded computers and
high-performance computing. (Main frames, super computers)
• Fine-grained SIMD: deals with much smaller components, out of which the
larger components are composed.
• Here, a program is broken into a large number of small tasks.
• Fine-grained SIMD has a much higher level of parallelism than coarse-grained SIMD.
• Fine-grained SIMD has less computation time per task than a coarse-grained architecture.
• Coarse-grained SIMD: the size of the components is much larger than the fine-
grained subcomponents of a system.
• Here, a program is broken into a small number of large tasks.
• Coarse-grained SIMD has a lower level of parallelism than fine-grained SIMD.
• Coarse-grained SIMD has more computation time per task than a fine-grained
architecture.
Multithreading Technology
• A thread is a single sequential stream of execution within a process.
• Threads are also known as lightweight processes.
• In computer science, a thread of execution is the smallest sequence of
programmed instructions that can be managed independently by a scheduler,
which is typically a part of the operating system.
Multithreading Technology
• A technique by which a single set of code can be used by several processors at
different stages of execution.
• In a conventional superscalar processor, only instructions from the same thread are executed.
• Fine-grain multithreading switches between instructions from different
threads every cycle.
• Coarse-grain multithreading executes many instructions from the same thread
for quite a few cycles before switching to another thread.
• A multicore CMP (chip multiprocessor) executes instructions from different
threads on different cores.
• SMT (simultaneous multithreading) allows simultaneous scheduling of
instructions from different threads in the same cycle.
5 Micro-architectures of CPUs
Each row represents the issue slots for a single execution cycle:
• A filled box indicates that the processor found an instruction to execute in that issue slot on that cycle;
• An empty box denotes an unused slot.
GPU Computing to Exascale and Beyond
• Exascale is a thousandfold increase in computing power from petascale.
• A GPU is a graphics coprocessor or accelerator mounted on a
computer’s graphics card or video card.
How GPUs Work
• Early GPUs functioned as coprocessors attached to the CPU.
• Today, the NVIDIA GPU has been upgraded to 128 cores on a single chip.
• Each core on a GPU can handle eight threads of instructions.
GPU Programming Model
• Figure shows the interaction between a CPU and GPU in performing parallel
execution of floating-point operations concurrently.
Figure: The use of a GPU along with a CPU for massively parallel execution in hundreds or thousands of processing cores.
Power Efficiency of the GPU
• Bill Dally of Stanford University considers power and massive parallelism as
the major benefits of GPUs over CPUs for the future.
• By extrapolating current technology and computer architecture, it was
estimated that 60 Gflops/watt per core is needed to run an exaflops (10^18)
system.
• Dally has estimated that a CPU chip consumes about 2 nanojoules
per instruction, while a GPU chip requires only about 200 picojoules per instruction,
one-tenth that of the CPU.
• The CPU is optimized for latency in caches and memory, while the GPU is
optimized for throughput with explicit management of on-chip memory.
The GPU performance (middle line, measured at 5 Gflops/W/core in 2011), compared with the lower CPU performance (lower
line, measured at 0.8 Gflops/W/core in 2011) and the 60 Gflops/W/core performance estimated to be needed in the future
for an exascale (EF, upper curve) system.
Memory, Storage, and Wide-Area Networking
• Memory Technology
• Disks and Storage Technology
• System-Area Interconnects
• Wide-Area Networking
Memory Technology
• DRAM chip capacity has grown from 16 KB to 64 GB.
• [SRAM is static RAM; it is "static" because the memory does not have to
be continuously refreshed like dynamic RAM.
• SRAM is faster but also more expensive, and is used inside the CPU. The
traditional RAM in computers is all DRAM.]
Disks and Storage Technology
• Beyond 2011, disks or disk arrays have exceeded 3 TB in capacity. The
lower curve in the figure shows disk storage growing by seven orders of
magnitude in 33 years.
Memory Technology
Fig 1.10:Improvement in memory and disk technologies over 33 years. The Seagate Barracuda XT disk
has a capacity of 3 TB in 2011.
System-Area Interconnects
• The nodes in small clusters are mostly interconnected by an Ethernet switch
or a local area network (LAN). As Figure shows, a LAN typically is used to
connect client hosts to big servers.
• A storage area network (SAN) connects servers to network storage such as
disk arrays.
• Network attached storage (NAS) connects client hosts directly to the disk
arrays.
• All three types of networks often appear in a large cluster built with
commercial network components.
System-Area Interconnects
• A disk array is a data storage
system that contains multiple
disk drives and a cache memory.
• A NAS unit includes a dedicated
hardware device that connects to
a local area network, usually
through an Ethernet connection.
• A SAN commonly uses Fibre
Channel interconnects and
connects a set of storage devices
that share data with one another.
• A NAS is a single storage device
that operates on data files, while
a SAN is a local network of
several devices.
Virtual Machines and Virtualization Middleware
• Virtual Machines
• VM Primitive Operations
• Virtual Infrastructures
Virtual Machines
• Eliminates real-machine constraints
• Increases portability and flexibility
• Virtual machine adds software to a physical machine to give it the
appearance of a different platform or multiple platforms.
• Benefits
• Cross platform compatibility
• Increase Security
• Enhance Performance
• Simplify software migration
Virtual Machine Basics
• Virtual software placed between underlying
machine and conventional software
• Virtualization process involves:
• Mapping of virtual resources (registers and
memory) to real hardware resources
• Using real machine instructions to carry out the
actions specified by the virtual machine
instructions
• A Virtual Machine Monitor (VMM) is a software program that enables the creation, management and
governance of virtual machines (VM) and manages the operation of a virtualized environment on top of a
physical host machine.
• VMM is also known as Virtual Machine Manager or Hypervisor.
• A hypervisor is a software layer installed on the physical hardware, which
allows splitting the physical machine into many virtual machines.
• This allows multiple operating systems to be run simultaneously on the same
physical hardware. The operating system installed on the virtual machine is
called a guest OS, and is sometimes also called an instance.
• The hardware the hypervisor runs on is called the host machine. A hypervisor
management console, which is also called a virtual machine manager (VMM),
is software that enables easy management of virtual machines.
Three VM Architectures
Types of hypervisor
• There are two types of hypervisors: type 1 hypervisors and type 2
hypervisors.
• Type 1 hypervisor is also called a native or bare-metal hypervisor that is
installed directly on the hardware, which splits the hardware into several
virtual machines where we can install guest operating systems. Virtual
machine management software helps to manage this hypervisor, which
allows guest OSes to be moved automatically between physical servers based
on current resource requirements.
• Type 2 hypervisor is also called hosted hypervisor, which is installed
within a host operating system, with the advantage that there’s no
need to have a hypervisor management console.
• First, the VMs can be multiplexed between hardware machines, as shown in
Figure (a).
• Second, a VM can be suspended and stored in stable storage, as shown in
Figure (b).
• Third, a suspended VM can be resumed or provisioned to a new hardware
platform, as shown in Figure (c).
• Finally, a VM can be migrated from one hardware platform to another, as
shown in Figure (d).
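These four primitive operations can be modeled as a tiny state machine. This is a toy sketch, not any real hypervisor's API; the class and method names are invented for illustration:

```python
# Toy model of the VM primitive operations: multiplex, suspend,
# resume/provision, and migrate. All names here are hypothetical.
class VirtualMachine:
    def __init__(self, name, host):
        self.name = name
        self.host = host        # hardware platform the VM runs on
        self.state = "running"

    def suspend(self):
        # (b) suspend the VM and store its state in stable storage
        self.state = "suspended"

    def resume(self, host=None):
        # (c) resume, optionally provisioned onto a new hardware platform
        if host is not None:
            self.host = host
        self.state = "running"

    def migrate(self, new_host):
        # (d) move the VM from one hardware platform to another
        self.host = new_host

# (a) two VMs multiplexed onto the same physical host:
vm1 = VirtualMachine("vm1", "host-A")
vm2 = VirtualMachine("vm2", "host-A")
vm1.suspend()
vm1.resume(host="host-B")   # resumed on a different platform
vm2.migrate("host-B")
print(vm1.host, vm1.state, vm2.host)  # host-B running host-B
```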
Data Center Virtualization for Cloud Computing
• Data Center Virtualization
• Data center virtualization is the process of designing, developing and deploying a data center on
virtualization and cloud computing technologies.
• It primarily enables virtualizing physical servers in a data center facility along with storage, networking and
other infrastructure devices and equipment.
• Data center virtualization usually produces a virtualized, cloud-based, or colocated virtual/cloud data center.
3. System Models for Distributed and Cloud Computing
• Computer clusters (Clusters of Cooperative Computers)
• Computing Grids(Grid Computing Infrastructures)
• P2P Networks(Peer-to-Peer Network Families)
• Internet clouds(Cloud Computing over the Internet)
Clusters of Cooperative Computers
• Cluster Architecture
• Single-System Image
• Hardware, Software, and Middleware Support
• Major Cluster Design Issues
Clusters of Cooperative Computers
• A computing cluster consists of interconnected stand-alone computers which
work cooperatively as a single integrated computing resource.
• In the past, clustered computer systems have demonstrated impressive results
in handling heavy workloads with large data sets.
Cluster Architecture
Fig: cluster of servers interconnected by a high-bandwidth SAN or LAN with shared I/O devices and disk arrays; the cluster
acts as a single computer attached to the Internet.
Cluster Architecture
• Figure shows the architecture of a typical server cluster built around a low-
latency, high bandwidth interconnection network. This network can be as
simple as a SAN (e.g., Myrinet) or a LAN (e.g., Ethernet).
• To build a larger cluster with more nodes, the interconnection network can
be built with multiple levels of Gigabit Ethernet, Myrinet, or InfiniBand
switches.
• Through hierarchical construction using a SAN, LAN, or WAN, one can build
scalable clusters with an increasing number of nodes. The cluster is
connected to the Internet via a virtual private network (VPN) gateway.
• The gateway IP address locates the cluster.
• The system image of a computer is decided by the way the OS manages the
shared cluster resources.
• Most clusters have loosely coupled node computers. All resources of a server
node are managed by its own OS.
• Thus, most clusters have multiple system images as a result of having many
autonomous nodes under different OS control.
Single-System Image
• An ideal cluster should merge multiple system images into a single-system
image (SSI).
• Cluster designers desire a cluster operating system or some middleware to
support SSI at various levels, including the sharing of CPUs, memory, and I/O
across all cluster nodes.
• SSI makes the cluster appear like a single machine to the user.
Hardware, Software, and Middleware
• The building blocks are computer nodes (PCs, workstations, servers, etc..),
special communication software such as PVM (Parallel Virtual Machine) or
MPI(message passing interface), and a network interface card in each
computer node.
• Most clusters run under the Linux OS.
• The computer nodes are interconnected by a high-bandwidth network (such
as Gigabit Ethernet, Myrinet, InfiniBand, etc.).
Grid Computing
• Grid computing is a processor architecture that combines computer
resources from various domains to reach a main objective.
• Grid computing is a distributed architecture of large numbers of computers
connected to solve a complex problem.
Grid Computing Infrastructures
• Computational Grids
• Grid Families
Grid Computing Infrastructures contd..
• Internet services such as the Telnet command enable a local computer
to connect to a remote computer.
• A web service such as HTTP enables remote access to remote web pages.
Computational Grids
• A computing grid offers an infrastructure that couples computers,
software/middleware, special instruments, and people and sensors
together.
Grid Families
• Grid systems are classified in essentially two categories:
1. computational or data grids.
2. P2P grids.
PEER-TO-PEER GRID
• In a P2P grid, peer groups are managed locally and arranged into a global
system supported by servers.
• The grid controls the central servers, while services at the edge are
grouped into "middleware peer groups."
• In this case, the P2P technologies are part of the services of the
middleware.
Peer-to-Peer Network Families
• P2P Systems
• Overlay Networks
• P2P Application Families
• P2P Computing Challenges
P2P Networks
• P2P stands for "Peer to Peer."
• In a P2P network, the "peers" are computer systems which are connected to each other via the Internet. Files can
be shared directly between systems on the network without the need of a central server.
• In other words, each computer on a P2P network becomes a file server as well as a client.
• The only requirements for a computer to join a peer-to-peer network are an Internet connection and P2P software.
• Common P2P software programs include Kazaa, Limewire, BearShare, Morpheus, and Acquisition. These
programs connect to a P2P network, such as "Gnutella," which allows the computer to access thousands of other
systems on the network.
Peer-to-Peer (P2P) Network
• In a P2P system, every node acts as both a client and a server, providing
part of the system resources.
• Peer machines are simply client computers connected to the Internet.
• All client machines act autonomously to join or leave the system freely.
• This implies that no master-slave relationship exists among the peers.
• No central coordination or central database is needed.
• In other words, no peer machine has a global view of the entire P2P
system.
• The system is self-organizing with distributed control.
• P2P network does not use a dedicated interconnection network.
Overlay Networks
• An overlay network can be thought of as a computer network on
top of another network.
• All nodes in an overlay network are connected with one another by
means of logical or virtual links, and each of these links corresponds
to a path in the underlying network.
Figure: The structure of a P2P system, mapping a physical IP network to an overlay network built with virtual links.
• There are two types of overlay networks:
1. Unstructured and
2. Structured.
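The virtual-link idea behind both kinds of overlay can be sketched in a few lines of Python; all peer and router names below are hypothetical, chosen only for illustration.

```python
# Sketch of an overlay network: each virtual link between overlay peers
# corresponds to a multi-hop path in the underlying physical IP network.
# The node names and paths below are made up for illustration.

physical_paths = {
    # (overlay endpoint A, overlay endpoint B): path through physical routers
    ("peer1", "peer2"): ["peer1", "router-a", "router-b", "peer2"],
    ("peer2", "peer3"): ["peer2", "router-b", "router-c", "peer3"],
}

def overlay_hops(a, b):
    """Count the physical hops hidden behind one logical overlay link."""
    path = physical_paths[(a, b)]
    return len(path) - 1

print(overlay_hops("peer1", "peer2"))  # one virtual link = 3 physical hops
```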
P2P Application Families
Based on application, P2P networks are classified into four groups
1. Distributed file sharing
2. Collaborative platform
3. Distributed P2P computing
4. P2P platform
P2P Computing Challenges
• P2P computing faces three types of problems in hardware,
software, and network requirements.
• Advantages of peer-to-peer networking:
1. It is easy to install, and so is the configuration of the computers on this
network.
2. All resources and content are shared by all peers, unlike client-server
architecture, where the server shares all content and resources.
3. P2P is more reliable because the central dependency is eliminated. Failure of
one peer does not affect the functioning of the other peers. In a client-server
network, if the server goes down, the whole network is affected.
4. There is no need for a full-time system administrator. Every user is the
administrator of his own machine and can control their shared resources.
5. The overall cost of building and maintaining this type of network is
comparatively low.
• Disadvantages (drawbacks) of peer-to-peer networking:
1. The whole system is decentralized, so it is difficult to administer: no
single person can determine the access settings for the whole network.
2. Security is weak; viruses, spyware, trojans, and other malware can
easily be transmitted over a P2P architecture.
3. Data recovery and backup are very difficult. Each computer must
have its own backup system.
4. Many movies, music files, and other copyrighted works are transferred
using this type of file sharing; P2P is the technology used in torrents.
Cloud Computing over the Internet
• Internet Clouds
• The Cloud Landscape
The Cloud
• Historical roots in today’s
Internet apps
• Search, email, social networks
• File storage (Live Mesh, MobileMe, Flickr, …)
• A cloud infrastructure provides a framework to manage scalable, reliable,
on-demand access to applications
• A cloud is the “invisible” backend to many of our mobile applications
• A model of computation and data storage based on “pay as you go”
access to “unlimited” remote data center capabilities
Basic Concept of Internet Clouds
• Cloud computing is the use of computing resources (hardware and software)
that are delivered as a service over a network (typically the Internet).
• The name comes from the use of a cloud-shaped symbol as an abstraction for
the complex infrastructure it contains in system diagrams.
• Cloud computing entrusts remote services with a user's data, software and
computation.
Cloud Computing over the Internet
• Cloud computing has been defined differently by many users and
designers.
• “A cloud is a pool of virtualized computer resources. A cloud can host a
variety of different workloads, including batch-style backend jobs and
interactive and user-facing applications.”
Internet Clouds
• Cloud computing applies a virtualized platform with elastic resources on
demand by provisioning hardware, software, and data sets dynamically.
• The idea is to move desktop computing to a service-oriented platform
using server clusters and huge databases at data centers.
Cloud Service Models (1)
Infrastructure as a service (IaaS)
• Most basic cloud service model
• This model puts together the infrastructure demanded by users, namely servers, storage, networks, and data center fabric.
• Cloud providers offer computers, as physical or more often as virtual machines, and other resources.
• Virtual machines are run as guests by a hypervisor, such as Xen or KVM.
• Cloud users deploy their applications by installing operating system images on the machines along with their
application software.
• Cloud providers typically bill IaaS services on a utility computing basis, that is, cost will reflect the amount of resources
allocated and consumed.
• Examples of IaaS include: Amazon CloudFormation (and underlying services such as Amazon EC2), Rackspace Cloud,
Terremark, and Google Compute Engine.
Cloud Service Models (2)
Platform as a service (PaaS)
• This model enables users to deploy user-built applications onto a virtualized cloud platform.
• Cloud providers deliver a computing platform typically including operating system, programming
language execution environment, database, and web server.
• Application developers develop and run their software on a cloud platform without the cost and
complexity of buying and managing the underlying hardware and software layers.
• Examples of PaaS include: Amazon Elastic Beanstalk, Cloud Foundry, Heroku, Force.com,
EngineYard, Mendix, Google App Engine, Microsoft Azure and OrangeScape.
Cloud Service Models (3)
Software as a service (SaaS)
• This refers to browser-initiated application software delivered to
thousands of paying cloud customers.
• Cloud providers install and operate application software in the
cloud and cloud users access the software from cloud clients.
• The pricing model for SaaS applications is typically a monthly
or yearly flat fee per user, so price is scalable and adjustable if
users are added or removed at any point.
• Examples of SaaS include: Google Apps, innkeypos,
Quickbooks Online, Limelight Video Platform, Salesforce.com,
and Microsoft Office 365.
• Internet clouds offer four deployment modes:
• private,
• public,
• community, and
• hybrid.
Public cloud
• As the name suggests, this type of cloud deployment model supports all
users who want to make use of a computing resource, such as hardware
(OS, CPU, memory, storage) or software (application server, database) on
a subscription basis.
• Most common uses of public clouds are for application development and
testing, non-mission-critical tasks such as file-sharing, and e-mail service.
Private cloud
• A private cloud is typically infrastructure used by a single organization.
• Such infrastructure may be managed by the organization itself to support
various user groups, or it could be managed by a service provider that
takes care of it either on-site or off-site.
• Private clouds are more expensive than public clouds due to the capital
expenditure involved in acquiring and maintaining them.
• However, private clouds are better able to address the security and
privacy concerns of organizations today.
Hybrid cloud
• In a hybrid cloud, an organization makes use of interconnected private
and public cloud infrastructure.
• Many organizations make use of this model when they need to scale up
their IT infrastructure rapidly, such as when leveraging public clouds to
supplement the capacity available within a private cloud.
• For example, if an online retailer needs more computing resources to run
its Web applications during the holiday season it may attain those
resources via public clouds.
Community cloud
• This deployment model supports multiple organizations sharing
computing resources that are part of a community; examples include
universities cooperating in certain areas of research, or police
departments within a county or state sharing computing resources.
• Access to a community cloud environment is typically restricted to the
members of the community.
1.4.1 SERVICE-ORIENTED ARCHITECTURE (SOA)
• Service-Oriented Architecture (SOA) is an architectural design in which
a collection of services in a network communicate with
each other.
Layered Architecture for Web Services
• These architectures build on the traditional seven Open Systems
Interconnection (OSI) layers that provide the base networking
abstractions.
The Evolution of SOA
• SOA applies to building grids, clouds, grids of clouds, clouds of grids,
clouds of clouds, and systems of systems in general.
• A large number of sensors provide data-collection services, denoted
as SS (sensor service).
• A sensor can be a ZigBee device, a Bluetooth device, a WiFi access
point, a personal computer, a GPS device, or a wireless phone, among other
things.
• Filter services (fs) are used to eliminate unwanted raw data, in order to
respond to specific requests from the web, the grid, or web services.
Grids versus Clouds
1.4.2 Trends toward Distributed Operating Systems
Distributed operating system
• Distributed operating system is an operating system that runs on several
machines whose purpose is to provide a useful set of services, generally to
make the collection of machines behave more like a single machine.
• A distributed operating system is software running over a collection of
independent, networked, communicating, and physically separate
computational nodes. It handles jobs that are serviced by multiple
CPUs.
AMOEBA
• Amoeba is a distributed operating system developed at the Vrije
Universiteit in Amsterdam.
• The aim of the Amoeba project was to build a timesharing system
that makes an entire network of computers appear to the user as a
single machine.
Distributed Computing Environment (DCE)
• The Distributed Computing Environment (DCE) software system was developed in
the early 1990s by the Open Software Foundation (OSF).
• The DCE supplies a framework and a toolkit for
developing client/server applications.
The framework includes:
• a remote procedure call (RPC) mechanism known as DCE/RPC
• a naming (directory) service
• a time service
• an authentication service
• a distributed file system known as DCE/DFS
MOSIX
• MOSIX is a cluster management system that provides a single-system
image.
• MOSIX supports both interactive concurrent processes and batch jobs.
• It incorporates automatic resource discovery and dynamic workload
distribution by preemptive process migration.
• MOSIX is implemented as a software layer that allows applications to
run in remote nodes as if they run locally.
4.1.3 Transparency in Programming Environments
1.4.3 Parallel and Distributed Programming Models
Message-Passing Interface (MPI)
• MPI is the primary programming standard used to develop parallel and
concurrent programs to run on a distributed system.
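MPI itself is a C/Fortran-oriented library standard (with Python bindings such as mpi4py), but its core send/receive style can be mimicked with Python's standard multiprocessing module. This is only a sketch of the message-passing idea, not the MPI API:

```python
from multiprocessing import Pipe, Process

# Message passing in the MPI style: two processes with no shared memory
# exchange data through explicit send/receive calls (cf. MPI_Send/MPI_Recv).

def worker(conn):
    msg = conn.recv()          # blocking receive, like MPI_Recv
    conn.send(msg.upper())     # send a reply back, like MPI_Send

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send("hello from rank 0")
    print(parent_end.recv())   # prints "HELLO FROM RANK 0"
    p.join()
```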
MapReduce
• MapReduce is a web programming model for scalable data
processing on large clusters over large data sets.
• The model is applied mainly in web-scale search and cloud
computing applications.
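The model can be illustrated with a single-machine word count, the canonical MapReduce example; real frameworks distribute the same three phases (map, shuffle, reduce) across a cluster:

```python
from collections import defaultdict
from itertools import chain

# Minimal single-machine sketch of the MapReduce model: a map function emits
# (key, value) pairs, the pairs are grouped by key (the "shuffle"), and a
# reduce function folds each group into a final result.

def map_phase(docs):
    # Emit (word, 1) for every word in every document.
    return chain.from_iterable(((w, 1) for w in d.split()) for d in docs)

def shuffle(pairs):
    # Group all values by key.
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reduce_phase(groups):
    # Sum the counts for each word.
    return {k: sum(vs) for k, vs in groups.items()}

counts = reduce_phase(shuffle(map_phase(["the cloud", "the grid the cloud"])))
print(counts)  # {'the': 3, 'cloud': 2, 'grid': 1}
```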
Hadoop Library
• Hadoop enables users to write and run apps over vast amounts of
distributed data.
• Users can easily scale Hadoop to store and process petabytes of data in
the web space.
• The package is economical (open source), efficient (high level of
parallelism), and reliable (it keeps multiple data copies).
Open Grid Services Architecture (OGSA)
• OGSA offers common grid service standards for the general public.
• It supports a heterogeneous distributed environment, bridging
certificate authorities (CAs) and multiple security mechanisms.
Globus Toolkits and Extensions
• Globus is a middleware library that implements OGSA standards for
resource discovery, allocation and security enforcement.
1.5 PERFORMANCE, SECURITY, AND
ENERGY EFFICIENCY
Performance Metrics and Scalability Analysis
Performance Metrics:
• CPU speed: MHz or GHz,
• Network Bandwidth: Mbps or Gbps
• System throughput: MIPS, TFlops (tera floating-point operations per
second), TPS (transactions per second), IOPS (IO operations per second)
• Other metrics: Response time, network latency, system availability
Scalability
• Scalability is the ability of a system to handle a growing amount of work in
a capable and efficient manner, or its ability to be enlarged to accommodate
that growth.
• For example, it can refer to the capability of a system to increase total
throughput under an increased load when resources (typically hardware)
are added.
Dimensions of Scalability
• Size scalability
• Software scalability
• Application scalability
• Technology scalability
Amdahl's Law
• Amdahl's law is named after computer architect Gene Amdahl.
• It is not really a law but rather an approximation that models the ideal
speedup that can happen when serial programs are modified to run in
parallel.
• For this approximation to be valid, it is necessary for the problem size to
remain the same when parallelized.
Speedup S = T / [(1 − α)T + αT/n] = 1 / [(1 − α) + α/n]
where T is the total execution time on a single processor,
n is the number of processors,
α is the proportion of the program that can be made parallel, and
(1 − α) is the portion that cannot be parallelized.
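Using these definitions, the ideal speedup 1 / ((1 − α) + α/n) can be evaluated directly; the α and n values below are arbitrary examples:

```python
def amdahl_speedup(alpha, n):
    """Ideal speedup for a program with parallel fraction alpha on n processors."""
    return 1.0 / ((1.0 - alpha) + alpha / n)

# With 95% of the program parallelizable, speedup is capped at 1/(1 - 0.95) = 20
# no matter how many processors are added.
print(round(amdahl_speedup(0.95, 4), 2))     # 3.48
print(round(amdahl_speedup(0.95, 1024), 2))  # 19.64
```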
Fault Tolerance and System Availability
• Fault tolerance is the ability of the infrastructure to continue
providing service to underlying applications even after the failure of
one or more components in any layer.
• System availability: A system is highly available if it has a long
mean time to failure (MTTF) and a short mean time to repair
(MTTR). System availability is formally defined as:
System Availability = MTTF / (MTTF + MTTR)
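The availability formula MTTF / (MTTF + MTTR) is easy to evaluate; the failure and repair times below are made-up numbers:

```python
def availability(mttf, mttr):
    """System availability = MTTF / (MTTF + MTTR), both in the same time unit."""
    return mttf / (mttf + mttr)

# A server that fails on average every 1000 hours and takes 2 hours to repair:
print(f"{availability(1000, 2):.4%}")  # 99.8004%
```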
Network Threats and Data Integrity
• Data integrity means maintaining and assuring the accuracy and
completeness of data over its entire lifecycle. This means that data
cannot be modified in an unauthorized or undetected manner.
Energy Efficiency in Distributed Computing
• Energy Consumption of Unused Servers
• Reducing Energy in Active Servers
• Power management issues in distributed computing platforms can
be categorized into four layers
• the application layer,
• middleware layer,
• resource layer, and
• network layer.
Cloud Computing/ Unit-1 146
Application layer
• Most applications in areas such as science, engineering,
business, and finance try to increase the system's speed
or quality. The goal is to introduce energy-conscious
applications without reducing performance, which requires
identifying the correlation between performance and
energy consumption.
Middleware layer
• The middleware layer is a connection between application layer
and resource layer.
• This layer provides resource broker, communication service, task
analyser & scheduler, security access, reliability control, and
information service capabilities.
• It is also responsible for energy-efficient techniques in task
scheduling.
• In a distributed computing system, a balance has to be struck
between efficient resource usage and the available energy.
Resource Layer
• This layer consists of different resources including the computing
nodes and storage units.
• Since this layer interacts with the hardware devices and the operating
system, it is responsible for controlling all distributed resources.
• Several methods exist for efficient power management of hardware
and OS, and the majority of them concern the processors.
Resource Layer(contd..)
• Dynamic power management (DPM) and dynamic voltage
frequency scaling (DVFS) are the two popular methods being used
recently.
• In DPM, hardware devices can switch from idle modes to lower
power modes.
• In DVFS, energy savings are obtained from the fact that power
consumption in CMOS (Complementary Metal-Oxide-
Semiconductor) circuits has a direct relationship with frequency
and the square of the supply voltage.
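The CMOS relation behind DVFS is usually written P ≈ C · V² · f (switched capacitance, supply voltage, clock frequency); the numbers below are hypothetical:

```python
def dynamic_power(c, v, f):
    """Dynamic CMOS power: P = C * V**2 * f."""
    return c * v ** 2 * f

# Halving both voltage and frequency cuts dynamic power to one eighth,
# which is why DVFS trades a modest slowdown for large energy savings.
p_full   = dynamic_power(1.0e-9, 1.2, 2.0e9)  # hypothetical chip at full speed
p_scaled = dynamic_power(1.0e-9, 0.6, 1.0e9)  # scaled-down operating point
print(round(p_scaled / p_full, 6))  # 0.125
```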
Network layer
• The main responsibilities of the network layer in distributed
computing are routing and transferring packets, and enabling
network services to the resource layer.
• Energy consumption and performance must be measured, predicted,
and balanced in a systematic manner so as to bring out energy-
efficient networks.
Thank you