FUNDAMENTALS OF DISTRIBUTED SYSTEM
Chapter one
Introduction to Distributed system
1
Chapter Outline
• Defining distributed system
• Examples of distributed systems
• Why distribution?
• Goals and challenges of distributed systems
• Where is the borderline between a computer and a distributed
system?
• Examples of distributed architectures
Introduction and Definition
▪ before the mid-80s, computers were
▪ very expensive (hundred of thousands or even millions of dollars)
▪ very slow (a few thousand instructions per second)
▪ not connected among themselves
▪ after the mid-80s: two major developments
▪ cheap and powerful microprocessor-based computers appeared
▪ computer networks
▪ LANs at speeds ranging from 10 to 1000 Mbps
▪ WANs at speed ranging from 64 Kbps to gigabits/sec
▪ consequence
▪ feasibility of using a large network of computers to work for the same
application; this is in contrast to the old centralized systems where there
was a single computer with its peripherals
▪ Definition of a Distributed System
▪ a distributed system is:
a collection of independent computers that appears to its users as a
single coherent system - computer (Tanenbaum & Van Steen)
▪ this definition has two aspects:
1. hardware: autonomous machines
2. software: a single system view for the users
▪ Other Definitions
a distributed system is a system designed to support the development of
applications and services which can exploit a physical architecture consisting of
multiple, autonomous processing elements that do not share primary memory but
cooperate by sending asynchronous messages over a communication network
(Blair & Stefani)
CENTRALIZED SYSTEM CHARACTERISTICS
One component with non-autonomous parts
Component shared by users all the time
All resources accessible
Software runs in a single process
Single point of control
Single point of failure
6
DISTRIBUTED SYSTEM CHARACTERISTICS
Multiple autonomous components
Components are not shared by all users
Resources may not be accessible
Software runs in concurrent processes on different
processors
Multiple points of control
Multiple points of failure
7
COMMON CHARACTERISTICS OF DISTRIBUTED SYSTEM
Resource Sharing: is the ability to use any h/w,s/w & data
anywhere in the system.
Openness: is concerned with extension & improvement of
distributed system.
Scalability: increase the scale of the system.
Fault tolerance: if any failure is occurred in the S/W or
H/W the system continue work or it is the reliability of
the system.
Transparrency:Hides the complexity of distributed
system to the users & application programs.
8
❖ OPENNESS
Openness is concerned with extensions and
improvements of distributed systems.
Detailed interfaces of components need to be
published.
New components have to be integrated with existing
components.
Differences in data representation of interface types
on different processors (of different vendors) have to
be resolved.
9
❖ SCALABILITY
Adaptation of distributed systems to
accommodate more users
respond faster (this is the hard one)
Usually done by adding more and/or faster processors.
Components should not need to be changed when
scale of a system increases.
Design components to be scalable!
10
❖ FAILURE HANDLING (FAULT TOLERANCE)
Hardware, software and networks fail!
Distributed systems must maintain availability even
at low levels of hardware/software/network reliability.
Fault tolerance is achieved by
recovery
redundancy
11
❖ CONCURRENCY
Components in distributed systems are executed in
concurrent processes.
Components access and update shared resources (e.g.
variables, databases, device drivers).
Integrity of the system may be violated if concurrent
updates are not coordinated.
Lost updates
Inconsistent analysis
12
❖ TRANSPARENCY
Distributed systems should be perceived by users and
application programmers as a whole rather than as a
collection of cooperating components.
Transparency has different aspects.
These represent various properties that distributed
systems should have.
13
EXAMPLES OF DISTRIBUTED SYSTEMS
Local Area Network and Intranet
Database Management System
Automatic Teller Machine Network
Internet/World-Wide Web
Mobile and Ubiquitous Computing
14
LOCAL AREA NETWORK
email server Desktop
computer s
print and other servers
Local area
Web server network
email server
print
File server
other servers
the rest of
the Internet
router/firewall
15
DATABASE MANAGEMENT SYSTEM
16
AUTOMATIC TELLER MACHINE NETWORK
17
ORGANIZATION OF A DISTRIBUTED SYSTEM
to support heterogeneous computers and networks and to provide
a single-system view, a distributed system is often organized by
means of a layer of software called middleware that extends over
multiple machines
19
a distributed system organized as middleware; note that the middleware layer
extends over multiple machines
INTRODUCTION TO ARCHITECTURAL MODEL
Architecture of distributed system:
1. shared memory Architecture(Tightly coupled
system)
2.Distributed memory Architecture(loosly coupled
system)
20
DISTRIBUTED COMPUTING MODELS
Fundamental model: are based on some fundamental
properties such as: characterstics,failures &security.
1. Interaction models
2. failure model
3. security model
Architectural models: It deals with organizations of
component across the network of computer & their
interrelationship.
1. client-server model
2. peer-to-peer model
21
ARCHITECTURAL MODEL
Client-server model
▪ how are processes organized in a system
▪ thinking in terms of clients requesting services from
servers
22
general interaction between a client and a server
Peer-to-peer model
Unlike client-server,p2p model does not distinguish
between client/server, instead each node can either be
a client or server depending on whether the node is
requesting or providing the service.
Each component acts as its own process and acts as
both a client and a server to other peer components.
Any component can initiate a request to any other peer
component.
23
FUNDAMENTAL MODELS
1. Interaction model:
❖ computation occurs within processes.
❖ the processes interact by passing messages resulting in:
i. communication(information flow)
ii. coordination(synchronization & ordering of
activities)between processes
❖ Interaction model reflects the facts that communication
takes place with delays.
❖ Two variants of the interaction model
❖ I. Synchronized D.S has bounds on:
- process is executing in a known lower/upper bounded
time. 24
-message is received within a known bounded time.
II. A Synchronized D.S has no bounds on:
- process execution speed.
-message transmission delay.
-clocks drift rate.
2. Failure Model:
✓ it defines & classifies the faults.
✓ It is important to understand the kinds of failures that
may occur in a system.
Types of failure model:
*Fail stop
*crash
* omission
*Timing failure
25
*Arbitrty
3.Security model:
✓ There are several potential threats a system designer need
be aware of:
1.Treats to processes: an attacker sends a request or response
using false identity.
2.Treats to communication channels: an attacker may listen
to message &save(evasdroping).
3.Denial of service: an attacker may overload a server by
making excessive request.
✓ Cryptography & authentication are often used to provide
security.
26
PROS AND CONS OF DISTRIBUTED SYSTEMS
Advantages of Distributed Systems
Performance: Very often a collection of processors can provide higher
performance (and better price/performance ratio) than a centralized
computer.
Distribution: many applications involve, by their nature, spatially
separated machines (banking, commercial, automotive system).
Reliability (fault tolerance): if some of the machines crash, the
system can survive.
Incremental growth: as requirements on processing power grow, new
machines can be added incrementally.
Sharing of data/resources: shared data is essential to many
applications (banking, computer supported cooperative work,
reservation systems); other resources can be also shared (e.g. expensive
printers). 27
Communication: facilitates human-to-human communication.
DISADVANTAGES OF DISTRIBUTED SYSTEMS
Difficulties of developing distributed software:
how should operating systems, programming
languages and applications look like?
Networking problems: several problems are
created by the network infrastructure, which have to
be dealt with: loss of messages, overloading, ...
Security problems: sharing generates the problem
of data security.
TYPES OF DISTRIBUTION SYSTEMS
Distributed computing systems
Distributed information systems
Distributed pervasive/embedded systems
29
DISTRIBUTED COMPUTING SYSTEMS
Used for high-performance computing tasks
Two types: Cluster computing and Grid computing
Cluster Computing :A collection of similar workstations or
PCs (homogeneous), closely connected by means of a high-
speed LAN
Each node runs the same operating system
Used for parallel programming in which a single compute
intensive program is run in parallel on multiple machines
Grid Computing: “Resource sharing and coordinated
problem solving in dynamic, multi-institutional virtual
organizations” (I. Foster)
high degree of heterogeneity: no assumptions are made
concerning hardware, operating systems, networks,
administrative domains, security policies, etc. 30
DISTRIBUTED INFORMATION SYSTEMS
The vast amount of distributed systems in use today is in the form of
traditional information systems, that now integrate legacy systems.
Example: Transaction processing systems
BEGIN TRANSACTION(server, transaction);
READ(transaction, file-1, data);
WRITE(transaction, file-2, data);
newData := MODIFIED(data);
IF WRONG(newData) THEN
ABORT TRANSACTION(transaction);
ELSE
WRITE(transaction, file-2, newData);
END TRANSACTION(transaction);
END IF;
Note:
All READ and WRITE operations are executed, i.e. their effects are made permanent
at the execution of END TRANSACTION.
31
Transactions form an atomic operation.
TRANSACTION PROCESSING SYSTEMS
A transaction is a collection of operations on the state of an object (database, object
composition, etc.) that satisfies the following properties (ACID):
Atomicity: All operations either succeed, or all of them fail.
- When the transaction fails, the state of the object will remain unaffected by the
transaction.
Consistency: A transaction establishes a valid state transition.
- This does not exclude the possibility of invalid,
intermediate states during the transaction’s execution.
Isolation: Concurrent transactions do not interfere with each other.
- It appears to each transaction T that other transactions occur either before T,
or after T, but never both.
Durability: After the execution of a transaction, its effects are
made permanent:
- Changes to the state survive failures.
32
DISTRIBUTED PERVASIVE SYSTEMS
A next-generation of distributed systems emerging in which the nodes are small,
wireless, battery-powered, mobile (e.g. PDAs, smart phones, wireless surveillance
cameras, portable ECG monitors, etc.), and often embedded as part of a larger
system.
Three requirements for pervasive applications
Embrace contextual changes: a device is aware that its environment
may change all the time
Encourage ad hoc composition: devices are used in different ways by
different users
Recognize sharing as the default: devices join a system to access or
provide information
Examples of pervasive systems :
Home Systems
Electronic Health Care Systems
Sensor Networks
HARDWARE AND SOFTWARE CONCEPTS
Hardware Concepts
different classification schemes exist
multiprocessors - with shared memory
multicomputers - that do not share
memory
can be homogeneous or heterogeneous
▪ a single
backbone
Basic
Organizations of a
Node
different basic organizations of processors and memories in distributed systems
Parallel system?
▪ Multiprocessors - Shared Memory (1)
▪ the shared memory has to be coherent - the same value written by
one processor must be read by another processor
▪ performance problem for bus-based organization since the bus
will be overloaded as the number of processors increases
▪ the solution is to add a high-speed cache memory between the
processors and the bus to hold the most recently accessed words;
may result in incoherent memory
a bus-based multiprocessor
▪ bus-based multiprocessors are difficult to scale even with caches
▪ two possible solutions: crossbar switch and omega network
▪ Crossbar switch
▪ divide memory into modules and connect them to the processors
with a crossbar switch
▪ at every intersection, a crosspoint switch is opened and closed to
establish connection
▪ problem: expensive; with n CPUs and n memories, n2 switches
are required
37
▪ Omega network
▪ use switches with multiple input and output lines
▪ drawback: high latency because of several switching stages
between the CPU and memory
▪ Homogeneous Multicomputer Systems
▪ also referred to as System Area Networks (SANs)
▪ the nodes are mounted on a big rack and connected through a
high-performance network
▪ could be bus-based or switch-based
▪ bus-based
▪ shared multiaccess network such as Fast Ethernet can be used
and messages are broadcasted
▪ performance drops highly with more than 25-100 nodes
(contention)
▪ switch-based
▪ messages are routed through an interconnection network
▪ two popular topologies: meshes (or grids) and hypercubes
Hypercube
Grid
▪ Heterogeneous Multicomputer Systems
▪ most distributed systems are built on heterogeneous
multicomputer systems
▪ the computers could be different in processor type, memory size,
architecture, power, operating system, etc. and the
interconnection network may be highly heterogeneous as well
▪ the distributed system provides a software layer to hide the
heterogeneity at the hardware level; i.e., provides transparency
Software Concepts
OSs in relation to distributed systems
tightly-coupled systems, referred to as distributed
OSs (DOS)
the OS tries to maintain a single, global view of
the resources it manages
used for multiprocessors and homogeneous
multicomputers
loosely-coupled systems, referred to as network
OSs (NOS)
a collection of computers each running its own
OS; they work together to make their services
and resources available to others
used for heterogeneous multicomputers
Middleware: to enhance the services of NOSs so
that a better support for distribution transparency
is provided 42
▪ Summary of main issues
System Description Main Goal
Tightly-coupled operating system for multi- Hide and manage
DOS processors and homogeneous hardware
multicomputers resources
Loosely-coupled operating system for Offer local
NOS heterogeneous multicomputers (LAN and services to remote
WAN) clients
Provide
Additional layer atop of NOS implementing
Middleware distribution
general-purpose services
transparency
an overview of DOSs, NOSs, and middleware
▪ Distributed Operating Systems
▪ two types
▪ multiprocessor operating system: to manage the resources of a
multiprocessor
▪ multicomputer operating system: for homogeneous
multicomputers
▪ Uniprocessor Operating Systems
▪ separating applications from operating system code through a
microkernel
Multiprocessor Operating Systems
▪ extended uniprocessor operating systems to support multiple
processors having access to a shared memory
▪ a protection mechanism is required for concurrent access to
guarantee consistency
▪ two synchronization mechanisms: semaphores and monitors
▪ semaphore: an integer with two atomic operations down (if s=0
then sleep; s := s-1) and up (s := s+1; wakeup a sleeping process
if any)
▪ monitor: a programming language construct consisting of
procedures and variables that can be accessed only by the
procedures of the monitor; only a single process at a time is
allowed to execute a procedure
45
Multicomputer Operating Systems
▪ processors can not share memory; instead communication is
through message passing
▪ each node has its own
▪ kernel for managing local resources
▪ separate module for handling interprocessor communication
46
general structure of a multicomputer operating system
▪ Distributed Shared Memory Systems
▪ how to emulate shared memories on distributed systems to
provide a virtual shared memory
▪ page-based distributed shared memory (DSM) - use the virtual
memory capabilities of each individual node
pages of address space distributed among four machines
47
situation after CPU 1 references page 10
▪ read-only pages can be easily replicated
situation if page 10 is read only and replication is
used
Network Operating Systems
possibly heterogeneous underlying hardware
constructed from a collection of uniprocessor
systems, each with its own operating system
and connected to each other in a computer
network
general structure of a network operating system
49
Services offered by network operating systems
remote login (rlogin)
remote file copy (rcp)
shared file systems through file servers
two clients and a server in a network operating system
50
▪ Middleware
▪ a distributed operating system is not intended to handle a collection
of independent computers but provides transparency and ease of
use
▪ a network operating system does not provide a view of a single
coherent system but is scalable and open
▪ combine the scalability and openness of network operating systems
and the transparency and ease of use of distributed operating
systems
▪ this is achieved through a middleware, another layer of software
general structure of a distributed system as middleware
52
different middleware models exist
treat every resource as a file; just as in UNIX
through Remote Procedure Calls (RPCs) -
calling a procedure on a remote machine
distributed object invocation
▪ middleware services
▪ access transparency: by hiding the low-level message passing
▪ naming: such as a URL in the WWW
▪ distributed transactions: by allowing multiple read and write
operations to occur atomically
▪ security
▪ Middleware and Openness
▪ in an open middleware-based distributed system, the protocols
used by each middleware layer should be the same, as well as the
interfaces they offer to applications
54
▪ a comparison between multiprocessor operating systems,
multicomputer operating systems, network operating systems,
and middleware-based distributed systems
Distributed OS Network Middleware-
Item
Multiproc Multicomp OS based OS
Degree of transparency Very High High Low High
Same OS on all nodes Yes Yes No No
Number of copies of
1 N N N
OS
Basis for Shared Model
Messages Files
communication memory specific
Global, Global,
Resource management Per node Per node
central distributed
Scalability No Moderately Yes Varies
Openness Closed Closed Open Open
55
THANK YOU FOR YOUR ATTENTION
56