Parallel Models of Computation
Developing a standard parallel model of computation for analyzing algorithms has proven
difficult because different parallel computers tend to vary significantly in their organizations.
In spite of this difficulty, useful parallel models have emerged, along with a deeper
understanding of the modeling process. In this section we describe three important principles
that have emerged.
1. Work-efficiency. In designing a parallel algorithm, it is more important to make it
efficient than to make it asymptotically fast. The efficiency of an algorithm is determined
by the total number of operations, or work, that it performs. On a sequential machine, an
algorithm's work is the same as its time. On a parallel machine, the work is simply the
processor-time product. Hence, an algorithm that takes time t on a P-processor machine
performs work W = Pt. In either case, the work roughly captures the actual cost to
perform the computation, assuming that the cost of a parallel machine is proportional to
the number of processors in the machine. We call an algorithm work-efficient (or just
efficient) if it performs the same amount of work, to within a constant factor, as the
fastest known sequential algorithm. For example, a parallel algorithm that sorts n keys in O(√n log n) time using √n processors is efficient, since the work, W = √n × O(√n log n) = O(n log n), is as good as any (comparison-based) sequential algorithm. However, a sorting algorithm that runs in O(log n) time using n² processors is not efficient: its work is O(n² log n). The first algorithm is better than the second - even though it is slower - because its work, or cost, is smaller. Of course, given two parallel algorithms that perform the same amount of work, the faster one is generally better. (A short calculation comparing the work of these two algorithms appears at the end of this list.)
2. Emulation. The notion of work-efficiency leads to another important observation: a
model can be useful without mimicking any real or even realizable machine. Instead, it
suffices that any algorithm that runs efficiently in the model can be translated into an
algorithm that runs efficiently on real machines. As an example, consider the widely used
parallel random-access machine (PRAM) model. In the PRAM model, a set of processors
share a single memory system. In a single unit of time, each processor can perform an
arithmetic, logical, or memory access operation. This model has often been criticized as
unrealistically powerful, primarily because no shared memory system can perform
memory accesses as fast as processors can execute local arithmetic and logical operations.
The important observation, however, is that for a model to be useful we only require that
algorithms that are efficient in the model can be mapped to algorithms that are efficient
on realistic machines, not that the model is realistic. In particular, any algorithm that runs
efficiently in a P-processor PRAM model can be translated into an algorithm that runs
efficiently on a P/L-processor machine with a latency-L memory system, a much more realistic machine. In the translated algorithm, each of the P/L processors emulates L PRAM processors. The latency is "hidden" because a processor has useful
work to perform while waiting for a memory access to complete. Although the translated
algorithm is a factor of L slower than the PRAM algorithm, it uses a factor of L fewer
processors, and hence is equally efficient. (A small simulation of this emulation appears at the end of this list.)
3. Modeling Communication. To get the best performance out of a parallel machine, it is
often helpful to model the communication capabilities of the machine, such as its latency,
explicitly. The most important measure is the communication bandwidth. The bandwidth
available to a processor is the maximum rate at which it can communicate with other
processors or the memory system. Because it is more difficult to hide insufficient
bandwidth than large latency, some measure of bandwidth is often included in parallel
models. Sometimes the specific topology of the communication network is modeled as
well. Although including this level of detail in the model often complicates the design of
parallel algorithms, it is essential for designing the low-level communication primitives for
the machine. In addition to modeling basic communication primitives, other operations
supported by hardware, including synchronization and concurrent memory accesses, are
often modeled, as well as operations that mix computation and communication, such as
fetch-and-add and scans. A final consideration is whether the machine supports shared
memory, or whether all communication relies on passing messages between the
processors.
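
To make the work comparison in point 1 concrete, the following sketch evaluates the processor-time product W = Pt for the two sorting algorithms described there. The concrete value of n is arbitrary and the expressions drop constant factors; this is an illustration of the arithmetic, not a benchmark.

    import math

    def work(processors, time):
        """Work is the processor-time product W = P * t."""
        return processors * time

    n = 1_000_000

    # Algorithm 1: sorts n keys in ~sqrt(n) * log2(n) time on sqrt(n) processors.
    p1, t1 = math.sqrt(n), math.sqrt(n) * math.log2(n)

    # Algorithm 2: sorts n keys in ~log2(n) time on n^2 processors.
    p2, t2 = n ** 2, math.log2(n)

    print(f"algorithm 1: t = {t1:.3g}, W = {work(p1, t1):.3g}")  # W ~ n log n
    print(f"algorithm 2: t = {t2:.3g}, W = {work(p2, t2):.3g}")  # W ~ n^2 log n
    # Algorithm 1 is slower but performs far less total work, matching the
    # O(n log n) bound of the best sequential comparison sorts.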
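Point 2's latency hiding can also be simulated directly. The toy model below (a sketch, not any particular machine) has one physical processor emulate L virtual PRAM processors in round-robin order, where every PRAM step issues a memory request that completes L cycles later. By the time the round-robin returns to a virtual processor, its outstanding request has finished, so the physical processor never stalls:

    L = 4          # memory latency, in cycles
    STEPS = 5      # PRAM steps each virtual processor must execute

    ready_at = [0] * L   # cycle at which each virtual processor's request completes
    done = [0] * L       # PRAM steps completed by each virtual processor
    cycle = stalls = 0

    while min(done) < STEPS:
        vp = cycle % L                     # round-robin over virtual processors
        if done[vp] < STEPS:
            if cycle < ready_at[vp]:
                stalls += 1                # happens only with fewer than L virtuals
            else:
                ready_at[vp] = cycle + L   # issue the next memory request
                done[vp] += 1
        cycle += 1

    print(f"{cycle} cycles, {stalls} stalls")   # L * STEPS cycles, 0 stalls

The emulation runs a factor of L slower than the PRAM it simulates, but it uses a factor of L fewer processors, so the work is unchanged - exactly the translation argument above.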
Algorithmic Techniques
A major advance in parallel algorithms has been the identification of fundamental algorithmic
techniques. Some of these techniques are also used by sequential algorithms, but play a more
prominent role in parallel algorithms, while others are unique to parallelism. Here we list
some of these techniques with a brief description of each.
1. Divide-and-Conquer. Divide-and-conquer is a natural paradigm for parallel algorithms.
After dividing a problem into two or more subproblems, the subproblems can be solved in
parallel. Typically the subproblems are solved recursively and thus the next divide step
yields even more subproblems to be solved in parallel. For example, suppose we want to
compute the convex hull of a set of n points in the plane (i.e., compute the smallest convex polygon that encloses all of the points). This can be implemented by splitting the points into the leftmost ⌈n/2⌉ and rightmost ⌊n/2⌋, recursively finding the convex hull of each set in parallel, and then merging the two resulting hulls (a sketch of this pattern appears at the end of this list). Divide-and-conquer has
proven to be one of the most powerful techniques for solving problems in parallel with
applications ranging from linear systems to computer graphics and from factoring large
numbers to n-body simulations.
2. Randomization. The use of random numbers is ubiquitous in parallel algorithms.
Intuitively, randomness is helpful because it allows processors to make local decisions
which, with high probability, add up to good global decisions. For example, suppose we
want to sort a collection of integer keys. This can be accomplished by partitioning the
keys into buckets and then sorting within each bucket. For this to work well, the buckets must
represent non-overlapping intervals of integer values, and contain approximately the same
number of keys. Randomization is used to determine the boundaries of the intervals. First
each processor selects a random sample of its keys. Next, all of the selected keys are sorted together. Finally, these sample keys are used as the bucket boundaries. Such random sampling is also used in many parallel computational geometry, graph, and string-matching algorithms. Other uses of randomization include symmetry breaking, load balancing, and routing algorithms. (A sketch of the sample-based bucket selection appears at the end of this list.)
3. Parallel Pointer Manipulations. Many of the traditional sequential techniques for
manipulating lists, trees, and graphs do not translate easily into parallel techniques. For
example, techniques such as traversing the elements of a linked list, visiting the nodes of a
tree in postorder, or performing a depth-first traversal of a graph appear to be inherently
sequential. Fortunately, each of these techniques can be replaced by efficient parallel
techniques. These parallel techniques include pointer jumping, the Euler-tour technique,
ear decomposition, and graph contraction. For example, one way to label each node of
an n-node list (or tree) with the label of the last node (or root) is to use pointer jumping.
In each pointer-jumping step, each node in parallel replaces its pointer with that of its successor (or parent). After at most ⌈log₂ n⌉ steps, every node points to the same node, the end of the list (or root of the tree). (A sketch of pointer jumping appears at the end of this list.)
4. Others. Other useful techniques include finding small graph separators for partitioning
data among processors to reduce communication, hashing for balancing load across
processors and mapping addresses to memory, and iterative techniques as a replacement
for direct methods for solving linear systems.
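
A few of these techniques can be made concrete with short sketches. For technique 1, the sketch below applies divide-and-conquer to sorting rather than to convex hulls (hull merging is considerably more involved), but the pattern is the one described above: split the input, solve the two subproblems in parallel, and combine. It is a minimal illustration built on Python's concurrent.futures, not a tuned implementation; the depth cutoff is arbitrary.

    from concurrent.futures import ProcessPoolExecutor
    from heapq import merge

    def merge_sort(keys):
        """Sequential divide-and-conquer sort, used below the parallel cutoff."""
        if len(keys) <= 1:
            return keys
        mid = len(keys) // 2
        return list(merge(merge_sort(keys[:mid]), merge_sort(keys[mid:])))

    def parallel_merge_sort(keys, pool, depth=2):
        """Ship one half to a worker process while solving the other locally."""
        if depth == 0 or len(keys) <= 1:
            return merge_sort(keys)
        mid = len(keys) // 2
        left = pool.submit(merge_sort, keys[:mid])               # solved in parallel
        right = parallel_merge_sort(keys[mid:], pool, depth - 1)
        return list(merge(left.result(), right))                 # combine step

    if __name__ == "__main__":
        import random
        data = [random.randrange(10**6) for _ in range(10**4)]
        with ProcessPoolExecutor() as pool:
            assert parallel_merge_sort(data, pool) == sorted(data)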
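For technique 2, the sample-based choice of bucket boundaries might look like the following. The oversampling factor, and the choice of evenly spaced sample keys as splitters, are illustrative assumptions; variants of sample sort differ on both.

    import random
    from bisect import bisect_right

    def choose_boundaries(keys_per_proc, num_buckets, oversample=8):
        """Sort a small random sample and take evenly spaced keys as splitters."""
        sample = []
        for keys in keys_per_proc:       # each processor samples its keys locally
            sample.extend(random.sample(keys, min(len(keys), oversample)))
        sample.sort()                    # all selected keys are sorted together
        step = max(1, len(sample) // num_buckets)
        return sample[step::step][:num_buckets - 1]

    def split_into_buckets(keys, boundaries):
        """Send each key to the interval of values its bucket represents."""
        buckets = [[] for _ in range(len(boundaries) + 1)]
        for k in keys:
            buckets[bisect_right(boundaries, k)].append(k)
        return buckets

    if __name__ == "__main__":
        procs = [[random.randrange(10**6) for _ in range(1000)] for _ in range(4)]
        splitters = choose_boundaries(procs, num_buckets=4)
        all_keys = [k for p in procs for k in p]
        print([len(b) for b in split_into_buckets(all_keys, splitters)])
        # with high probability, the four bucket sizes are roughly equal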
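And for technique 3, pointer jumping on a linked list takes only a few lines. The parallel step, in which every node simultaneously replaces its pointer with its successor's pointer, is simulated here by reading from a snapshot; the distance each node has jumped doubles with every step.

    import math

    def pointer_jump(succ):
        """succ[i] is node i's successor; the last node points to itself.
        Returns the final pointers and the number of jumping steps taken."""
        steps = 0
        while any(succ[i] != succ[succ[i]] for i in range(len(succ))):
            snapshot = list(succ)                                     # read phase
            succ = [snapshot[snapshot[i]] for i in range(len(succ))]  # jump phase
            steps += 1
        return succ, steps

    if __name__ == "__main__":
        n = 16
        chain = list(range(1, n)) + [n - 1]   # the list 0 -> 1 -> ... -> 15
        labels, steps = pointer_jump(chain)
        print(labels)                         # every node now points to node 15
        print(steps, "<=", math.ceil(math.log2(n)))   # 4 <= 4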
These techniques have led to efficient parallel algorithms in most problem areas for which
efficient sequential algorithms are known. In fact, some of the techniques originally
developed for parallel algorithms have led to improvements in sequential algorithms.