
The PRAM Model and Algorithms: Advanced Topics Spring 2008

The document provides an overview of the Parallel Random Access Machine (PRAM) model of parallel computation. It describes the PRAM model as having multiple processors that can access shared memory simultaneously. It classifies different PRAM models based on their read and write capabilities, and describes how algorithms designed for more powerful models can be simulated on weaker models. It also introduces the work-time scheduling principle for analyzing parallel algorithms and provides examples of algorithms like matrix multiplication and prefix sums on the PRAM model.

Uploaded by

rajeevrajkumar


The PRAM Model and Algorithms
Advanced Topics Spring 2008
Prof. Robert van Engelen

Overview

The PRAM model of parallel computation

Simulations between PRAM models
Work-time presentation framework of parallel algorithms
Example algorithms

1/23/08

HPC Fall 2007

The PRAM Model of Parallel Computation

Parallel Random Access Machine (PRAM)

Natural extension of RAM: each processor is a RAM
Processors operate synchronously
Earliest and best-known model of parallel computation

Shared memory with m locations
p processors, each with private memory
All processors operate synchronously, by executing load, store, and operations on data

Synchronous PRAM

Synchronous PRAM is a SIMD-style model

All processors execute the same program
All processors execute the same PRAM step instruction stream in lock-step
Effect of operation depends on local data
Instructions can be selectively disabled (if-then-else flow)

Asynchronous PRAM
Several competing models
No lock-step


Classification of PRAM Model

A PRAM step (clock cycle) consists of three phases

1. Read: each processor may read a value from shared memory
2. Compute: each processor may perform operations on local data
3. Write: each processor may write a value to shared memory

Model is refined for concurrent read/write capability

Exclusive Read Exclusive Write (EREW)
Concurrent Read Exclusive Write (CREW)
Concurrent Read Concurrent Write (CRCW)

CRCW PRAM
Common CRCW: all processors must write the same value
Arbitrary CRCW: one of the processors succeeds in writing
Priority CRCW: processor with highest priority succeeds in writing
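
The three CRCW write rules can be illustrated with a small sequential sketch. The function name `resolve_writes`, the request format, and the convention that the lowest processor id has the highest priority are assumptions made for this example; the slides do not fix them.

```python
def resolve_writes(requests, policy):
    """Resolve one PRAM write phase.

    requests: list of (processor_id, address, value) tuples (hypothetical format).
    policy:   "common", "arbitrary", or "priority".
    Returns a dict mapping each written address to the value that is stored.
    """
    # Group the write requests by target address.
    by_addr = {}
    for pid, addr, val in requests:
        by_addr.setdefault(addr, []).append((pid, val))

    memory = {}
    for addr, writers in by_addr.items():
        if policy == "common":
            # All concurrent writers must agree on the value.
            if len({val for _, val in writers}) != 1:
                raise ValueError(f"conflicting writes to address {addr}")
            memory[addr] = writers[0][1]
        elif policy == "arbitrary":
            # Any single writer may succeed; this sketch picks the first listed.
            memory[addr] = writers[0][1]
        elif policy == "priority":
            # Assumption: the lowest processor id has the highest priority.
            memory[addr] = min(writers, key=lambda w: w[0])[1]
        else:
            raise ValueError(f"unknown policy {policy!r}")
    return memory
```

An algorithm that is correct under the arbitrary rule must not rely on which writer wins, so picking the first listed writer is only one of the legal outcomes.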


Comparison of PRAM Models

A model A is less powerful than a model B if either

The time complexity of solving a problem is asymptotically less in model B than in model A
Or the time complexity is the same and the work complexity is asymptotically less in model B than in model A

From weakest to strongest:

EREW
CREW
Common CRCW
Arbitrary CRCW
Priority CRCW


Simulations Between PRAM Models

An algorithm designed for a weaker model can be executed within the same time complexity and work complexity on a stronger model
An algorithm designed for a stronger model can be simulated on a weaker model, either with
Asymptotically more processors (more work)
Or asymptotically more time


Simulating a Priority CRCW on an EREW PRAM

Theorem: An algorithm that runs in T time on the p-processor priority CRCW PRAM can be simulated by an EREW PRAM to run in O(T log p) time

A concurrent read or write of a p-processor CRCW PRAM can be implemented on a p-processor EREW PRAM to execute in O(log p) time
$Q_1, \ldots, Q_p$ CRCW processors, such that $Q_i$ has to read (write) $M[j_i]$
$P_1, \ldots, P_p$ EREW processors
$M_1, \ldots, M_p$ denote shared memory locations for special use
$P_i$ stores $\langle j_i, i \rangle$ in $M_i$
Sort pairs in lexicographically non-decreasing order in O(log p) time using EREW merge sort algorithm
Pick representative from each block of pairs that have same first component in O(1) time
Representative $P_i$ reads (writes) from $M[k]$ with $\langle k, \_ \rangle$ in $M_i$ and copies data to each $M$ in the block in O(log p) time using EREW segmented parallel prefix algorithm
$P_i$ reads data from $M_i$
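
The steps above can be traced with a sequential emulation of one concurrent-read step. This is only a sketch: Python's built-in sort stands in for the EREW merge sort, a plain loop stands in for the segmented parallel prefix, and the name `simulate_concurrent_read` and the dict-based memory are assumptions of this example. It shows the data flow, not the O(log p) running time.

```python
def simulate_concurrent_read(memory, addresses):
    """Emulate the EREW simulation of one concurrent read.

    memory:    dict mapping address -> value (hypothetical shared memory).
    addresses: addresses[i] is the cell j_i that processor Q_i wants to read.
    Returns the list of values delivered to Q_0, ..., Q_{p-1}.
    """
    p = len(addresses)
    # P_i stores the pair <j_i, i> in its own cell M_i.
    pairs = [(addresses[i], i) for i in range(p)]
    # Sort the pairs lexicographically (stand-in for EREW merge sort).
    pairs.sort()
    # The first pair of each block with equal address is the representative.
    is_rep = [k == 0 or pairs[k][0] != pairs[k - 1][0] for k in range(p)]
    # Only representatives touch M[j], so every cell is read at most once.
    data = [memory[pairs[k][0]] if is_rep[k] else None for k in range(p)]
    # Copy the value along each block (stand-in for segmented parallel prefix).
    for k in range(1, p):
        if not is_rep[k]:
            data[k] = data[k - 1]
    # Each processor picks up its value from its own slot.
    result = [None] * p
    for k, (_, i) in enumerate(pairs):
        result[i] = data[k]
    return result
```

For a priority write the same sorted order would be useful because, with ties broken by processor index, the representative of each block is the lowest-indexed writer.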

Reduction on the EREW PRAM

Reduce p values on the p-processor EREW PRAM in O(log p) time
Reduction algorithm uses exclusive reads and writes
Algorithm is the basis of other EREW algorithms


Sum on the EREW PRAM

Sum of n values using n processors (i)
Input: A[1,…,n], n = 2^k
Output: S
begin
  B[i] := A[i]
  for h = 1 to log n do
    if i ≤ n/2^h then
      B[i] := B[2i-1] + B[2i]
  if i = 1 then
    S := B[i]
end
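
A minimal sequential sketch of the summation, assuming the input length is a power of two as on the slide. Each round first collects all reads and then performs all writes, which mimics the read and write phases of a synchronous PRAM step; the function name `erew_sum` is made up for this example.

```python
def erew_sum(a):
    """Tree summation of n = 2^k values, one synchronous round per level."""
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    b = [None] + list(a)          # 1-based indexing, as in the pseudocode
    log_n = n.bit_length() - 1    # exact log2 for a power of two
    for h in range(1, log_n + 1):
        active = range(1, n // 2**h + 1)
        # Read phase: processor i reads B[2i-1] and B[2i].
        reads = {i: (b[2 * i - 1], b[2 * i]) for i in active}
        # Write phase: processor i writes B[i].
        for i in active:
            b[i] = reads[i][0] + reads[i][1]
    return b[1]
```

In each round processor i is the only one that reads cells 2i-1 and 2i and the only one that writes cell i, so no cell is accessed concurrently.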

Matrix Multiplication

Consider $n \times n$ matrix multiplication with $n^3$ processors

Each $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$ can be computed on the CREW PRAM in parallel using n processors in O(log n) time
On the EREW PRAM exclusive reads of $a_{ij}$ and $b_{ij}$ values can be satisfied by making n copies of a and b, which takes O(log n) time with n processors (broadcast tree)
Total time is still O(log n)
Memory requirement is huge

Matrix Multiplication on the CREW PRAM
Matrix multiply with n^3 processors (i,j,l)
Input: n×n matrices A and B, n = 2^k
Output: C = AB
begin
  C[i,j,l] := A[i,l]B[l,j]
  for h = 1 to log n do
    if l ≤ n/2^h then
      C[i,j,l] := C[i,j,2l-1] + C[i,j,2l]
  if l = 1 then
    C[i,j] := C[i,j,1]
end

The WT Scheduling Principle

The work-time (WT) scheduling principle schedules p processors to execute an algorithm
Algorithm has T(n) time steps
A time step can be parallel, i.e. pardo

Let $W_i(n)$ be the number of operations (work) performed in time unit i, $1 \le i \le T(n)$
Simulate each set of $W_i(n)$ operations in $\lceil W_i(n)/p \rceil$ parallel steps, for each $1 \le i \le T(n)$
The p-processor PRAM takes
$\sum_i \lceil W_i(n)/p \rceil \le \sum_i (\lfloor W_i(n)/p \rfloor + 1) \le \lfloor W(n)/p \rfloor + T(n)$
steps, where $W(n) = \sum_i W_i(n)$ is the total number of operations
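
The bound can be checked numerically. The sketch below assumes the per-time-unit work counts are given as a plain list; the names `scheduled_steps` and `wt_bound` are made up for this example.

```python
def scheduled_steps(work_per_unit, p):
    """Steps taken by p processors: sum of ceil(W_i / p) over all time units."""
    return sum(-(-w // p) for w in work_per_unit)   # -(-w // p) is integer ceil


def wt_bound(work_per_unit, p):
    """Upper bound floor(W / p) + T from the WT scheduling principle."""
    total_work = sum(work_per_unit)
    time_units = len(work_per_unit)
    return total_work // p + time_units


# Example work profile (assumed): 8 copies, then 4, 2, 1 additions, then 1 final copy.
work = [8, 4, 2, 1, 1]
steps = scheduled_steps(work, 2)
bound = wt_bound(work, 2)
```

With p = 2 the profile above needs 4 + 2 + 1 + 1 + 1 = 9 steps, which is below the bound 16/2 + 5 = 13.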

Work-Time Presentation

The WT presentation can be used to determine computation and communication requirements of an algorithm
The upper-level WT presentation framework describes the algorithm in terms of a sequence of time units
The lower-level follows the WT scheduling principle

Matrix Multiplication on the CREW PRAM WT-Presentation
Input: n×n matrices A and B, n = 2^k
Output: C = AB
begin
  for 1 ≤ i, j, l ≤ n pardo
    C[i,j,l] := A[i,l]B[l,j]
  for h = 1 to log n do
    for 1 ≤ i, j ≤ n, 1 ≤ l ≤ n/2^h pardo
      C[i,j,l] := C[i,j,2l-1] + C[i,j,2l]
  for 1 ≤ i, j ≤ n pardo
    C[i,j] := C[i,j,1]
end
WT scheduling principle: O(n^3/p + log n) time
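
The WT presentation of the matrix product translates almost line by line into a sequential sketch. It assumes square matrices given as nested lists with n a power of two, uses 0-based indices instead of the 1-based indices of the pseudocode, and the name `pram_matmul` is made up for this example.

```python
def pram_matmul(a, b):
    """Matrix product via n^3 products followed by log n pairwise-sum rounds."""
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    # Time unit 1: all n^3 products C[i][j][l] = A[i][l] * B[l][j].
    c = [[[a[i][l] * b[l][j] for l in range(n)] for j in range(n)]
         for i in range(n)]
    # log n time units: halve the third dimension by pairwise addition.
    width = n
    while width > 1:
        width //= 2
        for i in range(n):
            for j in range(n):
                # Building a new list reads all old values before any write.
                c[i][j] = [c[i][j][2 * l] + c[i][j][2 * l + 1]
                           for l in range(width)]
    # Final time unit: copy the single remaining partial sum.
    return [[c[i][j][0] for j in range(n)] for i in range(n)]
```

The nested loops over i and j inside each round correspond to one pardo statement; on a real PRAM they would execute in a single time unit.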

PRAM Recursive Prefix Sum Algorithm
Input: Array of (x_1, x_2, …, x_n) elements, n = 2^k
Output: Prefix sums s_i, 1 ≤ i ≤ n
begin
  if n = 1 then s_1 := x_1; exit
  for 1 ≤ i ≤ n/2 pardo
    y_i := x_{2i-1} + x_{2i}
  Recursively compute prefix sums of y and store in z
  for 1 ≤ i ≤ n pardo
    if i is even then s_i := z_{i/2}
    else if i = 1 then s_1 := x_1
    else s_i := z_{(i-1)/2} + x_i
end
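
A direct sequential transcription of the recursive algorithm, assuming the input length is a power of two. The loop variable i keeps the 1-based meaning of the pseudocode while the Python lists are 0-based; the name `prefix_sums` is made up for this example.

```python
def prefix_sums(x):
    """Recursive prefix sums of a list whose length is a power of two."""
    n = len(x)
    if n == 1:
        return [x[0]]
    # Pairwise sums: y_i = x_{2i-1} + x_{2i} (one pardo step).
    y = [x[2 * i] + x[2 * i + 1] for i in range(n // 2)]
    # Prefix sums of the half-size problem.
    z = prefix_sums(y)
    s = [0] * n
    for i in range(1, n + 1):          # i is the 1-based index of the pseudocode
        if i % 2 == 0:
            s[i - 1] = z[i // 2 - 1]                     # s_i = z_{i/2}
        elif i == 1:
            s[0] = x[0]                                  # s_1 = x_1
        else:
            s[i - 1] = z[(i - 1) // 2 - 1] + x[i - 1]    # s_i = z_{(i-1)/2} + x_i
    return s
```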

Proof of Work Optimality

Theorem: The PRAM prefix sum algorithm correctly computes the prefix sum and takes T(n) = O(log n) time using a total of W(n) = O(n) operations
Proof by induction on k, where input size $n = 2^k$
Base case k = 0: $s_1 = x_1$
Assume correct for $n = 2^k$
For $n = 2^{k+1}$

For all $1 \le j \le n/2$ we have
$z_j = y_1 + y_2 + \cdots + y_j = (x_1 + x_2) + (x_3 + x_4) + \cdots + (x_{2j-1} + x_{2j})$
Hence, for $i = 2j \le n$ we have $s_i = s_{2j} = z_j = z_{i/2}$
And for $i = 2j+1 \le n$ we have $s_i = s_{2j+1} = s_{2j} + x_{2j+1} = z_j + x_{2j+1} = z_{(i-1)/2} + x_i$

T(n) = T(n/2) + a, hence T(n) = O(log n)
W(n) = W(n/2) + bn, hence W(n) = O(n)
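
As a worked step, the two recurrences can be unrolled as follows, assuming that T(1) and W(1) are constants and that n is a power of two (the slide does not state the base costs explicitly):

```latex
\begin{align*}
T(n) &= T(n/2) + a = T(n/4) + 2a = \cdots = T(1) + a \log_2 n = O(\log n) \\
W(n) &= W(n/2) + bn = W(n/4) + b\,(n + n/2) = \cdots \\
     &= W(1) + b\,(n + n/2 + \cdots + 2) = W(1) + b\,(2n - 2) = O(n)
\end{align*}
```

The geometric sum is what keeps the total work linear even though the recursion has log n levels.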

PRAM Nonrecursive Prefix Sum

Input: Array A of size n = 2^k
Output: Prefix sums in C[0,j], 1 ≤ j ≤ n
begin
  for 1 ≤ j ≤ n pardo
    B[0,j] := A[j]
  for h = 1 to log n do
    for 1 ≤ j ≤ n/2^h pardo
      B[h,j] := B[h-1,2j-1] + B[h-1,2j]
  for h = log n to 0 do
    for 1 ≤ j ≤ n/2^h pardo
      if j is even then C[h,j] := C[h+1,j/2]
      else if j = 1 then C[h,1] := B[h,1]
      else C[h,j] := C[h+1,(j-1)/2] + B[h,j]
end
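
The two passes can be sketched sequentially as below, assuming n is a power of two. Index 0 of every row is left unused so that j matches the 1-based pseudocode; the name `prefix_sums_nonrecursive` is made up for this example.

```python
def prefix_sums_nonrecursive(a):
    """Two-pass (bottom-up, then top-down) prefix sums for n = 2^k values."""
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    log_n = n.bit_length() - 1
    # Row h has n / 2^h entries; slot 0 is padding for 1-based indexing.
    B = [[0] * (n // 2**h + 1) for h in range(log_n + 1)]
    C = [[0] * (n // 2**h + 1) for h in range(log_n + 1)]

    for j in range(1, n + 1):
        B[0][j] = a[j - 1]
    # First pass, bottom-up: B[h][j] is the sum of a block of 2^h inputs.
    for h in range(1, log_n + 1):
        for j in range(1, n // 2**h + 1):
            B[h][j] = B[h - 1][2 * j - 1] + B[h - 1][2 * j]
    # Second pass, top-down: C[h][j] is the prefix sum up to the end of block j.
    for h in range(log_n, -1, -1):
        for j in range(1, n // 2**h + 1):
            if j % 2 == 0:
                C[h][j] = C[h + 1][j // 2]
            elif j == 1:
                C[h][1] = B[h][1]
            else:
                C[h][j] = C[h + 1][(j - 1) // 2] + B[h][j]
    return C[0][1:]
```

At the top level h = log n only j = 1 exists, so the branch that reads row h+1 is never taken there.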

First Pass: Bottom-Up

[Figure: tree of partial sums B[h,j] for h = 0, …, 3, computed bottom-up from the input A[j]]

Second Pass: Top-Down

[Figure: tree of B[h,j] and C[h,j] for h = 0, …, 3, with C computed top-down from the root, where B[3,j] = C[3,j] = 27]

Pointer Jumping

Finding the roots of a forest using pointer-jumping


Pointer Jumping on the CREW PRAM
Input: A forest of trees, each with a self-loop at its root, consisting of arcs (i,P(i)) and nodes i, where 1 ≤ i ≤ n
Output: For each node i, the root S[i]
begin
  for 1 ≤ i ≤ n pardo
    S[i] := P[i]
    while S[i] ≠ S[S[i]] do
      S[i] := S[S[i]]
end
T(n) = O(log h) with h the maximum height of trees
W(n) = O(n log h)
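
A minimal sequential sketch of pointer jumping, assuming 0-based node numbers and a parent list in which every root points to itself. Each round builds the new successor list from the old one, which mimics the lock-step reads of the PRAM; the name `find_roots` and the returned round count are additions for this example.

```python
def find_roots(parent):
    """Return (roots, rounds): the root of every node and the number of jumps."""
    s = list(parent)                      # S[i] := P[i]
    rounds = 0
    # Continue while some node's successor is not yet a root.
    while any(s[i] != s[s[i]] for i in range(len(s))):
        # One synchronous step: every node reads the old S, then all write.
        s = [s[s[i]] for i in range(len(s))]
        rounds += 1
    return s, rounds


# Example forest (assumed): chain 4 -> 3 -> 2 -> 1 -> 0 and a tree 6 -> 5.
roots, rounds = find_roots([0, 0, 1, 2, 3, 5, 5])
```

Nodes that already point to a root are unchanged by a further jump, so running the update for all nodes in every round gives the same result as the per-node while loop.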

PRAM Model Summary

PRAM removes algorithmic details concerning synchronization and communication, allowing the algorithm designer to focus on problem properties
A PRAM algorithm includes an explicit understanding of the operations performed at each time unit and an explicit allocation of processors to jobs at each time unit
PRAM design paradigms have turned out to be robust and have been mapped efficiently onto many other parallel models and even network models

A SIMD network model considers communication diameter, bisection width, and scalability properties of the network topology of a parallel machine such as a mesh or hypercube

An Introduction to Parallel Algorithms, by J. JaJa, 1992
