PDC ch#5

The document discusses the differences between sequential and parallel algorithms, emphasizing that parallel algorithms depend on multiple factors including processor count and communication speed. It highlights the importance of evaluating parallel algorithms in the context of the hardware they run on and the overheads that can affect performance, such as communication time and idling. Additionally, it introduces the concept of granularity in parallel computing, explaining how the size of work chunks assigned to processors can impact efficiency and performance.

📌 Key Points: Parallel Algorithms and Systems

1. Sequential Algorithm:
o Execution time depends only on input size.
o Runs on a single processor.
2. Parallel Algorithm:
o Execution time depends on:
▪ Input size,
▪ Number of processors,
▪ Processor speed,
▪ Communication speed between processors.
o Uses multiple processors at the same time.
3. Evaluation Difference:
o Sequential algorithms are evaluated independently.
o Parallel algorithms must be evaluated with respect to the hardware they run on.
4. Parallel System = Algorithm + Architecture
o You must consider both the algorithm and the parallel computer system
(architecture) together.
5. Why Hardware Matters:
o The same algorithm may perform differently on different systems due to varying
hardware capabilities.
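To see why the extra variables matter, here is a toy cost model in Python (purely illustrative; the constants and the log-shaped communication term are assumptions, not from the slides). The same input size n yields different runtimes depending on the processor count and the communication cost:

```python
import math

def t_sequential(n, time_per_item=1e-6):
    """Toy model: sequential time depends only on the input size n."""
    return n * time_per_item

def t_parallel(n, p, time_per_item=1e-6, comm_per_step=1e-4):
    """Toy model: computation shrinks as n/p, but roughly log2(p)
    communication steps each add a fixed cost."""
    comm_steps = math.ceil(math.log2(p)) if p > 1 else 0
    return (n / p) * time_per_item + comm_steps * comm_per_step

# Same problem size, different machines: runtime now depends on p too.
for p in (1, 4, 16, 64):
    print(f"p={p:2d}: {t_parallel(1_000_000, p):.4f} s")
```

On a system with slower links (a larger comm_per_step), the same algorithm traces a different curve, which is why a parallel algorithm has to be evaluated together with its architecture.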


📖 Simplified Explanation:

When we try to measure how well a parallel program works, some basic methods are easy to
understand.

The simplest one is checking the wall clock time—just how long the program takes to run from
start to finish.

But this single number (time taken) isn’t enough when we want to:

• Run different sizes of problems,
• Or use more powerful machines with more processors.

Another way to measure performance is to compare the speed of the parallel version to the
serial (non-parallel) version. This shows how much faster we get by using parallelism.
But even this method has some issues. For example:

• What if the serial version is not well optimized, but the parallel version is?
• It might look like we got a big improvement just because the serial version was poor.

Because of such limitations, we need more advanced and reliable performance measures to
understand how a program will perform on bigger machines or with bigger problems.

📌 Key Points for Exam:

1. Wall Clock Time:
o Simple and intuitive measure.
o Shows total time to solve a problem on a given system.
o Not reliable for comparing across different problem sizes or systems.
2. Speedup (Parallel vs. Serial):
o Measures how much faster a parallel version is compared to a serial one.
o Helps show the benefit of using multiple processors.
3. Limitations of Simple Measures:
o Wall clock time doesn't scale well to different systems or problems.
o Speedup can be misleading if the serial version is not optimized.
4. Need for Complex Metrics:
o Advanced performance measures are needed for:
▪ Predicting how well a program runs on larger systems.
▪ Handling different problem sizes more accurately.
5. Takeaway:
o Relying only on simple metrics can give a false idea of performance.
o Performance should be analyzed using more detailed and scalable methods.
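To make points 1 and 2 concrete, here is a minimal sketch of measuring wall clock time and speedup with Python's multiprocessing module. The workload (summing squares) and the worker count are hypothetical choices for illustration:

```python
import time
from multiprocessing import Pool

def sum_squares(chunk):
    """Sum of squares over one chunk of the input range."""
    return sum(i * i for i in chunk)

if __name__ == "__main__":
    n, p = 2_000_000, 4

    # Wall clock time of the serial version.
    t0 = time.perf_counter()
    serial_result = sum_squares(range(n))
    t_serial = time.perf_counter() - t0

    # Wall clock time of the parallel version on p workers.
    chunks = [range(k, n, p) for k in range(p)]  # p interleaved chunks
    t0 = time.perf_counter()
    with Pool(p) as pool:
        parallel_result = sum(pool.map(sum_squares, chunks))
    t_parallel = time.perf_counter() - t0

    assert serial_result == parallel_result
    # Speedup = serial time / parallel time.
    print(f"T_serial={t_serial:.3f}s  T_parallel={t_parallel:.3f}s  "
          f"speedup={t_serial / t_parallel:.2f}x")
```

Note the caveat from point 3: if sum_squares were a sloppy serial baseline, the reported speedup would flatter the parallel version.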

📖 Simplified Explanation:

When we run a program using many processors (parallel programming), we expect it to run
faster. For example, if we double the number of processors, we might hope it runs twice as fast.
But in real life, it’s not that simple, because of different types of overheads (extra time or work
that reduces efficiency).

Here are the main sources of overhead shown in the figure:

1. Inter-process Communication:
o Processors need to talk to each other to share results or data.
o This takes time and is one of the biggest sources of overhead.
o (In the diagram: shown in light gray)
2. Idling:
o Sometimes, some processors are waiting (doing nothing) while others are still
working.
o This happens due to:
▪ Unbalanced work,
▪ Waiting for others (synchronization),
▪ Serial parts in the program.
o (In the diagram: shown in white)
3. Excess Computation:
o The best serial (single-processor) algorithm might be hard to parallelize.
o So we may use a simpler but slower algorithm that works with parallelism.
o This extra or inefficient work is called excess computation.
o (In the diagram: combined with the black part but represents more than what the
serial program would do)

📝 Key Points for Exams:

1. Parallel programs do not always scale linearly (e.g., 2x processors ≠ 2x speed) due to
overheads.
2. Sources of Overhead in Parallel Programs:
o Inter-process Communication: Time spent in sharing data between processors.
o Idling: Processors waiting because of load imbalance, synchronization, or serial
parts.
o Excess Computation: Using less-efficient algorithms just because they can be
parallelized.
3. Execution Time Components (as seen in the figure):
o Essential/Excess Computation: Work being done (black).
o Communication Overhead: Data exchange time (gray).
o Idle Time: Waiting time (white).
4. Understanding overhead is important for improving the performance of parallel
programs.
5. Load Imbalance: When some processors do more work than others, causing idling.
6. Performance Optimization: Means reducing communication, minimizing idle time,
and choosing better parallel algorithms.
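A standard way to quantify all three overheads at once, found in common parallel-computing texts (not in the slides above), is the total overhead function T_o = p × T_p − T_s: whatever processor-time the p processors spend beyond the serial time must be communication, idling, or excess computation. A small sketch with hypothetical timings:

```python
def total_overhead(p, t_parallel, t_serial):
    """Total overhead T_o = p * T_p - T_s: processor-time spent on
    communication, idling, and excess computation rather than useful work."""
    return p * t_parallel - t_serial

# Hypothetical timings: serial run takes 100 units; 8 processors take 16.
t_s, t_p, p = 100.0, 16.0, 8
t_o = total_overhead(p, t_p, t_s)  # 8 * 16 - 100 = 28 units of overhead
efficiency = t_s / (p * t_p)       # 100 / 128 ≈ 0.78: 78% of processor-time is useful
print(f"overhead = {t_o}, efficiency = {efficiency:.2f}")
```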


⭐ Main Point:

Parallel programs are not always faster just because more processors are used.
They face different kinds of overheads—extra time and work—that reduce performance.

To build efficient parallel programs, it's important to understand and reduce these overheads:

• Communication between processors,
• Idle time (when processors wait),
• And extra or inefficient work (excess computation).

Understanding these factors helps us design better parallel systems that make the best use of
available hardware.
Granularity in Parallel Computing Explained Simply

What is Granularity?

Granularity refers to how much work each processor handles. "Coarse granularity" means each
processor does larger chunks of work, while "fine granularity" means smaller chunks.

Key Points:

1. Practical Approach: Instead of giving each processor tiny bits of work, we often assign
larger chunks (increasing granularity).
2. Scaling Down: Sometimes we don't use all available processors - this is called "scaling
down." For example, if you have 100 tasks but use only 50 processors (instead of 100),
each processor handles 2 tasks.
3. Virtual Processing: If you have more inputs (n) than physical processors (p), you can
make each physical processor pretend to be multiple virtual processors. For example,
1,000 inputs with 100 processors means each processor acts as 10 virtual processors.
4. Performance Impact: When using this virtual mapping approach:
o The overall computation time increases by at most a factor of n/p (the number of virtual processors emulated by each physical processor).
o The total work (cost) doesn't increase.

5. Cost-Optimality Preservation: If your algorithm was cost-optimal with n processors, it remains cost-optimal when scaled down to p processors (where p < n).
6. Potential Drawback: However, if your algorithm wasn't cost-optimal to begin with,
changing the granularity might not fix this problem and could even make it worse.

Simple Example:

Imagine sorting 1,000 numbers:

• With 1,000 processors: Each handles 1 number (fine granularity)
• With 100 processors: Each handles 10 numbers (coarser granularity)
• With 10 processors: Each handles 100 numbers (very coarse granularity)

The coarser approach often works better in practice because it reduces overhead from
communication between processors, even though theoretically the finest granularity might seem
fastest.
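A minimal sketch of this scaling-down idea in Python: n = 1,000 tasks mapped onto p = 10 physical processors, each emulating n/p = 100 virtual processors by looping over its share. The per-task work (squaring a number) is a placeholder assumption:

```python
from multiprocessing import Pool

def process_share(share):
    """One physical processor emulates len(share) virtual processors,
    handling its tasks one after another."""
    return [task * task for task in share]  # placeholder per-task work

if __name__ == "__main__":
    n, p = 1_000, 10                          # 1,000 tasks, 10 physical processors
    tasks = list(range(n))
    shares = [tasks[i::p] for i in range(p)]  # n/p = 100 tasks per processor
    with Pool(p) as pool:
        results = pool.map(process_share, shares)
    print(sum(len(r) for r in results))       # all 1,000 tasks processed
```

Each processor now receives one message of 100 tasks instead of 100 messages of one task each, which is exactly the communication saving that coarser granularity buys.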
In simple words, "within a constant factor" means the ratio between two quantities stays bounded by some fixed number, no matter how big the problem gets.

For our example where:

• Sequential time = 10,000 time units
• Cost = 20,000 time units

The ratio is 20,000 ÷ 10,000 = 2

This means the cost is 2 times the sequential time. Being "within a constant factor" means:

1. The cost isn't growing much faster than the sequential time
2. The ratio (2 in this case) doesn't increase as the problem size grows
3. It's at most some fixed multiplier times the sequential time

For example, if we doubled our problem size:

• Sequential time might become 20,000 time units
• Cost would become 40,000 time units
• The ratio stays at 2 (it's constant)

Being "within a constant factor" is important because it means your parallel solution isn't doing
dramatically more total work than necessary. If your parallel algorithm had a cost that was, say,
100 times or 1000 times the sequential time, it would be wasting computational resources.

In practical terms: you're not paying a huge penalty for going parallel; you're just using about
twice as many resources total to get the speed advantage.
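A tiny sketch that checks this in code, reusing the hypothetical numbers above and assuming the cost keeps growing in proportion to the sequential time:

```python
# Hypothetical model: cost stays at 2x the sequential time at every size.
for scale in (1, 2, 4, 8):
    t_seq = 10_000 * scale   # sequential time grows with problem size
    cost = 20_000 * scale    # cost (p * T_p) grows at the same rate
    print(f"scale {scale}: ratio = {cost / t_seq}")  # always 2.0
```

If the ratio drifted upward as the scale grew, the cost would no longer be within a constant factor of the sequential time, and the algorithm would not be cost-optimal.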
