All Numerical Unit-1

The document discusses the adaptation of programs for multiprocessor systems, detailing how execution time is affected by both computing and overhead times. It provides calculations for per-processor execution times and speedup ratios for various numbers of processors, illustrating the impact of overhead on performance. Additionally, Amdahl's law is introduced to explain the theoretical limits of speedup in parallel computing based on the proportion of the task that can be parallelized.


When a program is adapted to run on multiple processors in a multiprocessor system, the execution time on each processor is composed of computing time and the overhead time required for locked critical sections and/or to send data from one processor to another. Assume a program requires t = 100 s of execution time on one processor. When run on p processors, each processor requires t/p s of computing time, as well as an additional 4 s of overhead, irrespective of the number of processors.

Compute the per-processor execution time for 2, 4, 8, 16, 32, 64, and 128 processors. For each case, list the corresponding speedup relative to a single processor and the ratio between actual speedup and ideal speedup (the speedup if there were no overhead).

The per-processor execution time is t/p + 4 seconds in each case, since every processor incurs the same 4 s of overhead. The speedup and the ratio between actual and ideal speedup follow directly, as worked out below.

Explanation:

When adapting a program to run on multiple processors in a multiprocessor system, the execution time on each processor is divided into computing time and overhead time. In this scenario, the program requires 100 seconds (t) of execution time on a single processor.

The execution time on p processors is (t/p) + 4 seconds: each processor takes t/p seconds for computation plus an additional 4 seconds of overhead, which remains constant regardless of the number of processors.

Now, let's calculate the per-processor execution time for different cases:

 For 2 processors: (100/2) + 4 = 54 seconds
 For 4 processors: (100/4) + 4 = 29 seconds
 For 8 processors: (100/8) + 4 = 16.5 seconds
 For 16 processors: (100/16) + 4 = 10.25 seconds
 For 32 processors: (100/32) + 4 = 7.125 seconds
 For 64 processors: (100/64) + 4 = 5.5625 seconds
 For 128 processors: (100/128) + 4 = 4.78125 seconds

Next, let's calculate the speedup for each case relative to a single processor:

 Speedup = Execution time on a single processor / Execution time on p processors
 For 2 processors: 100 / 54 ≈ 1.85x
 For 4 processors: 100 / 29 ≈ 3.45x
 For 8 processors: 100 / 16.5 ≈ 6.06x
 For 16 processors: 100 / 10.25 ≈ 9.76x
 For 32 processors: 100 / 7.125 ≈ 14.04x
 For 64 processors: 100 / 5.5625 ≈ 17.98x
 For 128 processors: 100 / 4.78125 ≈ 20.92x

Finally, we can calculate the ratio between actual speedup and ideal speedup:

 Ideal speedup equals the number of processors used (2, 4, 8, 16, 32, 64, 128) in each case. Comparing the actual speedup to the ideal speedup shows the impact of overhead.
 The ratio is the actual speedup divided by the ideal speedup: approximately 0.93, 0.86, 0.76, 0.61, 0.44, 0.28, and 0.16 for 2, 4, 8, 16, 32, 64, and 128 processors respectively, so the fixed overhead wastes an increasing share of the machine as processors are added. The short script below reproduces these numbers.
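A minimal Python sketch of the calculation above (the 100 s workload and 4 s overhead come from the problem statement; the variable names are purely illustrative):

```python
# Per-processor execution time, speedup, and actual/ideal speedup ratio
# for a 100 s program with a fixed 4 s per-processor overhead.
t = 100.0       # single-processor execution time, in seconds
overhead = 4.0  # fixed overhead per processor, in seconds

for p in [2, 4, 8, 16, 32, 64, 128]:
    time_p = t / p + overhead   # per-processor execution time
    speedup = t / time_p        # actual speedup vs. one processor
    ratio = speedup / p         # actual speedup / ideal speedup
    print(f"p={p:4d}  time={time_p:9.5f} s  speedup={speedup:6.2f}x  ratio={ratio:5.3f}")
```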
CS301: Amdahl's Law (Saylor Academy)

This article introduces Amdahl's law, which can be used to quantify how specific improvements affect overall performance. A general form of Amdahl's law is: (the part of the time that is improved / the improvement factor) plus the part of the time that is not improved equals the new improved time. The speedup factor is the old time divided by the new time. Be sure to study the examples.

In computer architecture, Amdahl's law (or Amdahl's argument) is a formula which gives the theoretical speedup in latency of the execution
of a task at fixed workload that can be expected of a system whose resources are improved. It is named after computer scientist Gene
Amdahl, and was presented at the AFIPS Spring Joint Computer Conference in 1967.

Amdahl's law is often used in parallel computing to predict the theoretical speedup when using multiple processors. For example, if a program needs 20 hours to complete using a single thread, but a one-hour portion of the program cannot be parallelized, therefore only the remaining 19 hours (p = 0.95) of execution time can be parallelized, then regardless of how many threads are devoted to a parallelized execution of this program, the minimum execution time cannot be less than one hour. Hence, the theoretical speedup is limited to at most 20 times the single-thread performance (1/(1 − p) = 20).

[Figure: Amdahl's Law — speedup versus number of processors (1 to 65,536) for parallel portions of 50%, 75%, 90%, and 95%.]

The theoretical speedup of the latency of the execution of a program as a function of the number of processors executing it, according to
Amdahl's law. The speedup is limited by the serial part of the program. For example, if 95% of the program can be parallelized, the
theoretical maximum speedup using parallel computing would be 20 times.

Definition
Amdahl's law can be formulated in the following way:
$$S_\text{latency}(s) = \frac{1}{(1 - p) + \frac{p}{s}}$$

where

S_latency is the theoretical speedup of the execution of the whole task;
s is the speedup of the part of the task that benefits from improved system resources;
p is the proportion of execution time that the part benefiting from improved resources originally occupied.

Furthermore,
$$S_\text{latency}(s) \leq \frac{1}{1 - p}, \qquad \lim_{s \to \infty} S_\text{latency}(s) = \frac{1}{1 - p}$$

shows that the theoretical speedup of the execution of the whole task increases with the improvement of the resources of the system and
that regardless of the magnitude of the improvement, the theoretical speedup is always limited by the part of the task that cannot benefit
from the improvement.

Amdahl's law applies only to the cases where the problem size is fixed. In practice, as more computing resources become available, they
tend to get used on larger problems (larger datasets), and the time spent in the parallelizable part often grows much faster than the inherently
serial work. In this case, Gustafson's law gives a less pessimistic and more realistic assessment of the parallel performance.
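As a quick illustration of the definition above, here is a small Python sketch (not part of the original article) that evaluates S_latency for a given p and s:

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Theoretical speedup of the whole task when a fraction p of the
    execution time is accelerated by a factor s (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / s)

# 95% parallelizable with a huge number of processors -> capped at 20x
print(amdahl_speedup(0.95, 1e12))  # ~20.0
# 95% parallelizable on 8 processors
print(amdahl_speedup(0.95, 8))     # ~5.9
```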

Derivation
A task executed by a system whose resources are improved compared to an initial similar system can be split up into two parts:

a part that does not benefit from the improvement of the resources of the system;
a part that benefits from the improvement of the resources of the system.

An example is a computer program that processes files from disk. A part of that program may scan the directory of the disk and create a list
of files internally in memory. After that, another part of the program passes each file to a separate thread for processing. The part that scans
the directory and creates the file list cannot be sped up on a parallel computer, but the part that processes the files can.

The execution time of the whole task before the improvement of the resources of the system is denoted as T . It includes the execution time
of the part that would not benefit from the improvement of the resources and the execution time of the one that would benefit from it. The
fraction of the execution time of the task that would benefit from the improvement of the resources is denoted by p. The one concerning the
part that would not benefit from it is therefore 1 − p. Then:

T = (1 − p)T + pT

It is the execution of the part that benefits from the improvement of the resources that is accelerated by the factor s after the improvement of
the resources. Consequently, the execution time of the part that does not benefit from it remains the same, while the part that benefits from it
becomes:
$$\frac{p}{s}T$$

The theoretical execution time T(s) of the whole task after the improvement of the resources is then:

$$T(s) = (1 - p)T + \frac{p}{s}T$$

Amdahl's law gives the theoretical speedup in latency of the execution of the whole task at fixed workload W, which yields

$$S_\text{latency}(s) = \frac{TW}{T(s)W} = \frac{T}{T(s)} = \frac{1}{(1 - p) + \frac{p}{s}}$$

Parallel programs
If 30% of the execution time may be the subject of a speedup, p will be 0.3; if the improvement makes the affected part twice as fast, s will
be 2. Amdahl's law states that the overall speedup of applying the improvement will be:
$$S_\text{latency} = \frac{1}{(1 - p) + \frac{p}{s}} = \frac{1}{1 - 0.3 + \frac{0.3}{2}} \approx 1.18$$

For example, assume that we are given a serial task which is split into four consecutive parts, whose percentages of execution time are p1 =
0.11, p2 = 0.18, p3 = 0.23, and p4 = 0.48 respectively. Then we are told that the 1st part is not sped up, so s1 = 1, while the 2nd part is sped up
5 times, so s2 = 5, the 3rd part is sped up 20 times, so s3 = 20, and the 4th part is sped up 1.6 times, so s4 = 1.6. By using Amdahl's law, the
overall speedup is
$$S_\text{latency} = \frac{1}{\frac{p_1}{s_1} + \frac{p_2}{s_2} + \frac{p_3}{s_3} + \frac{p_4}{s_4}} = \frac{1}{\frac{0.11}{1} + \frac{0.18}{5} + \frac{0.23}{20} + \frac{0.48}{1.6}} \approx 2.19$$

Notice how the 5 times and 20 times speedup on the 2nd and 3rd parts respectively don't have much effect on the overall speedup when the
4th part (48% of the execution time) is accelerated by only 1.6 times.
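The same arithmetic can be checked with a short Python sketch (illustrative only) that generalizes the formula to several parts, each with its own fraction of the original time and its own speedup:

```python
def overall_speedup(fractions, speedups):
    # Amdahl's law for a task split into parts: part i takes fractions[i]
    # of the original time and is accelerated by speedups[i].
    return 1.0 / sum(p / s for p, s in zip(fractions, speedups))

# The four consecutive parts from the example above.
print(overall_speedup([0.11, 0.18, 0.23, 0.48], [1, 5, 20, 1.6]))  # ~2.19
```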

Serial programs

[Figure: a task with two independent parts, A and B — the original process, the effect of making B 5x faster, and the effect of making A 2x faster.]

Assume that a task has two independent parts, A and B. Part B takes roughly 25% of the time of the whole computation. By working very
hard, one may be able to make this part 5 times faster, but this reduces the time of the whole computation only slightly. In contrast, one may
need to perform less work to make part A perform twice as fast. This will make the computation much faster than by optimizing part B, even
though part B's speedup is greater in terms of the ratio, (5 times versus 2 times).

For example, with a serial program in two parts A and B for which TA = 3 s and TB = 1 s,

if part B is made to run 5 times faster, that is s = 5 and p = TB/(TA + TB) = 0.25, then
$$S_\text{latency} = \frac{1}{1 - 0.25 + \frac{0.25}{5}} = 1.25$$

if part A is made to run 2 times faster, that is s = 2 and p = TA/(TA + TB) = 0.75, then
$$S_\text{latency} = \frac{1}{1 - 0.75 + \frac{0.75}{2}} = 1.60$$

Therefore, making part A run 2 times faster is better than making part B run 5 times faster. The percentage improvement in speed can be calculated as

$$\text{percentage improvement} = 100 \left(1 - \frac{1}{S_\text{latency}}\right)$$

Improving part A by a factor of 2 will increase overall program speed by a factor of 1.60, which makes it 37.5% faster than the original
computation.
However, improving part B by a factor of 5, which presumably requires more effort, will achieve an overall speedup factor of 1.25 only,
which makes it 20% faster.
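A brief Python check of the A/B comparison above (illustrative; the times TA = 3 s and TB = 1 s are taken from the example):

```python
def speedup(p, s):
    # Amdahl's law: a fraction p of the time is made s times faster.
    return 1.0 / ((1.0 - p) + p / s)

ta, tb = 3.0, 1.0                 # serial times of parts A and B, in seconds
s_b = speedup(tb / (ta + tb), 5)  # make B 5x faster -> 1.25
s_a = speedup(ta / (ta + tb), 2)  # make A 2x faster -> 1.60
for name, s in [("B made 5x faster", s_b), ("A made 2x faster", s_a)]:
    print(f"{name}: overall speedup {s:.2f}, {100 * (1 - 1 / s):.1f}% faster")
```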

Optimizing the sequential part of parallel programs


If the non-parallelizable part is optimized by a factor of O, then

$$T(O, s) = \frac{1 - p}{O}T + \frac{p}{s}T$$

It follows from Amdahl's law that the speedup due to parallelism is given by

$$S_\text{latency}(O, s) = \frac{T(O)}{T(O, s)} = \frac{\frac{1 - p}{O} + p}{\frac{1 - p}{O} + \frac{p}{s}}$$

When s = 1, we have S_latency(O, s) = 1, meaning that the speedup is measured with respect to the execution time after the non-parallelizable part is optimized.

When s = ∞,

$$S_\text{latency}(O, \infty) = \frac{T(O)}{T(O, \infty)} = \frac{\frac{1 - p}{O} + p}{\frac{1 - p}{O}} = 1 + \frac{O p}{1 - p}$$

If 1 − p = 0.4, O = 2, and s = 5, then:

$$S_\text{latency}(O, s) = \frac{T(O)}{T(O, s)} = \frac{\frac{0.4}{2} + 0.6}{\frac{0.4}{2} + \frac{0.6}{5}} = 2.5$$
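A small Python sketch (illustrative) of the formula above, combining optimization of the serial part by a factor O with parallelization of the rest by a factor s:

```python
def speedup_with_serial_optimization(p, o, s):
    # Serial fraction (1 - p) is optimized by factor o; the parallel
    # fraction p is accelerated by factor s. Speedup is measured relative
    # to the time after the serial part alone has been optimized.
    t_serial_optimized = (1 - p) / o + p
    t_both = (1 - p) / o + p / s
    return t_serial_optimized / t_both

print(speedup_with_serial_optimization(p=0.6, o=2, s=5))  # 2.5, as in the example
```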

Transforming sequential parts of parallel programs into parallelizable


Next, we consider the case wherein the non-parallelizable part is reduced by a factor of O′, and the parallelizable part is correspondingly increased. Then

$$T'(O', s) = \frac{1 - p}{O'}T + \left(1 - \frac{1 - p}{O'}\right)\frac{T}{s}$$

It follows from Amdahl's law that the speedup due to parallelism is given by

$$S'_\text{latency}(O', s) = \frac{T'(O')}{T'(O', s)} = \frac{1}{\frac{1 - p}{O'} + \left(1 - \frac{1 - p}{O'}\right)\frac{1}{s}}$$

The derivation above is in agreement with Jakob Jenkov's analysis of the execution time vs. speedup tradeoff.

Relation to the law of diminishing returns


Amdahl's law is often conflated with the law of diminishing returns, whereas only a special case of applying Amdahl's law demonstrates the law of diminishing returns. If one picks optimally (in terms of the achieved speedup) what to improve, then one will see monotonically decreasing
improvements as one improves. If, however, one picks non-optimally, after improving a sub-optimal component and moving on to improve a
more optimal component, one can see an increase in the return. Note that it is often rational to improve a system in an order that is "non-
optimal" in this sense, given that some improvements are more difficult or require larger development time than others.

Amdahl's law does represent the law of diminishing returns if one is considering what sort of return one gets by adding more processors to a
machine, if one is running a fixed-size computation that will use all available processors to their capacity. Each new processor added to the
system will add less usable power than the previous one. Each time one doubles the number of processors the speedup ratio will diminish, as
the total throughput heads toward the limit of 1/(1 − p).

This analysis neglects other potential bottlenecks such as memory bandwidth and I/O bandwidth. If these resources do not scale with the
number of processors, then merely adding processors provides even lower returns.

An implication of Amdahl's law is that to speed up real applications which have both serial and parallel portions, heterogeneous
computing techniques are required. For example, a CPU-GPU heterogeneous processor may provide higher performance and energy
efficiency than a CPU-only or GPU-only processor.

Source: Wikipedia, https://en.wikipedia.org/wiki/Amdahl%27s_law


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

Amdahl’s Law
No text on concurrency is complete without mentioning Amdahl’s Law. Blindly adding
threads to speed up program execution may not always be a good idea. The law
specifies the cap on the maximum speedup that can be achieved when parallelizing the
execution of a program.

Amdahl’s Law Graph


Context

If you have a poultry farm where a hundred hens lay eggs each day, then no matter how many people you hire to process the laid eggs, you still need to wait an entire day for the 100 eggs to be laid. Increasing the number of workers on the farm can't shorten the time it takes for a hen to lay an egg. Similarly, software programs consist of parts that can't be sped up even if the number of processors is increased. These parts of the program must execute serially and aren't amenable to parallelism.

The Law Itself

Amdahl's law describes the theoretical speedup a program can achieve at best by using additional computing resources. For a program whose parallelizable portion is P, run on n processors, the speedup is

$$S(n) = \frac{1}{(1 - P) + \frac{P}{n}}$$
Example

Let's say our program has a parallelizable portion of P = 90% = 0.9. Now let's see how the speed-up occurs as we increase the number of processes.

[Figure: Example of Amdahl's Law — speed-up as the number of processes grows for P = 0.9.]

The speed-up steadily increases as we increase the number of processors or threads. However, as you can see, the theoretical maximum speed-up for our program with 10% serial execution will be 10. We can't speed up our program execution more than 10 times compared to when we run the same program on a single CPU or thread. To achieve greater speed-ups than 10 we must optimize or parallelize the serially executed portion of the code. A short script illustrating this growth follows.
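A small Python sketch (illustrative) showing how the speed-up approaches the 1/(1 − P) = 10 ceiling as processors are added:

```python
P = 0.9  # parallelizable portion of the program

for n in [1, 2, 4, 8, 16, 64, 256, 1024]:
    s = 1.0 / ((1.0 - P) + P / n)
    print(f"{n:5d} processors -> speed-up {s:5.2f}")
# The printed values approach, but never exceed, 1 / (1 - P) = 10.
```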

Example:
https://gateoverflow.in/422797/gate-cse-2024-set-1-question-45?show=445501#a445501
Amdahl's law (Wikipedia)
In computer architecture, Amdahl's law (or Amdahl's argument[1]) is a formula that shows how much faster a task can be completed when more resources are added to the system.

The law can be stated as:

"the overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used".[2]

It is named after computer scientist Gene Amdahl, and was presented at the American Federation of Information Processing Societies (AFIPS) Spring Joint Computer Conference in 1967.

[Figure: Amdahl's Law demonstrates the theoretical maximum speedup of an overall system and the concept of diminishing returns. Plotted here is logarithmic parallelization vs linear speedup. If exactly 50% of the work can be parallelized, the best possible speedup is 2 times. If 95% of the work can be parallelized, the best possible speedup is 20 times. According to the law, even with an infinite number of processors, the speedup is constrained by the unparallelizable portion.]

Amdahl's law is often used in parallel computing to predict the theoretical speedup when using multiple processors.

Definition
In the context of Amdahl's law, speedup can be defined as the ratio of the execution time before an improvement to the execution time after it, or equivalently as the ratio of the performance after the improvement to the performance before it. [3]

Amdahl's law can be formulated in the following way: [4]

$$S_\text{latency}(s) = \frac{1}{(1 - p) + \frac{p}{s}}$$

where

S_latency represents the total speedup of a program;
p represents the proportion of execution time spent on the code where parallelism is used;
s represents the extent of the improvement.

The overall speedup is frequently much lower than one might expect. For instance, if a programmer enhances a part of the code that represents 10% of the total execution time (i.e. p of 0.10) and achieves a s of 10,000, then S_latency becomes 1.11, which means only an 11% improvement in the total speedup of the program. So, despite a massive improvement in one section, the overall benefit is quite small. In another example, if the programmer optimizes a section that accounts for 99% of the execution time (i.e. p of 0.99) with a speedup factor of 100 (i.e. s of 100), S_latency only reaches 50. This indicates that half of the potential performance gain (S_latency would reach 100 if 100% of the execution time were covered) is lost due to the remaining 1% of execution time that was not improved. [4]
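These two worked numbers can be verified with a quick Python check (illustrative):

```python
def s_latency(p, s):
    # Amdahl's law: proportion p of the execution time is improved by factor s.
    return 1.0 / ((1.0 - p) + p / s)

print(s_latency(0.10, 10_000))  # ~1.11
print(s_latency(0.99, 100))     # ~50.3, i.e. roughly 50
```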

Implications
The following are implications of Amdahl's law: [5][6]

Diminishing Returns: Adding more processors gives diminishing returns. Beyond a certain
point, adding more processors doesn't significantly increase speedup.
Limited Speedup: Even with many processors, there's a limit to how much faster a task can
be completed due to parts of the task that cannot be parallelized.

Limitations
The following are limitations of Amdahl's law: [7][3][8]

Assumption of Fixed Workload: Amdahl's Law assumes that the workload remains constant.
It doesn't account for dynamic or increasing workloads, which can impact the effectiveness of
parallel processing.
Overhead Ignored: Amdahl's Law neglects overheads associated with concurrency, including
coordination, synchronization, inter-process communication, and concurrency control. Notably,
merging data from multiple threads or processes incurs significant overhead due to conflict
resolution, data consistency, versioning, and synchronization. [9]
Neglecting extrinsic factors: Amdahl's Law addresses computational parallelism, neglecting
extrinsic factors such as data persistence, I/O operations, and memory access overheads, and
assumes idealized conditions.
Scalability Issues: While it highlights the limits of parallel speedup, it doesn't address practical
scalability issues, such as the cost and complexity of adding more processors.
Non-Parallelizable Work: Amdahl's Law emphasizes the non-parallelizable portion of the task
as a bottleneck but doesn’t provide solutions for reducing or optimizing this portion.
Assumes Homogeneous Processors: It assumes that all processors are identical and
contribute equally to speedup, which may not be the case in heterogeneous computing
environments.

Amdahl's law applies only to the cases where the problem size is fixed. In practice, as more
computing resources become available, they tend to get used on larger problems (larger datasets),
and the time spent in the parallelizable part often grows much faster than the inherently serial
work. In this case, Gustafson's law gives a less pessimistic and more realistic assessment of the
parallel performance.[10]

Universal Scalability Law (USL), developed by Neil J. Gunther, extends the Amdahl's law and
accounts for the additional overhead due to inter-process communication. USL quantifies
scalability based on parameters such as contention and coherency. [11]


An implication of Amdahl's law is that to speed up real applications which have both serial and
parallel portions, heterogeneous computing techniques are required.[12] There are novel speedup
and energy consumption models based on a more general representation of heterogeneity, referred
to as the normal form heterogeneity, that support a wide range of heterogeneous many-core
architectures. These modelling methods aim to predict system power efficiency and performance
ranges, and facilitate research and development at the hardware and system software levels.[13][14]

See also
Gustafson's law
Universal Law of Computational Scalability
Analysis of parallel algorithms
Critical path method
Moore's law

References

1. Rodgers, David P. (June 1985). "Improvements in multiprocessor system design". ACM SIGARCH Computer Architecture News. 13 (3). New York, NY, USA: ACM: 225–231 [p. 226]. doi:10.1145/327070.327215. ISBN 0-8186-0634-
The Pentium 4 Prescott processor, released in 2004, had a clock rate of 3.6 GHz
and voltage of 1.25 V. Assume that, on average, it consumed 10 W of static
power and 90 W of dynamic power. The Core i5 Ivy Bridge, released in 2012, had
a clock rate of 3.4 GHz and voltage of 0.9 V. Assume that, on average, it
consumed 30 W of static power and 40 W of dynamic power.

1.8.1 For each processor find the average capacitive loads.

1.8.2 Find the percentage of the total dissipated power comprised by static
power and the ratio of static power to dynamic power for each technology.

1.8.3 If the total dissipated power is to be reduced by 10%, how much should
the voltage be reduced to maintain the same leakage current? Note: power is
defined as the product of voltage and current

Short Answer

1.8.1) The average capacitive loads are 3.2 × 10⁻⁸ F for the Pentium 4 Prescott and 2.904 × 10⁻⁸ F for the Core i5 Ivy Bridge.

1.8.2) Static power is 10% of the total dissipated power for the Pentium 4 (static/dynamic ratio ≈ 0.11) and about 42.9% of the total for the Core i5 (static/dynamic ratio = 0.75).

1.8.3) The voltage should be reduced to about 1.18 V for the Pentium 4 (a 5.9% reduction from 1.25 V) and to about 0.84 V for the Core i5 (roughly a 7% reduction from 0.9 V).


Step by step solution

01 Determine the formulas

Capacitive load = Dynamic power / (0.5 × V² × Clock rate) ..... (1)

Ratio of static to dynamic power = Static power / Dynamic power ..... (2)

Percentage reduction in voltage = (New voltage − Old voltage) / New voltage × 100 ..... (3)

Total power = Static power + Dynamic power ..... (4)

02 Determine the average capacitive load

The average capacitive load of each processor is given by formula (1):

Capacitive load = Dynamic power / (0.5 × V² × Clock rate)

For the Pentium 4 Prescott processor:

C = 90 / (0.5 × 1.25² × 3.6 × 10⁹) = 3.2 × 10⁻⁸ F (since 1 GHz = 10⁹ Hz)

For the Core i5:

C = 40 / (0.5 × 0.9² × 3.4 × 10⁹) = 2.904 × 10⁻⁸ F

So the capacitive load of the Core i5 processor is lower than that of the Pentium 4 Prescott processor.
03 Find the percentage of the total dissipated power

1.8.2)

For the Pentium 4 Prescott processor: static power is 10 / (10 + 90) = 10% of the total dissipated power, and the static-to-dynamic ratio is 10 / 90 ≈ 0.11.

For the Core i5: static power is 30 / (30 + 40) ≈ 42.9% of the total dissipated power, and the static-to-dynamic ratio is 30 / 40 = 0.75.
04 Find the new voltage after the total dissipated power is reduced by 10%

1.8.3)

We know that Total power = Static power + Dynamic power, and that the leakage current is due to static power. Since P = IV, the leakage current is I = Static power / V.

Because the total power dissipated is diminished by 10%,

P₂ = (1 − 0.1)P₁ = 0.9 P₁

where P₁ is the total power dissipated before the 10% reduction and P₂ is the new power dissipated after the 10% reduction. The new total power is the new static power plus the new dynamic power:

P₂ = I₂V₂ + 0.5 × C × V₂² × f = 0.9 P₁

For the Pentium 4 Prescott processor, the leakage current is

I₂ = Static power / Voltage = 10 W / 1.25 V = 8 A

and, because the leakage current is fixed, the new voltage V₂ must satisfy

8 × V₂ + 0.5 × 3.2 × 10⁻⁸ × V₂² × 3.6 × 10⁹ = 0.9 × 100

that is, the quadratic equation 57.6V₂² + 8V₂ − 90 = 0. By the quadratic formula, V₂ = 1.18 V or −1.32 V; choosing the positive root, the new voltage is 1.18 V.

The percentage reduction is (new voltage − old voltage) / new voltage × 100% = (1.18 − 1.25) / 1.18 × 100% ≈ −5.9%, which is a 5.9% reduction from 1.25 V.

For the Core i5, the leakage current is

I₂ = Static power / Voltage = 30 W / 0.9 V = 33.33 A

and, because the leakage current is fixed, the new voltage V₂ must satisfy

33.33 × V₂ + 0.5 × 2.904 × 10⁻⁸ × V₂² × 3.4 × 10⁹ = 0.9 × 70

that is, the quadratic equation 49.37V₂² + 33.33V₂ − 63 = 0. By the quadratic formula, V₂ ≈ 0.84 V or −1.52 V; choosing the positive root, the new voltage is about 0.84 V.

The percentage reduction is (0.84 − 0.9) / 0.84 × 100% ≈ −7%, which is roughly a 7% reduction from 0.9 V. A short script that reproduces these results follows.
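A brief Python sketch (illustrative, not part of the original solution) that reproduces these calculations by solving the quadratic for the new voltage:

```python
import math

def new_voltage(static_w, dynamic_w, v_old, freq_hz, reduction=0.10):
    """New supply voltage that cuts total power by `reduction` while the
    leakage current (static power / voltage) stays fixed."""
    c = dynamic_w / (0.5 * v_old**2 * freq_hz)   # capacitive load (1.8.1)
    i_leak = static_w / v_old                    # leakage current, held fixed
    p_new = (1 - reduction) * (static_w + dynamic_w)
    # Solve 0.5*c*f*V^2 + i_leak*V - p_new = 0 for V, taking the positive root.
    a, b = 0.5 * c * freq_hz, i_leak
    return c, (-b + math.sqrt(b * b + 4 * a * p_new)) / (2 * a)

for name, (st, dyn, v, f) in {
    "Pentium 4 Prescott": (10, 90, 1.25, 3.6e9),
    "Core i5 Ivy Bridge": (30, 40, 0.9, 3.4e9),
}.items():
    c, v_new = new_voltage(st, dyn, v, f)
    print(f"{name}: C = {c:.3e} F, new voltage = {v_new:.2f} V (was {v} V)")
```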


1.9 Assume that, for arithmetic, load/store, and branch instructions, a processor has CPIs of 1, 12, and 5, respectively. Also assume that on a single processor a program requires the execution of 2.56E9 arithmetic instructions, 1.28E9 load/store instructions, and 256 million branch instructions. Assume that each processor has a 2 GHz clock frequency. Assume that, as the program is parallelized to run over multiple cores, the number of arithmetic and load/store instructions per processor is divided by 0.7 × p (where p is the number of processors) but the number of branch instructions per processor remains the same.

1.9.1 [5] <§1.7> Find the total execution time for this program on 1, 2, 4, and 8 processors, and show the relative speedup of the 2, 4, and 8 processor results relative to the single-processor result.

1.9.2 [10] <§§1.6, 1.8> If the CPI of the arithmetic instructions was doubled, what would the impact be on the execution time of the program on 1, 2, 4, or 8 processors?

1.9.3 [10] <§§1.6, 1.8> To what should the CPI of load/store instructions be reduced in order for a single processor to match the performance of four processors using the original CPI values?
Answer

1.9.1 Execution Time and Speedup

The execution time of the program is:

Execution Time = (Σ Instruction count_i × CPI_i) / Clock Frequency

and the speedup is:

Speedup = Execution Time (single processor) / Execution Time (p processors)

On a single processor the program takes (2.56E9 × 1 + 1.28E9 × 12 + 0.256E9 × 5) / 2E9 = 19.2E9 / 2E9 = 9.6 s. With p processors, the arithmetic and load/store counts are each divided by 0.7 × p while the branch count stays the same, giving:

Number of Processors (p) | Execution Time (seconds) | Speedup
1 | 9.6 | 1.00
2 | 7.04 | 1.36
4 | 3.84 | 2.50
8 | 2.24 | 4.29

1.9.2 Impact of Doubling the CPI of Arithmetic Instructions

If the CPI of the arithmetic instructions is doubled, the execution time increases, because the cycles spent on those instructions are directly proportional to their CPI: the higher the CPI, the longer it takes to execute each instruction. The execution times become about 10.88 s, 7.95 s, 4.30 s, and 2.47 s for 1, 2, 4, and 8 processors respectively, an increase of roughly 10-13% in each case.

1.9.3 Reducing the CPI of Load/Store Instructions

To match the 3.84 s execution time of four processors, a single processor may spend at most 3.84 s × 2E9 = 7.68E9 cycles. Arithmetic and branch instructions account for 2.56E9 × 1 + 0.256E9 × 5 = 3.84E9 of those cycles, so:

CPI (Load/Store) = (Execution Time (4 processors) × Clock Frequency − arithmetic cycles − branch cycles) / Number of Load/Store Instructions = 3.84E9 / 1.28E9 = 3

The load/store CPI would therefore have to be reduced from 12 to 3. The short script below reproduces these numbers.
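A short Python sketch (illustrative) that reproduces 1.9.1-1.9.3 under the stated assumptions:

```python
ARITH, LS, BRANCH = 2.56e9, 1.28e9, 0.256e9  # instruction counts on one processor
FREQ = 2e9                                   # 2 GHz clock

def exec_time(p, cpi_arith=1, cpi_ls=12, cpi_branch=5):
    # Arithmetic and load/store counts are divided by 0.7*p for p > 1;
    # the branch count per processor stays the same.
    scale = 1.0 if p == 1 else 1.0 / (0.7 * p)
    cycles = ARITH * scale * cpi_arith + LS * scale * cpi_ls + BRANCH * cpi_branch
    return cycles / FREQ

t1 = exec_time(1)
for p in [1, 2, 4, 8]:
    print(f"1.9.1  p={p}: {exec_time(p):.2f} s, speedup {t1 / exec_time(p):.2f}")
    print(f"1.9.2  p={p}, arithmetic CPI doubled: {exec_time(p, cpi_arith=2):.2f} s")

# 1.9.3: load/store CPI that lets one processor match four processors.
target_cycles = exec_time(4) * FREQ
cpi_ls_needed = (target_cycles - ARITH * 1 - BRANCH * 5) / LS
print(f"1.9.3  required load/store CPI: {cpi_ls_needed:.1f}")
```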
