Computer Architecture
and Performance
Objectives:
 Understand the concepts of computer
architecture
 Understand how performance is measured
 Know the different ways to measure
computer performance
Computer Architecture
 The task the computer designer faces is a
complex one:
Determine what attributes are important
for a new computer, then design a
computer to maximize performance while
staying within cost, power, and
availability constraints.
Computer Architecture (cont'd)
 In the past, the term computer
architecture often referred only to
instruction set design.
 Other aspects of computer design were
called implementation.
Instruction Set Architecture
 Instruction set architecture (ISA) refers to
the actual programmer-visible instruction
set
 Ex. the LMC instruction set
 The ISA serves as the boundary between
the software and hardware.
Implementation
 The implementation of a computer has
two components: organization and
hardware.
 The term organization includes the high-level aspects of a computer's design, such
as the memory system, the memory
interconnect, and the design of the
internal processor or CPU.
 Hardware refers to the specifics of a
computer, including the detailed logic
design and the packaging technology of
Goal
Computer architects must
design a computer to meet
functional requirements.
"Time discovers truth."
Seneca
Performance
 In general, performance describes how
quickly a given system can execute a
program or programs.
 Systems that execute programs in less
time are said to have higher performance
Response Time/Execution Time
 The time between the start and
completion of a task
 To maximize performance, we want to
minimize response time or execution time
for some task.
Response Time/Execution Time
 Thus we can relate performance and
execution time for a computer X:
\[ \text{Performance}_X = \frac{1}{\text{Execution Time}_X} \]
Response Time/Execution Time
 This means that for two computers X and
Y, if the performance of X is greater than
the performance of Y, we have
\[ \text{Performance}_X > \text{Performance}_Y \]
\[ \frac{1}{\text{Execution Time}_X} > \frac{1}{\text{Execution Time}_Y} \]
\[ \text{Execution Time}_Y > \text{Execution Time}_X \]
Response Time/Execution Time
 In discussing a computer design, we often
want to relate the performance of two
different computers quantitatively. We will
use the phrase "X is n times faster than
Y" (or, equivalently, "X is n times as fast as
Y") to mean
\[ \frac{\text{Performance}_X}{\text{Performance}_Y} = n \]
Response Time/Execution Time
 If X is n times faster than Y, then the
execution time on Y is n times longer than
it is on X:
\[ \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution Time}_Y}{\text{Execution Time}_X} = n \]
Relative Performance
 Ex. If computer A runs a program in
10 seconds and computer B runs the
same program in 15 seconds, how
much faster is A than B?
Ex.
\[ \frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution Time}_B}{\text{Execution Time}_A} = n \]
Thus, the performance ratio is
\[ n = \frac{15}{10} = 1.5 \]
and A is therefore 1.5 times faster than B.
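The same ratio can be checked with a few lines of Python. This is a minimal sketch: the function name `times_faster` and its parameter names are illustrative assumptions, not something defined in these slides.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is the reciprocal of execution time, so
    Performance_X / Performance_Y = exec_time_y / exec_time_x.
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


# Computer A takes 10 s and computer B takes 15 s (values from the example).
n = times_faster(exec_time_x=10, exec_time_y=15)
print(n)  # 1.5
```

Note that the slower machine's time goes in the numerator, because a longer execution time means lower performance.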
Measuring Performance
Time is the measure of computer
performance: the computer that
performs the same amount of work in
the least time is the fastest.
Performance Metrics
 Cycles per Instruction (CPI)
 Number of clock cycles required to execute each
instruction
 \[ \text{CPI} = \frac{\text{number of clock cycles required to execute the program}}{\text{number of instructions executed in running the program}} \]
 Instructions executed Per Cycle (IPC)
 For systems that can execute more than one instruction
per cycle, the IPC is used instead of CPI
 \[ \text{IPC} = \frac{\text{number of instructions executed in running the program}}{\text{number of clock cycles required to execute the program}} \]
Ex.
 A given program consists of a 100-instruction loop that is executed 42 times.
If it takes 16,000 cycles to execute the
program on a given system, what are the
system's CPI and IPC values for the
program?
Soln:
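One way to work this example, sketched in Python directly from the CPI and IPC definitions. The helper names `cpi` and `ipc` are illustrative assumptions. The instruction count is taken as 100 × 42 = 4,200, which assumes no instructions execute outside the loop.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction: total cycles / instructions executed."""
    return cycles / instructions


def ipc(cycles: int, instructions: int) -> float:
    """Instructions per cycle: the reciprocal of CPI."""
    return instructions / cycles


instructions = 100 * 42   # 100-instruction loop executed 42 times
cycles = 16_000           # given in the example

print(f"instructions = {instructions}")           # instructions = 4200
print(f"CPI = {cpi(cycles, instructions):.2f}")   # CPI = 3.81
print(f"IPC = {ipc(cycles, instructions):.4f}")   # IPC = 0.2625
```

Because IPC is below 1 here, this system averages less than one instruction per cycle on this program.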
Benchmark Suites
 Consists of a set of programs that are
believed to be typical of the programs that
will be run on the system
 They generate estimates of a system's
performance on different types of
applications.
 Ex. SPEC (Standard Performance Evaluation
Corporation)
 A non-profit corporation formed to establish,
maintain and endorse a standardized set of relevant
benchmarks that can be applied to the newest
generation of high-performance computers.
 SPEC CPU2006, SPEC CPUv6
Speedup
 Used to describe how the performance of
an architecture changes as different
improvements are made to the
architecture
 It is the ratio of the execution times before
and after a change is made
\[ \text{Speedup} = \frac{\text{Execution Time}_{\text{before}}}{\text{Execution Time}_{\text{after}}} \]
Ex
 If a program takes 25 seconds to run on
one version of an architecture and 15
seconds to run on a new version, the
overall speedup is
\[ \frac{25\ \text{s}}{15\ \text{s}} \approx 1.67 \]
Amdahl's Law
 The most important rule for designing
high-performance computer systems is
"make the common case fast."
 Qualitatively, this means that the impact
of a given performance improvement on overall
performance is dependent on both how
much the improvement improves
performance when it is in use and how
often the improvement is in use
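Amdahl's Law
\[ \text{Execution Time}_{\text{new}} = \text{Execution Time}_{\text{old}} \times \left[ \text{Frac}_{\text{unused}} + \frac{\text{Frac}_{\text{used}}}{\text{Speedup}_{\text{used}}} \right] \]
where:
 \( \text{Frac}_{\text{unused}} \) = fraction of time that the
improvement is not in use
 \( \text{Frac}_{\text{used}} \) = fraction of time that the
improvement is in use
 \( \text{Speedup}_{\text{used}} \) = speedup that occurs when
the improvement is used
Note that \( \text{Frac}_{\text{used}} \) and \( \text{Frac}_{\text{unused}} \) are
computed using the execution time
before the modification is applied.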
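The execution-time form of Amdahl's Law can be sketched as a small Python function. The name `new_execution_time` is an illustrative assumption, and the sketch assumes the two fractions sum to 1, so that the unused fraction is 1 minus the used fraction. The numbers in the usage lines are hypothetical and do not come from the slides.

```python
def new_execution_time(old_time: float, frac_used: float, speedup_used: float) -> float:
    """Amdahl's Law, execution-time form.

    frac_used is measured on the ORIGINAL run, before the improvement.
    Assumes frac_unused = 1 - frac_used.
    """
    if not 0.0 <= frac_used <= 1.0:
        raise ValueError("frac_used must be between 0 and 1")
    if speedup_used <= 0:
        raise ValueError("speedup_used must be positive")
    frac_unused = 1.0 - frac_used
    return old_time * (frac_unused + frac_used / speedup_used)


# Hypothetical case: an 80 s program, half of which is sped up by 2x.
print(new_execution_time(old_time=80.0, frac_used=0.5, speedup_used=2.0))  # 60.0
```

Only the 40 s that uses the improvement shrinks (to 20 s). The other 40 s is unchanged, so the total is 60 s rather than 40 s.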
Amdahl's Law can be rewritten
using the definition of speedup:
\[ \text{Speedup} = \frac{\text{Execution Time}_{\text{old}}}{\text{Execution Time}_{\text{new}}} = \frac{1}{\text{Frac}_{\text{unused}} + \dfrac{\text{Frac}_{\text{used}}}{\text{Speedup}_{\text{used}}}} \]
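The intermediate step is a direct substitution: replace the new execution time with the Amdahl's Law expression for it, and the old execution time cancels. Written out, using only the symbols already defined on these slides:

```latex
\begin{align*}
\text{Speedup}
  &= \frac{\text{Execution Time}_{\text{old}}}{\text{Execution Time}_{\text{new}}} \\
  &= \frac{\text{Execution Time}_{\text{old}}}
          {\text{Execution Time}_{\text{old}} \times
           \left[ \text{Frac}_{\text{unused}}
                  + \frac{\text{Frac}_{\text{used}}}{\text{Speedup}_{\text{used}}} \right]} \\
  &= \frac{1}{\text{Frac}_{\text{unused}}
              + \frac{\text{Frac}_{\text{used}}}{\text{Speedup}_{\text{used}}}}
\end{align*}
```

The overall speedup therefore does not depend on how long the program originally took, only on the two fractions and the speedup of the improved part.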
Ex.
Suppose that a given architecture does not
have hardware support for multiplication, so
multiplications have to be done through
repeated addition (this was the case on
some early microprocessors). If it takes 200
cycles to perform a multiplication in software,
and 4 cycles to perform a multiplication in
hardware, what is the overall speedup from
hardware support for multiplication if a
program spends 10% of its time doing
multiplications? What about a program that
spends 40% of its time doing
multiplications?
Soln:
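One way to work this example, sketched in Python using the speedup form of Amdahl's Law. The function name `amdahl_speedup` is an illustrative assumption. The sketch takes the speedup of the improved part to be 200 / 4 = 50. It also assumes the 10% and 40% figures are fractions of the original execution time, measured before hardware multiplication is added.

```python
def amdahl_speedup(frac_used: float, speedup_used: float) -> float:
    """Overall speedup when frac_used of the original time is sped up
    by a factor of speedup_used. Assumes frac_unused = 1 - frac_used."""
    frac_unused = 1.0 - frac_used
    return 1.0 / (frac_unused + frac_used / speedup_used)


# Software multiply: 200 cycles; hardware multiply: 4 cycles.
speedup_used = 200 / 4  # 50.0

for frac_used in (0.10, 0.40):
    overall = amdahl_speedup(frac_used, speedup_used)
    print(f"{frac_used:.0%} of time multiplying: overall speedup = {overall:.2f}")
# 10% of time multiplying: overall speedup = 1.11
# 40% of time multiplying: overall speedup = 1.64
```

Even with a 50x faster multiply, the overall gain stays small when multiplication is a small share of the run time, which is the point of "make the common case fast."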
Seatwork:
1. If the 2011 version of a computer
executes a program in 200 ns and the
version of the computer made in the year
2013 executes the same program in 150 ns,
what is the speedup that the manufacturer
achieved over the two-year period?
2. To achieve a speedup of 3 on a program
that originally took 78 s to execute, what
must the execution time of the program
be reduced to?
3. When run on a given system, a program
takes 1,000,000 cycles. If the system
achieves a CPI of 40, how many instructions
were executed in running the program?
4. What is the IPC of a program that
executes 35,000 instructions and requires
17,000 cycles to execute?
5. Suppose a computer spends 90% of its time
handling a particular type of computation when
running a given program, and its manufacturers
make a change that improves its performance on
that type of computation by a factor of 10.
a. If the program originally took 100s to execute,
what will its execution time be after the
change?
b. What is the speedup from the old system to the
new system?