Topics Discussed
Book: Computer Architecture – A Quantitative Approach (4th Ed)
Sections: 1.1, 1.2, 1.3, 1.4 (Performance trends), 1.9
1.11 – Reading Assignment
History and Evolution of Computers (not in this book) -- Chapter 2 of Computer
Architecture and Organization (6th Edition)
EXERCISES:
1. Suppose that we want to enhance the processor used for Web serving. The new
   processor is 10 times faster on computation in the Web serving application than the
   original processor. Assuming that the original processor is busy with computation
   40% of the time and is waiting for I/O 60% of the time, what is the overall speedup
   gained by incorporating the enhancement?
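As a rough check on this exercise, Amdahl's Law can be evaluated directly. The sketch below assumes the 40% computation share is the enhanced fraction and that I/O time is unchanged; the function name amdahl_speedup is illustrative, not from the textbook.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Exercise 1: computation is 40% of the time and becomes 10x faster;
# the 60% spent waiting for I/O is assumed to be unaffected.
web_speedup = amdahl_speedup(0.40, 10.0)
print(f"Overall speedup: {web_speedup:.2f}")  # about 1.56
```

Even with a 10x faster processor, the overall gain stays well under 2x because the I/O share dominates.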
2. A common transformation required in graphics processors is square root.
   Implementations of floating-point (FP) square root vary significantly in performance,
   especially among processors designed for graphics. Suppose FP square root (FPSQR)
   is responsible for 20% of the execution time of a critical graphics benchmark. One
   proposal is to enhance the FPSQR hardware and speed up this operation by a factor of
   10. The other alternative is just to try to make all FP instructions in the graphics
   processor run faster by a factor of 1.6; FP instructions are responsible for half of the
   execution time for the application. The design team believes that they can make all
   FP instructions run 1.6 times faster with the same effort as required for the fast square
   root. Compare these two design alternatives.
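One way to compare the two alternatives is to apply Amdahl's Law to each. This is a minimal sketch, assuming the 20% and 50% figures are fractions of the original execution time; the variable names are illustrative.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# Alternative 1: FPSQR is 20% of the time and becomes 10x faster.
fpsqr_speedup = amdahl_speedup(0.20, 10.0)    # 1 / 0.82
# Alternative 2: all FP instructions are 50% of the time and become 1.6x faster.
all_fp_speedup = amdahl_speedup(0.50, 1.6)    # 1 / 0.8125

better = "all FP instructions" if all_fp_speedup > fpsqr_speedup else "FPSQR only"
print(f"FPSQR only: {fpsqr_speedup:.3f}")
print(f"All FP:     {all_fp_speedup:.3f}")
print(f"Better alternative: {better}")
```

Under these assumptions, improving all FP instructions comes out slightly ahead (about 1.23 versus 1.22) because it affects a larger share of the execution time.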
3. Calculate the reliability improvement.
   The improvement/speedup of fraction is 4150.
4. Your company has just bought a new dual Pentium processor, and you have been
   tasked with optimizing your software for this processor. You will run two
   applications on this dual Pentium, but the resource requirements are not equal. The
   first application needs 80% of the resources, and the other only 20% of the resources.
   a. Given that 40% of the first application is parallelizable, how much speedup would
   you achieve with that application if run in isolation?
   b. Given that 99% of the second application is parallelizable, how much speedup
   would this application observe if run in isolation?
   c. Given that 40% of the first application is parallelizable, how much overall system
   speedup would you observe if you parallelized it?
   d. Given that 99% of the second application is parallelizable, how much overall
   system speedup would you get?
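Parts a to d can be sketched with two applications of Amdahl's Law. The code assumes two processors (so the parallelizable part runs at most 2x faster) and reads "80% / 20% of the resources" as each application's share of total system time; both readings are assumptions, not stated in the exercise.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

PROCESSORS = 2  # assumption: "dual" means the parallel part speeds up by 2x

# (a), (b): each application measured on its own.
app1_alone = amdahl_speedup(0.40, PROCESSORS)
app2_alone = amdahl_speedup(0.99, PROCESSORS)

# (c), (d): system-wide effect, weighting each application by its resource share.
system_with_app1 = amdahl_speedup(0.80, app1_alone)
system_with_app2 = amdahl_speedup(0.20, app2_alone)

for label, value in [("a", app1_alone), ("b", app2_alone),
                     ("c", system_with_app1), ("d", system_with_app2)]:
    print(f"({label}) speedup = {value:.2f}")
```

With these assumptions the results are roughly 1.25, 1.98, 1.19, and 1.11: the second application parallelizes far better, but the first matters more to the system because it takes the larger share.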
5. Program-I runs in 10 sec on machine A, which has a 400 MHz clock rate. We are trying
   to design a machine B with a faster clock rate so as to reduce the total execution time
   to 6 seconds. The increase in clock rate will affect the rest of the CPU design, causing
   B to require 1.2 times as many clock cycles as machine A. Determine the
   clock rate of machine B.
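The relation behind this exercise is execution time = clock cycles / clock rate. A minimal sketch of the arithmetic follows; the variable names are illustrative.

```python
# Given values from the exercise.
time_a_s = 10.0        # execution time on machine A, in seconds
clock_a_hz = 400e6     # clock rate of machine A, in Hz
time_b_s = 6.0         # target execution time on machine B, in seconds
cycle_ratio = 1.2      # machine B needs 1.2x the clock cycles of machine A

# cycles = time * clock rate
cycles_a = time_a_s * clock_a_hz
cycles_b = cycle_ratio * cycles_a

# clock rate = cycles / time
clock_b_hz = cycles_b / time_b_s
print(f"Machine B clock rate: {clock_b_hz / 1e6:.0f} MHz")
```

Machine B must run at about 800 MHz, twice the clock rate of A, to cut the time from 10 s to 6 s while executing 20% more cycles.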
6. Calculate the performance of the machine in terms of MIPS (millions of instructions per
   second), given the following statistics.
      Instruction Type              Instruction Count      CPI
      Integer Arithmetic                  45000              1
      Data Transfer                       32000              2
      Floating Point Instruction          15000              2
      Control Transfer                     8000              2
7. Compare the performance of the following three machines in terms of MIPS (millions of
   instructions per second), given the following specifications:
                                                    A          B          C          IC
      Clock rate                                 200 MHz    230 MHz    300 MHz
      Avg. CPI of CPU-dependent instructions        3          2          2         30000
      Avg. CPI of memory-dependent instructions     8          8          12        20000
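Both MIPS exercises reduce to MIPS = clock rate (MHz) / average CPI, where the average CPI is weighted by instruction count. The sketch below assumes the IC column of the three-machine table applies to all of A, B, and C. The instruction-mix exercise states no clock rate, so only its average CPI is computed here. The function names are illustrative.

```python
def average_cpi(mix):
    """Weighted CPI for a list of (instruction_count, cpi) pairs."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cpi for count, cpi in mix)
    return total_cycles / total_instructions

def mips(clock_mhz, cpi):
    """MIPS rating: clock rate in MHz divided by average CPI."""
    return clock_mhz / cpi

# Instruction-mix exercise: average CPI only (no clock rate is given).
mix_cpi = average_cpi([(45000, 1), (32000, 2), (15000, 2), (8000, 2)])
print(f"Average CPI of the instruction mix: {mix_cpi:.2f}")

# Three-machine exercise: (clock in MHz, CPU-dependent CPI, memory-dependent CPI).
# Assumption: 30000 CPU-dependent and 20000 memory-dependent instructions on every machine.
machines = {"A": (200, 3, 8), "B": (230, 2, 8), "C": (300, 2, 12)}
ratings = {
    name: mips(clock, average_cpi([(30000, cpu_cpi), (20000, mem_cpi)]))
    for name, (clock, cpu_cpi, mem_cpi) in machines.items()
}
for name, rating in ratings.items():
    print(f"Machine {name}: {rating:.2f} MIPS")
```

Under these assumptions the instruction mix averages 1.55 cycles per instruction, and machine B has the highest rating (about 52.3 MIPS, against 40 for A and 50 for C) even though C has the fastest clock, because C's memory-dependent instructions are slower.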