Exam 2
109暑 計算機組織 蔡⽂錦
                       密碼: COExam2
                       (https://docs.google.com/forms/d/e/1FAIpQLSeKhBjNur0Ipc6g2uUE5DoQwXhCJjqhRzAgkeTsN4fzLxqzvg/viewform)
                       Q1. Given the single cycle CPU and register information below. Which of following
                       statements are correct when the instruction "beq" below is executed? Assume the
                       "beq" is located at address 0x60000008, and the "L" is at 0x60000000. "beq $t1,
                       $t2, L"
                       a. The value on the path with label (a) is 0×09.
                       b. The value on the path with label (e) is 0xFFFFFFF4.
                       c. The value on the path with label (f) is 0×00200020.
                       d. The value on the path with label (i) is 0×60000000.
                       e. The value on the path with label (j) is 0×60000000.
                       Ans: a. c. d.
                       b. (e) should be 0xFFFFFFFD. Since 0×60000008(PC) + 4 - (x              × 4 (since shift 2 bit)) =
                       0×60000000.
                       e. (j) should be 0×6000000C (PC + 4). Since $t1 != $t2.
                       Q2. Given the control value definitions for the forwarding MUXs as below, where the
                       ForwardA controls the MUX of first source operand of ALU and ForwardB controls
                       the second one. For the code sequence below, which of following statements are
                       correct?
https://hackmd.io/@pinchen/COExam2                                                                                             2025/4/28, 10:45
                                                                                                                                    Page 1 of 9
                       a. For the 2nd instruction lw, forwardA = 10 and forwardB = 00
                       b. For the 3rd instruction, forwardA = 00 and forwardB = 01
                       c. For the 4th instruction, forwardA = 10 and forwardB = 00
                       d. For the 5th instruction, forwardA = 01 and forwardB = 10
                       e. For the 6th instruction, forwardA = 00 and forward = 01
                       Ans: d. e.
                          clock      forwarding & hardware     forwardA    forwardB
                             1          add $t1, $t2, $t3         00            00
                             2           lw $t4, 4($t1)           01            00
                             3                stall               00            00
                             4          add $t2, $t1, $t4         00            10
                             5          add $t1, $t4, $t3         00            00
                             6          add $t1, $t1, $t2         01            10
                             7          add $t2, $t1, $t1         00            01
                       Q3. Consider the five-stage piplelined CPU introduced in the textbook. Please select
                       the signals that are required to insert a stall cycle when data hazard is detected at
                       ID stage.
                       a. Set PCWrite = 0 (disable the write to Program Counter (PC) )
                       b. Set IF/Dwrite = 0 (disable the write to IF/ID)
                       c. IF.Flush (clear IF/ID)
                       e. ID.Flush (set 9 control signals (used in EX, MEM, WB stages) to 0
                       d. EX.Flush (set 5 control signals (used in MEM, WB stages) to 0
                       Ans: a. b. e.
                       Flush ID (this instruction) and keep PC to next clock.
                       Q4. Consider the five-stage piplelined CPU introduced in the textbook. Please select
                       the signals that are required when exception is detected in EX stage and need to
                       flush the instructions currently in IF, ID and EX stages.
                       a. Set PCWrite = 0 (disable the write to Program Counter (PC) )
                       b. Set IF/Dwrite = 0 (disable the write to IF/ID)
                       c. IF.Flush (clear IF/ID)
                       e. ID.Flush (set 9 control signals (used in EX, MEM, WB stages) to 0
                       d. EX.Flush (set 5 control signals (used in MEM, WB stages) to 0
                       Ans: c. d. e.
                       Need to jump to handler and flush next all instructions.
                       Q5. Consider the following sequence of actual outcomes for a branch. T means the
                       branch is taken. N means not taken. Assume both predictors are initialized to predict
                       "taken". Branch: T-N-T-N-N-T-N.
                       (Please write in capital letters T and N, and there is no space and symbols in between. For
                       example, TNTTTTT.)
https://hackmd.io/@pinchen/COExam2                                                                                   2025/4/28, 10:45
                                                                                                                          Page 2 of 9
                       Ans:
                       List the preditions for this branch using 1-bit predictor: TTNTNNT.
                       List the preditions for this branch using 2-bit predictor: TTTTTNT.
                       a. If the same patterns are repeated thousands of times, the accuracy rate is about 1/7
                       when 1-bit predictor is used.
                       b. If the same patterns are repeated thousands of times, the accuracy rate is about 2/7
                       when 2-bit predictor is used.
                       Ans: a.
                          Branch         TNTNNTN       TNTNNTN      TNTNNTN    repeated thousands
                          1-bit          TTNTNNT       NTNTNNT      NTNTNNT             1/7
                          2-bit          TTTTTNT      NTNTNNN       NNNNNNN             4/7
                       Q6. Assume the latencies for logic blocks are listed as table below. Please ignore the
                       latency for control decoder, control signal delay, or other unspecified blocks. Which
                       of following statements are correct?
                       a. The speedup of the pipelined processor versus the single-cycle processor for executing
                       100 lw instructions is about 3.2 (Assume there is no any data dependency in the code).
                       b. For a classic MIPS CPU with a 5-stage pipeline, the minimum clock cycle time it can use
                       is 120ps.
                       c. The latency of instruction lw in the pipelined processor is 400.
                       d. For a single-cycle MIPS CPU, the minimum clock cycle time it can use is 380ps.
                       Ans: a. b.
                       a. (400       × 100) / (120 × 100 + 120 × 4) = ~3.2.
                       b. max(IF, ID, EX, MEM, WB) = 120ps (D-Mem).
                       c. 120     × 5 = 600.
                       d. 100(I-Mem) + 50(Regs) + 80(ALU) + 50(Regs) + 120(D-Mem) = 400.
                       Q7. Consider the optimized 4-bit multiplication block diagram below. Which of
                       following statements are correct?
https://hackmd.io/@pinchen/COExam2                                                                                  2025/4/28, 10:45
                                                                                                                         Page 3 of 9
                       a. Consider a 16-bit multiplication with multiplier 0×64A5. It requires 7 additions of
                       multiplicand if traditional multiplication algorithm is used.
                       b. When performing 3×(-5) using Booth’s algorithm, the Product register contains 1110
                       0101(binary) (not including previous bit) after step1.
                       c. Consider a 16-bit multiplication with multiplier 0×64A5. It requires 6 additions and 6
                       subtractions of multiplicand if Booth’s algorithm is used.
                       d. When performing 3×(-5) using traditional multiplication algorithm, the Product register
                       contains 0001 0101(binary) after step1.
                       e. Consider a 16-bit multiplication with multiplier 0xE4A5. It requires 5 additions and 6
                       subtractions of multiplicand if Booth’s algorithm is used
                       Ans: a. c. e.
                       a. 0×64A5 = 0110 0100 1010 0101 (binary) which with seven 1's
                       b. the Product register should contain 1110 1101(binary).
                       c. 0×64A5 = 0110 0100 1010 0101 (binary) which with six 1's list.
                       d. the Product register should contain 0001 1101(binary).
                       e. 0xE4A5 = 1110 0100 1010 0101 (binary) which with siz 1's list and end with 1's.
                       Q8. For the static 2-issue pipelined processor given in the textbook and the code
                       sequence below.
                       a. If we unroll 2 more copies of the code (i.e., 3 copies in total), the best IPC for the
                       scheduled code is 11/6 (if we ignore the remaining 4 stages for the last instruction.)
                       b. If we unroll 2 more copies of the code (i.e., 3 copies in total), it will take 6 cycles to
                       schedule the code on the 2-issue CPU (using minimum    number cycles).
                                                                    (https://hackmd.io?utm_source=view-page&utm_medium=logo-nav)
         (https://hackmd.io/@pinchen) Exam 2      Try        HackMD
                       c. Assume the loop sequence will be executed for 9 iterations. Compared to the scheduled
                       code without unrolling, the speedup is 2 by running the scheduled code of unrolling 3
                       copies.
                       d. Without code unrolling, it will take 4 cycles to schedule the code sequence on the 2-
                       issue CPU (using minimum number cycles).
                       Ans: a. b. c. d.
                       c. (4   × 9) / (6 × 3) = 2
https://hackmd.io/@pinchen/COExam2                                                                                     2025/4/28, 10:45
                                                                                                                            Page 4 of 9
                          clock         ALU or branch          lw & sw
                             1         addi $s1, $s1, -4    lw $t0, 0($s1)
                             2               nop                 nop
                             3        addu $t0, $t0, $s2         nop
                             4       bne $s1, $zero, Loop   sw $t0, 4($s1)
                          clock        ALU or branch           lw & sw
                             1        addi $s1, $s1, -8     lw $t0, 0($s1)
                             2               nop            lw $t1, 4($s1)
                             3       addu $t0, $t0, $s2     lw $t2, 8($s1)
                             4        addu $t1, $t1, $s2    sw $t0, 12($s1)
                             5       addu $t2, $t2, $s2     sw $t1, 8($s1)
                             6       bne $s1, $zero, loop   sw $t2, 4($s1)
                       Q9. Given XXX CPU below (assume that stall and forwarding mechanism have been
                       implemented) and consider the following code sequence. When the code runs to the
                       Yth cycle, what are the values of control signals RegWrite, ALUsrc, RegDst and
                       MemWrite, respectively. (Please write down your answer in order. For example, if
                       your answer is “RegWrite=0, ALUsrc=1, RegDst=1 and MemWrite=0”, then write your
                       answer as 0110).
                       If given the single-cycle CPU and the code runs to the 5th cycle.
                       Ans: 1100
                       If given the pipelined CPU and the code runs to the 5th cycle.
                       Ans: 1011
                       If given the pipelined CPU and the code runs to the 9th cycle.
                       Ans: 1010
https://hackmd.io/@pinchen/COExam2                                                                        2025/4/28, 10:45
                                                                                                               Page 5 of 9
                                                                      pipelined
                          clock      nonpipelined      5th cycle          &            5th cycle    9th cycle
                                                                     forwarding
                                                                       lw $t1,           WB:
                             1       lw $t1, 0($s1)
                                                                       0($s1)         RegWrite=1
                                        sw $s1,                        sw $s1,          MEM:
                             2
                                        0($s2)                         0($s2)         MemWrite=1
                                                                                          EX:
                                     add $t2, $s2,                    add $t2,
                             3                                                        ALUsrc=0,
                                          $s3                         $s2, $s3
                                                                                       RegDst=1
                                     add $t3, $t2,                    add $t3,
                             4                                                            ID
                                          $s2                         $t2, $s2
                                                      RegWrite=1,
                                                       ALUsrc=1,       lw $t1,                        WB:
                             5       lw $t1, 0($t2)                                       IF
                                                      RegDst=0,        0($t2)                      RegWrite=1
                                                      MemWrite=0
                                     add $t1, $t1,                                                   MEM:
                             6                                          stall
                                          $s1                                                      MemWrite=0
                                                                                                       EX:
                                     add $t1, $t1,                    add $t1,
                             7                                                                     ALUsrc=0,
                                          $s2                          $t1, $s1
                                                                                                    RegDst=1
                                     add $t2, $s2,                    add $t1,
                             8                                                                         ID
                                          $t1                         $t1, $s2
                                                                      add $t2,
                             9                                                                         IF
                                                                      $s2, $t1
                       Q10. Consider the code sequence on the right side. Assume the two branch
                       instructions are predicted taken, but actually both of them are NOT taken. (Note:
                       register reads/writes can happen in the same cycle). Which of following statements
                       are correct?
                       a. Assume the CPU support data forwarding and the branch outcome is determined in
                       MEM stage. At the 10th cycle, no instruction is at WB stage.
                       b. Assume the CPU support data forwarding and the branch outcome is determined in
https://hackmd.io/@pinchen/COExam2                                                                              2025/4/28, 10:45
                                                                                                                     Page 6 of 9
                       MEM stage. It takes 19 cycles to complete the execution of the code.
                       c. Assume the CPU support data forwarding and the branch outcome is determined in ID
                       stage. It takes 15 cycles to complete the execution of the code.
                       d. Assume the CPU does "not" support data forwarding and the branch outcome is
                       determined in MEM stage. It takes 22 cycles to complete the execution of the code.
                       e. Assume the CPU does "not" support data forwarding and the branch outcome is
                       determined in MEM stage. At the 10th cycle, the instruction at WB stage is bne.
                       f. Assume the CPU support data forwarding and the branch outcome is determined in ID
                       stage. At the 10th cycle, no instruction is at WB stage.
                       Ans: a. b. d.
                       c. 17 cycles.
                       e. stall cycle.
                       f. lw instruction.
https://hackmd.io/@pinchen/COExam2                                                                            2025/4/28, 10:45
                                                                                                                   Page 7 of 9
                                                       without        forwarding &      forwarding &     10th
                          clock      nonpipelined
                                                     forwarding        MEM stage          ID stage       cycle
                                       add $s2,        add $s2,       add $s2, $s0,     add $s2, $s0,
                             1
                                       $s0, $s1        $s0, $s1            $s1               $s1
                                     add $s3, $s1,     add $s3,       add $s3, $s1,     add $s3, $s1,
                             2
                                         $s0           $s1, $s0            $s0               $s0
                                       bne $s2,
                             3                           stall       bne $s2, $s3, 10        stall
                                       $s3, 10
                                        lw $s0,                                         bne $s2, $s3,
                             4                           stall             stall
                                        0($s2)                                                10
                                        lw $s1,        bne $s2,
                             5                                             stall             stall
                                        0($s2)         $s3, 10
                                     bne $s0, $s1,
                             6                           stall             stall        lw $s0, 0($s2)   WB
                                          10
                                       add $s2,
                             7                           stall        lw $s0, 0($s2)    lw $s1, 0($s2)   MEM
                                       $s0, $t2
                                     add $s1, $s1,
                             8                           stall        lw $s1, 0($s2)         stall        EX
                                         $t4
                                                       lw $s0,
                             9                                             stall             stall        ID
                                                        0($s2)
                                                        lw $s1,                         bne $s0, $s1,
                            10                                       bne $s0, $s1, 10                     IF
                                                        0($s2)                                10
                             11                          stall             stall             stall
                                                                                        add $s2, $s0,
                            12                           stall             stall
                                                                                             $t2
                                                       bne $s0,                         add $s1, $s1,
                            13                                             stall
                                                        $s1, 10                              $t4
                                                                      add $s2, $s0,
                            14                           stall
                                                                           $t2
                                                                      add $s1, $s1,
                            15                           stall
                                                                           $t4
                            16                           stall
                                                       add $s2,
                            17
                                                       $s0, $t2
                                                     add $s1, $s1,
                            18
                                                         $t4
https://hackmd.io/@pinchen/COExam2                                                                               2025/4/28, 10:45
                                                                                                                      Page 8 of 9