Mid 2

The document contains an exam with multiple questions related to computer organization and CPU architecture, focusing on instruction execution, data hazards, control signals, and branch prediction. Each question includes a set of statements with correct answers provided, as well as explanations for certain answers. The exam covers topics such as single-cycle and pipelined CPUs, forwarding mechanisms, and multiplication algorithms.

Uploaded by

j.terryyang
Copyright
© All Rights Reserved

Exam 2

109 Summer, Computer Organization, 蔡文錦

Password: COExam2
(https://docs.google.com/forms/d/e/1FAIpQLSeKhBjNur0Ipc6g2uUE5DoQwXhCJjqhRzAgkeTsN4fzLxqzvg/viewform)

Q1. Given the single-cycle CPU and register information below, which of the following
statements are correct when the "beq" instruction below is executed? Assume the
"beq" is located at address 0x60000008 and the label "L" is at 0x60000000.
"beq $t1, $t2, L"

a. The value on the path with label (a) is 0x09.
b. The value on the path with label (e) is 0xFFFFFFF4.
c. The value on the path with label (f) is 0x00200020.
d. The value on the path with label (i) is 0x60000000.
e. The value on the path with label (j) is 0x60000000.

Ans: a. c. d.

b. (e) should be 0xFFFFFFFD (that is, -3), since 0x60000008 (PC) + 4 + (offset × 4, because of the 2-bit left shift) = 0x60000000.
e. (j) should be 0x6000000C (PC + 4), since $t1 != $t2.
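
The arithmetic behind the explanations for b and e can be checked with a short Python sketch. The function names are illustrative and not part of the exam. The sketch assumes the standard MIPS rule that the branch target equals PC + 4 plus the sign-extended 16-bit offset shifted left by 2.

```python
MASK32 = 0xFFFFFFFF


def beq_offset_field(branch_pc, target):
    """16-bit offset field encoded in a beq located at branch_pc (assumed MIPS rule)."""
    byte_delta = target - (branch_pc + 4)
    assert byte_delta % 4 == 0, "branch targets are word aligned"
    return (byte_delta >> 2) & 0xFFFF


def sign_extend16(value):
    """Sign-extend a 16-bit field to 32 bits."""
    return (value | 0xFFFF0000) if (value & 0x8000) else value


def next_pc(branch_pc, target, taken):
    """PC chosen by the branch MUX: the target if taken, otherwise PC + 4."""
    extended = sign_extend16(beq_offset_field(branch_pc, target))
    branch_target = (branch_pc + 4 + ((extended << 2) & MASK32)) & MASK32
    return branch_target if taken else (branch_pc + 4) & MASK32


extended = sign_extend16(beq_offset_field(0x60000008, 0x60000000))
print(hex(extended))                                   # 0xfffffffd
print(hex((extended << 2) & MASK32))                   # 0xfffffff4, the value after the shift
print(hex(next_pc(0x60000008, 0x60000000, False)))     # 0x6000000c, since $t1 != $t2
```

Note that 0xFFFFFFF4, the value offered in option b, is what the sketch produces after the 2-bit shift, not the sign-extended value before it.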

Q2. Given the control value definitions for the forwarding MUXs below, where
ForwardA controls the MUX for the first ALU source operand and ForwardB controls
the MUX for the second one, which of the following statements are correct for the
code sequence below?

https://hackmd.io/@pinchen/COExam2 2025/4/28, 10:45


a. For the 2nd instruction lw, forwardA = 10 and forwardB = 00
b. For the 3rd instruction, forwardA = 00 and forwardB = 01
c. For the 4th instruction, forwardA = 10 and forwardB = 00
d. For the 5th instruction, forwardA = 01 and forwardB = 10
e. For the 6th instruction, forwardA = 00 and forwardB = 01

Ans: d. e.

| clock | forwarding & hardware | forwardA | forwardB |
|---|---|---|---|
| 1 | add $t1, $t2, $t3 | 00 | 00 |
| 2 | lw $t4, 4($t1) | 01 | 00 |
| 3 | stall | 00 | 00 |
| 4 | add $t2, $t1, $t4 | 00 | 10 |
| 5 | add $t1, $t4, $t3 | 00 | 00 |
| 6 | add $t1, $t1, $t2 | 01 | 10 |
| 7 | add $t2, $t1, $t1 | 00 | 01 |

Q3. Consider the five-stage pipelined CPU introduced in the textbook. Please select
the signals that are required to insert a stall cycle when a data hazard is detected at
the ID stage.

a. Set PCWrite = 0 (disable the write to the Program Counter (PC))
b. Set IF/IDWrite = 0 (disable the write to IF/ID)
c. IF.Flush (clear IF/ID)
e. ID.Flush (set the 9 control signals used in the EX, MEM, and WB stages to 0)
d. EX.Flush (set the 5 control signals used in the MEM and WB stages to 0)

Ans: a. b. e.
Flush the ID stage (turn this instruction into a bubble) and hold the PC and IF/ID register until the next clock cycle.
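
As a minimal sketch of how these three signals are produced, the Python function below models the textbook load-use check: stall when the instruction in EX is a load whose destination (rt) matches a source register of the instruction in ID. The function name and the dictionary layout are illustrative assumptions. The register numbers in the example ($t1 = 9, $t4 = 12) follow the MIPS convention, applied to the lw/add pair from Q2.

```python
def hazard_detection(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Return the stall-related signals for one cycle (hypothetical helper)."""
    stall = bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)
    return {
        "PCWrite": 0 if stall else 1,      # 0 holds the PC
        "IF/IDWrite": 0 if stall else 1,   # 0 holds the instruction in IF/ID
        "ID.Flush": 1 if stall else 0,     # 1 zeroes the 9 control signals (bubble)
    }


# lw $t4, 4($t1) is in EX (rt = 12); add $t2, $t1, $t4 is in ID (rs = 9, rt = 12).
print(hazard_detection(True, 12, 9, 12))
# An ALU instruction in EX never triggers this stall, because forwarding covers it.
print(hazard_detection(False, 12, 9, 12))
```

IF.Flush and EX.Flush are not involved here: the instruction in IF/ID must be kept and retried, and the load already in EX must continue.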

Q4. Consider the five-stage pipelined CPU introduced in the textbook. Please select
the signals that are required when an exception is detected in the EX stage and the
instructions currently in the IF, ID, and EX stages need to be flushed.

a. Set PCWrite = 0 (disable the write to the Program Counter (PC))
b. Set IF/IDWrite = 0 (disable the write to IF/ID)
c. IF.Flush (clear IF/ID)
e. ID.Flush (set the 9 control signals used in the EX, MEM, and WB stages to 0)
d. EX.Flush (set the 5 control signals used in the MEM and WB stages to 0)

Ans: c. d. e.
The CPU needs to jump to the exception handler and flush the instructions in IF, ID, and EX.

Q5. Consider the following sequence of actual outcomes for a branch. T means the
branch is taken. N means not taken. Assume both predictors are initialized to predict
"taken". Branch: T-N-T-N-N-T-N.

(Please write in capital letters T and N, with no spaces or symbols in between. For
example, TNTTTTT.)

Ans:
List the predictions for this branch using the 1-bit predictor: TTNTNNT.
List the predictions for this branch using the 2-bit predictor: TTTTTNT.

a. If the same pattern is repeated thousands of times, the accuracy is about 1/7
when the 1-bit predictor is used.
b. If the same pattern is repeated thousands of times, the accuracy is about 2/7
when the 2-bit predictor is used.
Ans: a.

| Branch | TNTNNTN | TNTNNTN | TNTNNTN | repeated thousands of times |
|---|---|---|---|---|
| 1-bit | TTNTNNT | NTNTNNT | NTNTNNT | 1/7 |
| 2-bit | TTTTTNT | NTNTNNN | NNNNNNN | 4/7 |
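
The prediction strings and steady-state accuracies can be reproduced with a small simulation. This is a sketch under one assumption: the 2-bit predictor is modeled as a saturating counter (0 to 3, predicting taken when the counter is 2 or 3) that starts at 3. That model matches the predictions listed in the answer, but the exam does not spell out the state machine. All function names are illustrative.

```python
def one_bit(outcomes, state="T"):
    """1-bit predictor: predict whatever the branch did last time."""
    predictions = []
    for actual in outcomes:
        predictions.append(state)
        state = actual
    return "".join(predictions), state


def two_bit(outcomes, counter=3):
    """2-bit saturating counter (assumed model): predict T when counter >= 2."""
    predictions = []
    for actual in outcomes:
        predictions.append("T" if counter >= 2 else "N")
        counter = min(counter + 1, 3) if actual == "T" else max(counter - 1, 0)
    return "".join(predictions), counter


def steady_state_hits(predictor, outcomes, initial_state, passes=50):
    """Correct predictions in the last pass after repeating the pattern."""
    state = initial_state
    predictions = ""
    for _ in range(passes):
        predictions, state = predictor(outcomes, state)
    return sum(p == a for p, a in zip(predictions, outcomes))


pattern = "TNTNNTN"
print(one_bit(pattern)[0])                          # TTNTNNT
print(two_bit(pattern)[0])                          # TTTTTNT
print(steady_state_hits(one_bit, pattern, "T"))     # 1 (out of 7)
print(steady_state_hits(two_bit, pattern, 3))       # 4 (out of 7)
```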

Q6. Assume the latencies for the logic blocks are as listed in the table below. Please
ignore the latency of the control decoder, control signal delay, and other unspecified
blocks. Which of the following statements are correct?

a. The speedup of the pipelined processor versus the single-cycle processor for executing
100 lw instructions is about 3.2 (assume there are no data dependencies in the code).
b. For a classic MIPS CPU with a 5-stage pipeline, the minimum clock cycle time it can use
is 120ps.
c. The latency of the lw instruction in the pipelined processor is 400ps.
d. For a single-cycle MIPS CPU, the minimum clock cycle time it can use is 380ps.

Ans: a. b.
a. (400 × 100) / (120 × 100 + 120 × 4) ≈ 3.2.
b. max(IF, ID, EX, MEM, WB) = 120ps (D-Mem).
c. 120 × 5 = 600ps.
d. 100 (I-Mem) + 50 (Regs) + 80 (ALU) + 50 (Regs) + 120 (D-Mem) = 400ps.
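
The four calculations can be written out as a short Python sketch. The latency values are taken from the explanations above, because the original latency table is not reproduced here. The mapping of each value to a pipeline stage, and the variable names, are assumptions made for illustration.

```python
# Assumed per-stage latencies in ps, read off the answer explanations.
latency_ps = {
    "IF (I-Mem)": 100,
    "ID (Regs read)": 50,
    "EX (ALU)": 80,
    "MEM (D-Mem)": 120,
    "WB (Regs write)": 50,
}

stages = len(latency_ps)
single_cycle_clock_ps = sum(latency_ps.values())   # lw uses every block in one cycle
pipelined_clock_ps = max(latency_ps.values())      # the slowest stage sets the clock

n = 100  # number of independent lw instructions
single_cycle_total_ps = n * single_cycle_clock_ps
pipelined_total_ps = (n + stages - 1) * pipelined_clock_ps  # fill time plus one per instruction

speedup = single_cycle_total_ps / pipelined_total_ps
lw_latency_pipelined_ps = stages * pipelined_clock_ps

print(single_cycle_clock_ps, pipelined_clock_ps, lw_latency_pipelined_ps)  # 400 120 600
print(round(speedup, 2))                                                   # 3.21
```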

Q7. Consider the optimized 4-bit multiplication block diagram below. Which of the
following statements are correct?

a. Consider a 16-bit multiplication with multiplier 0x64A5. It requires 7 additions of the
multiplicand if the traditional multiplication algorithm is used.
b. When performing 3×(-5) using Booth's algorithm, the Product register contains 1110
0101 (binary) (not including the previous bit) after step 1.
c. Consider a 16-bit multiplication with multiplier 0x64A5. It requires 6 additions and 6
subtractions of the multiplicand if Booth's algorithm is used.
d. When performing 3×(-5) using the traditional multiplication algorithm, the Product register
contains 0001 0101 (binary) after step 1.
e. Consider a 16-bit multiplication with multiplier 0xE4A5. It requires 5 additions and 6
subtractions of the multiplicand if Booth's algorithm is used.

Ans: a. c. e.
a. 0x64A5 = 0110 0100 1010 0101 (binary), which has seven 1s.
b. The Product register should contain 1110 1101 (binary).
c. 0x64A5 = 0110 0100 1010 0101 (binary), which has six runs of 1s; each run costs one subtraction and one addition.
d. The Product register should contain 0001 1101 (binary).
e. 0xE4A5 = 1110 0100 1010 0101 (binary), which has six runs of 1s, and the leftmost run extends to the most significant bit, so it needs no closing addition.
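
The operation counts and the step-1 register contents can be checked with the Python sketch below. It assumes the usual formulation: shift-and-add performs one addition per 1 bit, while Booth's algorithm subtracts at a 0-to-1 transition (scanning from the LSB, with an implicit 0 to the right) and adds at a 1-to-0 transition. The function names and the register model (a 2n-bit Product register plus one previous bit) are illustrative.

```python
def traditional_additions(multiplier, bits):
    """Shift-and-add: one addition of the multiplicand per 1 bit."""
    return bin(multiplier & ((1 << bits) - 1)).count("1")


def booth_operations(multiplier, bits):
    """Return (additions, subtractions) performed by Booth's algorithm."""
    additions = subtractions = 0
    previous = 0  # implicit bit to the right of the LSB
    for i in range(bits):
        current = (multiplier >> i) & 1
        if (current, previous) == (1, 0):
            subtractions += 1   # start of a run of 1s
        elif (current, previous) == (0, 1):
            additions += 1      # end of a run of 1s
        previous = current
    return additions, subtractions


def booth_step(product, previous, multiplicand, n):
    """One Booth step on a 2n-bit Product register; returns (product, previous)."""
    mask = (1 << (2 * n)) - 1
    low = product & 1
    if (low, previous) == (1, 0):
        product = (product - (multiplicand << n)) & mask
    elif (low, previous) == (0, 1):
        product = (product + (multiplicand << n)) & mask
    new_previous = product & 1
    sign = (product >> (2 * n - 1)) & 1
    product = (product >> 1) | (sign << (2 * n - 1))  # arithmetic shift right
    return product, new_previous


print(traditional_additions(0x64A5, 16))   # 7
print(booth_operations(0x64A5, 16))        # (6, 6)
print(booth_operations(0xE4A5, 16))        # (5, 6)

# 3 x (-5): multiplicand 0011, multiplier 1011 loaded into the low half.
product, prev = booth_step(0b00001011, 0, 0b0011, 4)
print(format(product, "08b"))              # 11101101
```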

Q8. For the static 2-issue pipelined processor given in the textbook and the code
sequence below, which of the following statements are correct?

a. If we unroll 2 more copies of the code (i.e., 3 copies in total), the best IPC for the
scheduled code is 11/6 (if we ignore the remaining 4 stages for the last instruction).
b. If we unroll 2 more copies of the code (i.e., 3 copies in total), it will take 6 cycles to
schedule the code on the 2-issue CPU (using the minimum number of cycles).
c. Assume the loop sequence will be executed for 9 iterations. Compared to the scheduled
code without unrolling, the speedup is 2 when running the scheduled code with 3 copies
unrolled.
d. Without code unrolling, it will take 4 cycles to schedule the code sequence on the
2-issue CPU (using the minimum number of cycles).

Ans: a. b. c. d.

c. (4 × 9) / (6 × 3) = 2

Scheduled code without unrolling:

| clock | ALU or branch | lw & sw |
|---|---|---|
| 1 | addi $s1, $s1, -4 | lw $t0, 0($s1) |
| 2 | nop | nop |
| 3 | addu $t0, $t0, $s2 | nop |
| 4 | bne $s1, $zero, Loop | sw $t0, 4($s1) |

Scheduled code with 3 copies unrolled:

| clock | ALU or branch | lw & sw |
|---|---|---|
| 1 | addi $s1, $s1, -12 | lw $t0, 0($s1) |
| 2 | nop | lw $t1, 8($s1) |
| 3 | addu $t0, $t0, $s2 | lw $t2, 4($s1) |
| 4 | addu $t1, $t1, $s2 | sw $t0, 12($s1) |
| 5 | addu $t2, $t2, $s2 | sw $t1, 8($s1) |
| 6 | bne $s1, $zero, Loop | sw $t2, 4($s1) |
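
The IPC in option a and the speedup in option c follow from simple counting, sketched here in Python with exact fractions. The variable names are illustrative. The instruction count assumes each copy contributes one lw, one addu, and one sw, with a single addi and a single bne shared by the whole unrolled body.

```python
from fractions import Fraction

COPIES = 3
instructions_unrolled = 3 * COPIES + 2   # 3 x (lw, addu, sw) + addi + bne = 11
cycles_unrolled = 6                      # issue cycles per unrolled loop body
cycles_rolled = 4                        # issue cycles per original loop body
iterations = 9                           # original loop iterations to execute

ipc_unrolled = Fraction(instructions_unrolled, cycles_unrolled)

time_rolled = cycles_rolled * iterations                  # 4 x 9 = 36 cycles
time_unrolled = cycles_unrolled * (iterations // COPIES)  # 6 x 3 = 18 cycles
speedup = Fraction(time_rolled, time_unrolled)

print(ipc_unrolled, speedup)   # 11/6 2
```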

Q9. Given the XXX CPU below (assume that stall and forwarding mechanisms have been
implemented), consider the following code sequence. When the code runs to the
Yth cycle, what are the values of the control signals RegWrite, ALUsrc, RegDst, and
MemWrite, respectively? (Please write down your answer in order. For example, if
your answer is "RegWrite=0, ALUsrc=1, RegDst=1 and MemWrite=0", then write your
answer as 0110.)

If given the single-cycle CPU and the code runs to the 5th cycle.
Ans: 1100
If given the pipelined CPU and the code runs to the 5th cycle.
Ans: 1011
If given the pipelined CPU and the code runs to the 9th cycle.
Ans: 1010

| clock | nonpipelined | 5th cycle (nonpipelined) | pipelined & forwarding | 5th cycle (pipelined) | 9th cycle (pipelined) |
|---|---|---|---|---|---|
| 1 | lw $t1, 0($s1) | | lw $t1, 0($s1) | WB: RegWrite=1 | |
| 2 | sw $s1, 0($s2) | | sw $s1, 0($s2) | MEM: MemWrite=1 | |
| 3 | add $t2, $s2, $s3 | | add $t2, $s2, $s3 | EX: ALUsrc=0, RegDst=1 | |
| 4 | add $t3, $t2, $s2 | | add $t3, $t2, $s2 | ID | |
| 5 | lw $t1, 0($t2) | RegWrite=1, ALUsrc=1, RegDst=0, MemWrite=0 | lw $t1, 0($t2) | IF | WB: RegWrite=1 |
| 6 | add $t1, $t1, $s1 | | stall | | MEM: MemWrite=0 |
| 7 | add $t1, $t1, $s2 | | add $t1, $t1, $s1 | | EX: ALUsrc=0, RegDst=1 |
| 8 | add $t2, $s2, $t1 | | add $t1, $t1, $s2 | | ID |
| 9 | | | add $t2, $s2, $t1 | | IF |
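
The three answers can be reproduced with a small lookup, sketched in Python. The key idea is that in the single-cycle CPU all four signals belong to the one instruction being executed, whereas in the pipelined CPU each signal is consumed in a different stage: RegWrite in WB, MemWrite in MEM, and ALUsrc and RegDst in EX. The control values are the standard main-control settings for R-type, lw, and sw. RegDst is a don't-care for sw and is written as 0 here. The function and table names are illustrative.

```python
# Control values per instruction type; "stall" is a bubble with all signals 0.
CONTROL = {
    "add":   {"RegWrite": 1, "ALUsrc": 0, "RegDst": 1, "MemWrite": 0},
    "lw":    {"RegWrite": 1, "ALUsrc": 1, "RegDst": 0, "MemWrite": 0},
    "sw":    {"RegWrite": 0, "ALUsrc": 1, "RegDst": 0, "MemWrite": 1},  # RegDst: don't-care
    "stall": {"RegWrite": 0, "ALUsrc": 0, "RegDst": 0, "MemWrite": 0},
}


def single_cycle_answer(op):
    """All four signals come from the instruction executing in this cycle."""
    c = CONTROL[op]
    return f"{c['RegWrite']}{c['ALUsrc']}{c['RegDst']}{c['MemWrite']}"


def pipelined_answer(wb_op, mem_op, ex_op):
    """Each signal is taken from the instruction in the stage that uses it."""
    return (f"{CONTROL[wb_op]['RegWrite']}{CONTROL[ex_op]['ALUsrc']}"
            f"{CONTROL[ex_op]['RegDst']}{CONTROL[mem_op]['MemWrite']}")


print(single_cycle_answer("lw"))               # 1100 (5th instruction: lw $t1, 0($t2))
print(pipelined_answer("lw", "sw", "add"))     # 1011 (5th cycle)
print(pipelined_answer("lw", "stall", "add"))  # 1010 (9th cycle)
```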

Q10. Consider the code sequence on the right side. Assume the two branch
instructions are predicted taken, but actually both of them are NOT taken. (Note:
register reads/writes can happen in the same cycle.) Which of the following statements
are correct?

a. Assume the CPU supports data forwarding and the branch outcome is determined in the
MEM stage. At the 10th cycle, no instruction is at the WB stage.
b. Assume the CPU supports data forwarding and the branch outcome is determined in the
MEM stage. It takes 19 cycles to complete the execution of the code.
c. Assume the CPU supports data forwarding and the branch outcome is determined in the ID
stage. It takes 15 cycles to complete the execution of the code.
d. Assume the CPU does "not" support data forwarding and the branch outcome is
determined in the MEM stage. It takes 22 cycles to complete the execution of the code.
e. Assume the CPU does "not" support data forwarding and the branch outcome is
determined in the MEM stage. At the 10th cycle, the instruction at the WB stage is bne.
f. Assume the CPU supports data forwarding and the branch outcome is determined in the ID
stage. At the 10th cycle, no instruction is at the WB stage.

Ans: a. b. d.

c. It takes 17 cycles.
e. A stall (bubble) is at the WB stage.
f. The lw $s0, 0($s2) instruction is at the WB stage.

| clock | nonpipelined | without forwarding | forwarding & MEM stage | forwarding & ID stage | 10th cycle |
|---|---|---|---|---|---|
| 1 | add $s2, $s0, $s1 | add $s2, $s0, $s1 | add $s2, $s0, $s1 | add $s2, $s0, $s1 | |
| 2 | add $s3, $s1, $s0 | add $s3, $s1, $s0 | add $s3, $s1, $s0 | add $s3, $s1, $s0 | |
| 3 | bne $s2, $s3, 10 | stall | bne $s2, $s3, 10 | stall | |
| 4 | lw $s0, 0($s2) | stall | stall | bne $s2, $s3, 10 | |
| 5 | lw $s1, 0($s2) | bne $s2, $s3, 10 | stall | stall | |
| 6 | bne $s0, $s1, 10 | stall | stall | lw $s0, 0($s2) | WB |
| 7 | add $s2, $s0, $t2 | stall | lw $s0, 0($s2) | lw $s1, 0($s2) | MEM |
| 8 | add $s1, $s1, $t4 | stall | lw $s1, 0($s2) | stall | EX |
| 9 | | lw $s0, 0($s2) | stall | stall | ID |
| 10 | | lw $s1, 0($s2) | bne $s0, $s1, 10 | bne $s0, $s1, 10 | IF |
| 11 | | stall | stall | stall | |
| 12 | | stall | stall | add $s2, $s0, $t2 | |
| 13 | | bne $s0, $s1, 10 | stall | add $s1, $s1, $t4 | |
| 14 | | stall | add $s2, $s0, $t2 | | |
| 15 | | stall | add $s1, $s1, $t4 | | |
| 16 | | stall | | | |
| 17 | | add $s2, $s0, $t2 | | | |
| 18 | | add $s1, $s1, $t4 | | | |
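
The cycle counts and the "what is in WB at cycle 10" answers follow directly from the issue order in each column: the entry issued in slot k enters IF at cycle k and reaches WB at cycle k + 4, so a column with N slots finishes at cycle N + 4. The Python sketch below encodes the three pipelined columns as lists of opcodes (instructions abbreviated, stalls included). The list and function names are illustrative.

```python
PIPELINE_DEPTH = 5

# Issue slots per configuration, transcribed from the table columns.
NO_FORWARDING = ["add", "add", "stall", "stall", "bne", "stall", "stall", "stall",
                 "lw", "lw", "stall", "stall", "bne", "stall", "stall", "stall",
                 "add", "add"]
FORWARDING_MEM = ["add", "add", "bne", "stall", "stall", "stall", "lw", "lw",
                  "stall", "bne", "stall", "stall", "stall", "add", "add"]
FORWARDING_ID = ["add", "add", "stall", "bne", "stall", "lw", "lw", "stall",
                 "stall", "bne", "stall", "add", "add"]


def total_cycles(slots):
    """The last slot needs PIPELINE_DEPTH - 1 more cycles to drain."""
    return len(slots) + PIPELINE_DEPTH - 1


def in_wb(slots, cycle):
    """Entry in the WB stage at the given cycle, or None if the stage is empty."""
    index = cycle - PIPELINE_DEPTH   # 0-based slot that entered IF 4 cycles earlier
    return slots[index] if 0 <= index < len(slots) else None


for name, slots in [("without forwarding", NO_FORWARDING),
                    ("forwarding & MEM stage", FORWARDING_MEM),
                    ("forwarding & ID stage", FORWARDING_ID)]:
    print(name, total_cycles(slots), in_wb(slots, 10))
```

Running the sketch gives 22, 19, and 17 cycles, with a stall, a stall, and lw in WB at cycle 10, matching answers a, b, and d and the corrections for c, e, and f.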
