100% found this document useful (1 vote)

21 views8 pages

Disc08 Sols

The document discusses RISC-V pipelining and hazards, explaining concepts such as pipelining's impact on CPU performance, the necessity of stalling for certain hazards, and the structure of a pipelined CPU with its five stages. It outlines types of hazards including structural, data, and control hazards, and methods to resolve them, such as forwarding and stalling. Performance analysis is also provided, comparing single-cycle and pipelined datapaths, highlighting speedup factors and the implications of hazards on efficiency.

Uploaded by

fawad.sidd17

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (1 vote)

21 views8 pages

Disc08 Sols

Uploaded by

fawad.sidd17

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 8

CS 61C RISC-V Pipelining and Hazards

Fall 2022 Discussion 8

1 Pre-Check
This section is designed as a conceptual check for you to determine if you conceptually
understand and have any misconceptions about this topic. Please answer true/false
to the following questions, and include an explanation:

1.1 By pipelining the CPU datapath, each single instruction will execute faster (latency
is reduced), resulting in a speed-up in performance.

False. Because we implement registers between each stage of the datapath, the time
it takes for an instruction to finish execution through the 5 stages will be longer
than the single-cycle datapath we were first introduced with. A single instruction
will take multiple clock cycles to get through all the stages, with the clock cycle
based on the stage with the longest timing.

1.2 A pipelined CPU datapath results in instructions being executed with higher through-
put (than the single-cycle CPU).

True. Recall that throughput is the number of instructions processed per unit time.
Pipelining results in a higher throughput because more instructions are run at once,
which utilizes more parts of the datapath simultaneously.

1.3 Through adding additional hardware, we can implement two ’read’ ports as well
as a ’write’ port to the RegFile (where registers can be accessed). This solves the
hazard of two instructions reading and writing to the same register simultaneously.

False. The addition of independent ports to the RegFile allows for multiple instruc-
tions to access the RegFile at the same time (such as one instruction reading values
of two operands, while another instruction is writing to a return register). However,
this does not work if both instructions are reading and writing to the same register.
Some solutions to this data hazard could be to stall the latter instruction by 1 cycle
or to forward the read value from a previous instruction, bypassing the RegFile
completely.

1.4 All data hazards can be resolved with forwarding.

False. Hazards following lw cannot be fully resolved with forwarding because the
output is not known until after the MEM stage, making a stall necessary.

1.5 As stalling reduces performance significantly, we generally prefer other solutions to

fixing pipeline hazards, even at the cost of complexity or hardware. In a modern-day
pipelined CPU, are there still use-cases for stalling to resolve potential hazards? If
so, describe a program that would.

True. Say we have the RISC-V program where a0 is a pointer to an array of integers,
and we want to load a1 with the first element * 2:
2 RISC-V Pipelining and Hazards

lw t1 0(a0)
add t2 t1 t1
mv a1 t2

In this program, there are no other instructions to move into the load delay slot,
so we are forced to nop the next instruction and repeat it afterwards, essentially
stalling for one cycle. While we do have many tools and alternative solutions to
lessen possible performance loss, in some cases it is unavoidable.

2 Pipelining Registers
In order to pipeline, we separate the datapath into 5 discrete stages, each completing
a different function and accessing different resources on the way to executing an
entire instruction.

In the IF stage, we use the Program Counter to access our instruction as it is stored
in IMEM. Then, we separate the distinct parts we need from the instruction bits in
the ID stage and generate our immediate, the register values from the RegFile, and
other control signals. Afterwards, using these values and signals, we complete the
necessary ALU operations in the EX stage. Next, anything we do in regards with
DMEM (not to be confused with RegFile or IMEM) is done in the MEM stage,
before we hit the WB stage, where we write the computed value that we want back
into the return register in the RegFile.

These 5 stages, divided by registers as shown in the figure, allow the datapath
to provide a pipeline for multiple instructions to operate at the same time, each
accessing different resources. A pipelined datapath is provided for you on the last
page. Use it to answer the following questions.

2.1 What is the purpose of the new registers?

When we pipeline the datapath, the values from each stage need to be passed on at
each clock cycle. Each stage in the pipeline only operates on a small set of values,
but those values need to be correct with respect to the instruction that is currently
being processed. Say we use load word (lw) as an example: if it is in the EX stage,
then the EX stage should look like a snapshot of the single-cycle datapath. The
values on the rs1, rs2, immediate, and PC values should be as if lw was the only
instruction in the entire path. This also includes the control logic: the instruction is
passed in at each stage, the appropriate control signals are generated for the stage
of interest, and that stage can execute properly.

2.2 Looking at the way PC is passed through the datapath, there are two places where
+4 is added to the PC, once in the IF and MEM stage. Why do we add +4 to the
PC again in the memory stage?

We add +4 to the PC again in the memory stage so we don’t need to pass both
PC and PC+4 along the whole pipeline. This would use more registers, adding
unnecessary hardware. We also can’t just pass only PC+4 through the pipeline, as
we need the original PC value in operands like auipc.

2.3 Why do we need to save the instruction in a register multiple times?

RISC-V Pipelining and Hazards 3

We need to save the instruction in a register multiple times because each pipeline
stage needs to receive the right control signals for the instruction currently in that
stage.

3 Performance Analysis
Register clk-to-q 30 ps Branch comp. 75 ps Memory write 200 ps

Register setup 20 ps ALU 200 ps

RegFile read 100 ps
Register hold 10 ps Imm. Gen. 15 ps

Mux 25 ps Memory read 250 ps RegFile setup 20 ps

Given above are sample delays for each of the datapath components and register
timings. In the questions below, use these in conjunction with the pipelined datapath
on the last page to answer them.

3.1 What would be the fastest possible clock time for a single cycle datapath? Recall
from last week’s discussion that one instruction which exercises the critical path is
lw.
(HINT: tclk-cycle ≥ tclk-to-q + tlongest-combinational-path + tsetup )

tclk ≥ tPC clk-to-q + tIMEM read + tRF read + tmux + tALU + tDMEM read + tmux + tRF setup
≥ 30 + 250 + 100 + 25 + 200 + 250 + 25 + 20
≥ 900 ps

Note that the delay in the immediate generator as well as the branch comparator
are omitted because the immediate generator and branch comparison is done in
parallel with the RegFile read and ALU computation respectively, the latter two
taking much longer time.

3.2 What is the fastest possible clock time for a pipelined datapath?

IF : tPC clk-to-q + tIMEM read + tReg setup = 30 + 250 + 20 = 300 ps

ID : tReg clk-to-q + tRF read + tReg setup = 30 + 100 + 20 = 150 ps
EX : tReg clk-to-q + tmux + tALU + tReg setup = 30 + 25 + 200 + 20 = 275 ps
MEM : tReg clk-to-q + tDMEM read + tReg setup = 30 + 250 + 20 = 300 ps
WB : tReg clk-to-q + tmux + tRF setup = 30 + 25 + 20 = 75 ps

tclk ≥ max(IF, ID, EX, MEM, WB) = 300 ps

Again, the immediate generator and branch comparator delays are overshadowed by
the longer delays of RegFile read and ALU.
4 RISC-V Pipelining and Hazards

3.3 What is the speedup from the single cycle datapath to the pipelined datapath? Why
is the speedup less than 5×?
900 ps
300 ps ,or a 3 times speedup. The speedup is less than 5 because of (1) the necessity
of adding pipeline registers, which have clk-to-q and setup times, and (2) the need
to set the clock to the maximum of the five stages, which take different amounts of
time.

Note: Due to hazards, which require additional logic to resolve, the actual speedup
would likely be even less than 3 times.

4 Hazards
One of the costs of pipelining is that it introduces pipeline hazards. Hazards,
generally, are issues with something in the CPU’s instruction pipeline that could
cause the next instruction to execute incorrectly.

The 5-stage pipelined CPU introduces three types: structural hazards (hardware not
sufficient), data hazards (using wrong values in computation), and control hazards
(executing the wrong instruction).

Structural Hazards
Structural hazards occur when more than one instruction needs to use the same
datapath resource at the same time. In the standard 5-stage pipeline, there aren’t
structural hazards, unless there are active changes to the pipeline. The structural
hazards that could exist are prevented by RV32I’s hardware requirements.

There are two main causes of structural hazards:

• Register File: The register file is accessed both during ID, when it is read to
decode the instruction, and the corresponding register values; and during WB,
when it is written to in the rd register. If the RegFile only had one port, then
it wouldn’t work since we have one instruction being decoded and another
writing back.

– We resolve this by having separate read and write ports. However, this
only works if the read/written registers are different.

– To account for reads and writes to the same register, processors usually
write to the register during the first half of the clock cycle, and read from
it during in the second half. This is an implementation of the idea of
double pumping, which is when data is transferred along data buses at
double the rate, by utilising both the rising and falling clock edges in a
clock cycle.

• Main Memory: Main memory is accessed for both instructions and data.
If memory could only support one read/write at a time, then instruction A
going through IF and attempting to fetch an instruction from memory cannot
happen at the same time as instruction B attempting to read (or write) to
data portions of memory.
RISC-V Pipelining and Hazards 5

– Having a separate instruction memory (abbreviated IMEM) and data

memory (abbreviated DMEM) solves this hazard.

Something to remember about structural hazards is that they can always be resolved
by adding more hardware.

Data Hazards
Data hazards are caused by data dependencies between instructions. In CS 61C,
where we always assume that instructions go through the processor in order, we see
data hazards when an instruction reads a register before a previous instruction has
finished writing to that register.

There are two types of data hazards:

• EX-ID: this hazard exists because the output from the execute stage is not
written back to the RegFile until the writeback stage, yet can be requested by
the subsequent instruction in the decode stage.

• MEM-ID: this hazard exists because the output from the memory access
stage is not written back to the RegFile until the writeback stage, but can be
requested from the decode stage, just as in EX-ID.

Solving Data Hazards

For all questions, assume no branch prediction or double-pumping.

Forwarding

Most data hazards can be resolved by forwarding, which is when the result of the
EX or MEM stage is sent to the EX stage for a following instruction to use.

4.1 Look for data hazards in the code below, and figure out how forwarding could be
used to solve them.
Instruction C1 C2 C3 C4 C5 C6 C7
1. addi t0, a0, -1 IF ID EX MEM WB
2. and s2, t0, a0 IF ID EX MEM WB
3. sltiu a0, t0, 5 IF ID EX MEM WB
There are two data hazards, between instructions 1 and 2, and between instructions
1 and 3. The first could be resolved by forwarding the result of the EX stage in
C3 to the beginning of the EX stage in C4, and the second could be resolved by
forwarding the result of the MEM stage in C4 to the beginning of the EX stage in
C5.

4.2 Imagine you are a hardware designer working on a CPU’s forwarding control logic.
How many instructions after the addi instruction could be affected by data hazards
created by this addi instruction?

Three instructions. For example, with the addi instruction, any instruction that uses
t0 that has its ID stage in C3, C4, or C5 will not have the result of addi’s writeback
in C5. If, however, we are allowed to assume double-pumping (write-then-read to
registers), then it would only affect two instructions since the ID stage of instruction
6 RISC-V Pipelining and Hazards

4 would be allowed to line up with the WB stage of intruction 1. (Side note: how
is this implemented in hardware? We add 2 wires: one from the beginning of the
MEM stage for the output of the ALU and one from the beginning of the WB stage.
Both of these wires will connect to the A/B muxes in the EX stage.)

Stalls

4.3 Look for data hazards in the code below. One of them cannot be solved with
forwarding—why? What can we do to solve this hazard?
Instruction C1 C2 C3 C4 C5 C6 C7 C8
1. addi s0, s0, 1 IF ID EX MEM WB
2. addi t0, t0, 4 IF ID EX MEM WB
3. lw t1, 0(t0) IF ID EX MEM WB
4. add t2, t1, x0 IF ID EX MEM WB
There are two data hazards in the code. The first hazard is between instructions
2 and 3, from t0, and the second is between instructions 3 and 4, from t1. The
hazard between instructions 2 and 3 can be resolved with forwarding, but the hazard
between instructions 3 and 4 cannot be resolved with forwarding. This is because
even with forwarding, instruction 4 needs the result of instruction 3 at the beginning
of C6, and it won’t be ready until the end of C6.

We can fix this by stalling: insert a nop (no-operation) between instructions 3 and 4.

4.4 Say you are the compiler and can re-order instructions to minimize data hazards
while guaranteeing the same output. How can you fix the code above?

Reorder the instructions 2-3-1-4, because instruction 1 has no dependencies.

Detecting Data Hazards

Say we have the rs1, rs2, RegWEn, and rd signals for two instructions (instruction
n and instruction n + 1) and we wish to determine if a data hazard exists across the
instructions. We can simply check to see if the rd for instruction n matches either
rs1 or rs2 of instruction n + 1, indicating that such a hazard exists (why does this
make sense?).

We could then use our hazard detection to determine which forwarding paths/number
of stalls (if any) are necessary to take to ensure proper instruction execution. In
pseudo-code, part of this could look something like the following:

if (rs1(n + 1) == rd(n) && RegWen(n) == 1) {

set ASel for (n + 1) to forward ALU output from n
}
if (rs2(n + 1) == rd(n) && RegWen(n) == 1) {
set BSel for (n + 1) to forward ALU output from n
}
RISC-V Pipelining and Hazards 7

Control Hazards
Control hazards are caused by jump and branch instructions, because for all
jumps and some branches, the next PC is not PC + 4, but the result of the ALU
available after the EX stage. We could stall the pipeline for control hazards, but
this decreases performance.

4.5 Besides stalling, what can we do to resolve control hazards?

We can predict which way branches will go, and when this prediction is incorrect,
flush the pipeline and continue with the correct instruction. (The most naive
prediction method is to simply predict that branches are always not taken).

Extra for Experience

4.6 Given the RISC-V code above and a pipelined CPU with no forwarding, how many
hazards would there be? What types are each hazard? Consider all possible hazards
between all instructions.

How many stalls would there need to be in order to fix the data hazard(s), if we use
double-pumping? What about the control hazard(s), if we use branch prediction?
Instruction C1 C2 C3 C4 C5 C6 C7 C8 C9
1. sub t1, s0, s1 IF ID EX MEM WB
2. or s0, t0, t1 IF ID EX MEM WB
3. sw s1, 100(s0) IF ID EX MEM WB
4. bgeu s0, s2, loop IF ID EX MEM WB
5. add t2, x0, x0 IF ID EX MEM WB
There are four hazards: between instructions 1 and 2 (data hazard from t1), instruc-
tions 2 and 3 (data hazard from s0), instructions 2 and 4 (from s0), and instructions
4 and 5 (a control hazard).

Assuming that we can read and write to the RegFile on the same cycle, two stalls are
needed between instructions 1 and 2 (WB→ID), and two stalls are needed between
instructions 2 and 3 (WB→ID). For the control hazard, if we predicted correctly,
then no stalls are needed, but if we predicted incorrectly, then we need 3 stalls
while flushing the pipeline (MEM→1 cycle before IF). We don’t need to stall for the
hazard between 2 and 4 because stalling for instruction 3 already handles that.
8 RISC-V Pipelining and Hazards

Discussion8 2
No ratings yet
Discussion8 2
6 pages
CAO Fall 2024 Lecture 07 RISC V Pipelined Implementation
No ratings yet
CAO Fall 2024 Lecture 07 RISC V Pipelined Implementation
114 pages
Week 11
No ratings yet
Week 11
33 pages
SRM Pipelining 05
No ratings yet
SRM Pipelining 05
42 pages
Lec11 Pipeline 1 Notes
No ratings yet
Lec11 Pipeline 1 Notes
26 pages
Chapter - 04 RISC V
No ratings yet
Chapter - 04 RISC V
132 pages
2014fa CS61C L31 DG PipelineII 6up
No ratings yet
2014fa CS61C L31 DG PipelineII 6up
4 pages
Lec14 Pipeline Riscv - Key
No ratings yet
Lec14 Pipeline Riscv - Key
58 pages
RISC-V Pipeline Performance
No ratings yet
RISC-V Pipeline Performance
144 pages
Lec 1
No ratings yet
Lec 1
30 pages
Module 2
No ratings yet
Module 2
64 pages
Implementation of Multi Stage Processor
No ratings yet
Implementation of Multi Stage Processor
16 pages
FemtoRV32 Piplined Processor Report
No ratings yet
FemtoRV32 Piplined Processor Report
25 pages
Multi-Core Computer Architecture: Instruction Pipeline Hazards
No ratings yet
Multi-Core Computer Architecture: Instruction Pipeline Hazards
23 pages
05 Pipelining
No ratings yet
05 Pipelining
37 pages
Presentation 1
No ratings yet
Presentation 1
22 pages
Chapter 4 Part 2
No ratings yet
Chapter 4 Part 2
50 pages
Pipelining 2019
No ratings yet
Pipelining 2019
82 pages
Chapter 04 RISC V Removed
No ratings yet
Chapter 04 RISC V Removed
99 pages
Advanced Pipelining for CE Students
No ratings yet
Advanced Pipelining for CE Students
43 pages
Computer Architecture and Organization
No ratings yet
Computer Architecture and Organization
49 pages
Lecture # 8B
No ratings yet
Lecture # 8B
20 pages
CPU Pipelining and Cache Basics
No ratings yet
CPU Pipelining and Cache Basics
61 pages
MIPS Pipelining and Hazards
0% (1)
MIPS Pipelining and Hazards
38 pages
Lecture-5-09 01 2025
No ratings yet
Lecture-5-09 01 2025
25 pages
L04 Pipelining
No ratings yet
L04 Pipelining
48 pages
Week 12
No ratings yet
Week 12
41 pages
Computer Architecture: Introduction To The Concept of Pipelined Processor
No ratings yet
Computer Architecture: Introduction To The Concept of Pipelined Processor
20 pages
Ca07 2014 PDF
No ratings yet
Ca07 2014 PDF
56 pages
Computer Architecture: Pipelining: Dr. Ashok Kumar Turuk
No ratings yet
Computer Architecture: Pipelining: Dr. Ashok Kumar Turuk
136 pages
CS 162 Computer Architecture Lecture 3: Pipelining Contd.: Instructor: L.N. Bhuyan
No ratings yet
CS 162 Computer Architecture Lecture 3: Pipelining Contd.: Instructor: L.N. Bhuyan
21 pages
IT3030E CA Chap5 CPU - Removed
No ratings yet
IT3030E CA Chap5 CPU - Removed
26 pages
Chapter 04 Processor 2
No ratings yet
Chapter 04 Processor 2
28 pages
Chapter 04 Processor 3.5
No ratings yet
Chapter 04 Processor 3.5
52 pages
Disc07 Sols
No ratings yet
Disc07 Sols
6 pages
15IF11 Multicore A PDF
No ratings yet
15IF11 Multicore A PDF
64 pages
CS530 Fall2015 Lecture9
No ratings yet
CS530 Fall2015 Lecture9
5 pages
CS 61C: Great Ideas in Computer Architecture Lecture 13: Pipelining
No ratings yet
CS 61C: Great Ideas in Computer Architecture Lecture 13: Pipelining
47 pages
Chapter4 2
No ratings yet
Chapter4 2
34 pages
FPGA
No ratings yet
FPGA
10 pages
Analysis of CPU
No ratings yet
Analysis of CPU
9 pages
Pipelined Datapath and Control
No ratings yet
Pipelined Datapath and Control
37 pages
Pipelining Lecture
No ratings yet
Pipelining Lecture
39 pages
Pipelining - Modified1
No ratings yet
Pipelining - Modified1
51 pages
Pipeline Architecture
No ratings yet
Pipeline Architecture
33 pages
Pipelining and Parallelism
No ratings yet
Pipelining and Parallelism
41 pages
Reduced Instruction Set Computers Pipelining: (RISC)
No ratings yet
Reduced Instruction Set Computers Pipelining: (RISC)
25 pages
Chapter 10 Principles of Pipelining
No ratings yet
Chapter 10 Principles of Pipelining
124 pages
Pipeline Hazards
No ratings yet
Pipeline Hazards
61 pages
Pipelining
No ratings yet
Pipelining
43 pages
Pipelining Preview: Basics & Challenges
No ratings yet
Pipelining Preview: Basics & Challenges
75 pages
Embedded Computer Architecture 5SAI0
No ratings yet
Embedded Computer Architecture 5SAI0
59 pages
L04 Pipelining
No ratings yet
L04 Pipelining
38 pages
Lecture 8 Chapter - 04 RISC-V Pipelining - Student Version
No ratings yet
Lecture 8 Chapter - 04 RISC-V Pipelining - Student Version
59 pages
Pipelinehazard For Class
No ratings yet
Pipelinehazard For Class
61 pages
Pipelinehazard 160823134502
No ratings yet
Pipelinehazard 160823134502
61 pages
Software Process Models
No ratings yet
Software Process Models
41 pages
Requirements Engineering I
No ratings yet
Requirements Engineering I
29 pages
Requirements Engineering II
No ratings yet
Requirements Engineering II
37 pages
FFT Full
No ratings yet
FFT Full
6 pages
System Modelling
No ratings yet
System Modelling
37 pages
Workout
No ratings yet
Workout
5 pages
Optional (If You Have Experience)
No ratings yet
Optional (If You Have Experience)
1 page
17.8.1 - CCN Lab - Documentation
No ratings yet
17.8.1 - CCN Lab - Documentation
1 page
Disc09 Sols
No ratings yet
Disc09 Sols
7 pages
CCN Assignment01
No ratings yet
CCN Assignment01
7 pages
Toa Final Spring 2024
No ratings yet
Toa Final Spring 2024
7 pages
Automata Sessional-I (Solution) (Fall-2020)
No ratings yet
Automata Sessional-I (Solution) (Fall-2020)
12 pages
Lect5 PDF
No ratings yet
Lect5 PDF
21 pages
CPU-OS Simulator: Pipelining, Scheduling, Multithreading
No ratings yet
CPU-OS Simulator: Pipelining, Scheduling, Multithreading
8 pages
Pipeline in ARM
No ratings yet
Pipeline in ARM
10 pages
Quiz #1 - Verbal: What Registers Are Used To Store The Program Counter and Link Register?
No ratings yet
Quiz #1 - Verbal: What Registers Are Used To Store The Program Counter and Link Register?
6 pages
Coa Unit 4
No ratings yet
Coa Unit 4
10 pages
Answer 1:: Computer Science and Engineering Department Computer Organisation and Architecture (SE-204)
No ratings yet
Answer 1:: Computer Science and Engineering Department Computer Organisation and Architecture (SE-204)
4 pages
Microcode and Microsequencer Reference Sheet: Asel / Bsel / Dsel / Fsel / Updf / Mux1 / Mux2 / Data / Misc
No ratings yet
Microcode and Microsequencer Reference Sheet: Asel / Bsel / Dsel / Fsel / Updf / Mux1 / Mux2 / Data / Misc
2 pages
The Intel Pen Ti Um Processor
No ratings yet
The Intel Pen Ti Um Processor
12 pages
Instruction Set of 8085
No ratings yet
Instruction Set of 8085
122 pages
Computer Instructions
No ratings yet
Computer Instructions
12 pages
Micro Programming
No ratings yet
Micro Programming
15 pages
16-Instruction Set-19-01-2023
No ratings yet
16-Instruction Set-19-01-2023
13 pages
Instruction-Level Parallelism Explained
No ratings yet
Instruction-Level Parallelism Explained
3 pages
Assignment Questions
No ratings yet
Assignment Questions
3 pages
Assembly Instruction Guide
No ratings yet
Assembly Instruction Guide
9 pages
Advanced Computer Architectures
100% (6)
Advanced Computer Architectures
29 pages
Pipelining Seminar
No ratings yet
Pipelining Seminar
14 pages
8085, 8051, 8086 Scrolling Display Programs
No ratings yet
8085, 8051, 8086 Scrolling Display Programs
6 pages
Addressing Mode of 8085 Microprocessor
No ratings yet
Addressing Mode of 8085 Microprocessor
13 pages
94年台大資工計系詳解 (CA)
No ratings yet
94年台大資工計系詳解 (CA)
5 pages
4 MultiIssue 2024
No ratings yet
4 MultiIssue 2024
174 pages
Unit 3-Control Unit Operation
No ratings yet
Unit 3-Control Unit Operation
59 pages
Pipeline Hazards
No ratings yet
Pipeline Hazards
94 pages
Lec14 Control Hazards
No ratings yet
Lec14 Control Hazards
48 pages
High Performance Computing System (CSE 5154) RCS
No ratings yet
High Performance Computing System (CSE 5154) RCS
1 page
AStudyof Techniquesto Increase Instruction Level Parallelism
No ratings yet
AStudyof Techniquesto Increase Instruction Level Parallelism
6 pages
Intel x86 Assembler Instruction Set Opcode Table
No ratings yet
Intel x86 Assembler Instruction Set Opcode Table
4 pages
Classification of Parallel Architecture Designs
No ratings yet
Classification of Parallel Architecture Designs
22 pages
Computer Organization and Architecture Pipelining Set Execution, Stages and Throughput
No ratings yet
Computer Organization and Architecture Pipelining Set Execution, Stages and Throughput
7 pages
Out-of-Order Superscalar Optimization
No ratings yet
Out-of-Order Superscalar Optimization
156 pages

Disc08 Sols

Uploaded by

Disc08 Sols

Uploaded by

CS 61C RISC-V Pipelining and Hazards

Fall 2022 Discussion 8

1.4 All data hazards can be resolved with forwarding.

1.5 As stalling reduces performance significantly, we generally prefer other solutions to

2.1 What is the purpose of the new registers?

2.3 Why do we need to save the instruction in a register multiple times?

Register setup 20 ps ALU 200 ps

Mux 25 ps Memory read 250 ps RegFile setup 20 ps

IF : tPC clk-to-q + tIMEM read + tReg setup = 30 + 250 + 20 = 300 ps

tclk ≥ max(IF, ID, EX, MEM, WB) = 300 ps

There are two main causes of structural hazards:

– Having a separate instruction memory (abbreviated IMEM) and data

There are two types of data hazards:

Solving Data Hazards

Reorder the instructions 2-3-1-4, because instruction 1 has no dependencies.

Detecting Data Hazards

if (rs1(n + 1) == rd(n) && RegWen(n) == 1) {

4.5 Besides stalling, what can we do to resolve control hazards?

Extra for Experience

You might also like