Lec7 Pipelining

Chapter Six discusses pipelining in computer architecture, emphasizing its role in improving instruction throughput and performance. It outlines the benefits and challenges of pipelining, including structural, control, and data hazards, and introduces concepts like forwarding and hazard detection units to mitigate these issues. The chapter also covers the stages of a pipelined datapath and the control mechanisms necessary for effective operation.

Uploaded by

altunokelifmerve24

Chapter Six

Pipelining

Pipelining

• Improve performance by increasing instruction throughput

[Figure: unpipelined execution of lw $1, 100($0); lw $2, 200($0); lw $3, 300($0). Each instruction passes through instruction fetch, Reg, ALU, data access, and Reg, and a new instruction starts every 8 ns (time axis 2 to 18 ns).]

[Figure: pipelined execution of the same three instructions. Each stage takes 2 ns and a new instruction starts every 2 ns (time axis 2 to 14 ns).]

Ideal speedup is number of stages in the pipeline. Do we achieve this?
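The question above can be checked with a little arithmetic on the numbers in the two figures. The following is a minimal sketch in Python, assuming an unpipelined instruction takes 8 ns and the pipelined machine has five stages with a 2 ns cycle, as drawn; the function names are illustrative and do not come from the slides.

```python
def unpipelined_time(n_instructions, instr_ns=8):
    """Total time when each instruction finishes before the next one starts."""
    return n_instructions * instr_ns


def pipelined_time(n_instructions, n_stages=5, cycle_ns=2):
    """Total time when a new instruction enters the pipeline every cycle."""
    return (n_stages + n_instructions - 1) * cycle_ns


def speedup(n_instructions):
    """Ratio of unpipelined to pipelined execution time."""
    return unpipelined_time(n_instructions) / pipelined_time(n_instructions)


if __name__ == "__main__":
    for n in (3, 1000, 1_000_000):
        print(n, round(speedup(n), 3))
```

For the three lw instructions this gives 24 ns against 14 ns, a speedup of about 1.7. For long instruction streams the ratio approaches 8 / 2 = 4 rather than 5 under these assumed timings, so the ideal speedup is not reached here.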

Pipelining

• What makes it easy?

– all instructions are the same length
– just a few instruction formats
– memory operands appear only in loads and stores

• What makes it hard?

– structural hazards: suppose we had only one memory
– control hazards: need to worry about branch instructions
– data hazards: an instruction depends on a previous instruction

• We’ll build a simple pipeline and look at these issues

• We’ll talk about modern processors and what really makes it hard:
– exception handling
– trying to improve performance with out-of-order execution, etc.

Basic Idea

IF: Instruction fetch
ID: Instruction decode / register file read
EX: Execute / address calculation
MEM: Memory access
WB: Write back

[Figure: single-cycle datapath divided into the five stages above: PC, instruction memory, and the PC + 4 adder; register file and sign extend (16 to 32 bits); ALU, shift left 2, and branch-target adder; data memory; write-back mux.]
• What do we need to add to actually split the datapath into stages?

Pipelined Datapath

[Figure: the same datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB placed between the stages.]

Can you find a problem even if there are no dependencies?

What instructions can we execute to manifest the problem?
Corrected Datapath

[Figure: corrected pipelined datapath, again with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]
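The extracted text does not show what changed between the two datapaths. In the standard textbook version of this figure, the uncorrected design takes the write register number from IF/ID, while the corrected design carries it along through ID/EX, EX/MEM, and MEM/WB. The sketch below illustrates the effect under that assumption; the function and the (name, destination) program encoding are hypothetical.

```python
def write_register_seen_at_wb(program, i, corrected):
    """Register number written while instruction i (0-based) is in WB.

    program is a list of (name, dest_reg) pairs, issued one per cycle.
    Returns None if no later instruction occupies ID at that time.
    """
    if corrected:
        # The destination number travelled with the instruction
        # through the pipeline registers.
        return program[i][1]
    # Uncorrected (assumed): the number comes from whatever instruction
    # sits in IF/ID, which is three instructions behind the one in WB.
    later = i + 3
    return program[later][1] if later < len(program) else None


program = [("lw", 10), ("sub", 11), ("add", 12), ("and", 13), ("or", 14)]

if __name__ == "__main__":
    print(write_register_seen_at_wb(program, 0, corrected=False))
    print(write_register_seen_at_wb(program, 0, corrected=True))
```

In this sketch a load followed by other instructions is enough to manifest the problem: the loaded value lands in the destination of a later, unrelated instruction even though there are no dependencies.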

Graphically Representing Pipelines

Time (in clock cycles)

Program execution order (in instructions)
                      CC 1    CC 2    CC 3    CC 4    CC 5    CC 6
lw $10, 20($1)        IM      Reg     ALU     DM      Reg
sub $11, $2, $3               IM      Reg     ALU     DM      Reg

• Can help with answering questions like:

– how many cycles does it take to execute this code?
– what is the ALU doing during cycle 4?
– use this representation to help understand datapaths
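The questions in the list above can be answered mechanically from the diagram. The following is a small sketch, assuming no stalls and one instruction issued per cycle; the helper names are illustrative.

```python
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]


def stage_in_cycle(issue_index, cycle):
    """Stage label an instruction occupies in a 1-based clock cycle, or None."""
    position = cycle - 1 - issue_index
    return STAGES[position] if 0 <= position < len(STAGES) else None


def total_cycles(n_instructions):
    """Cycles needed to finish n instructions with no stalls."""
    return len(STAGES) + n_instructions - 1


def alu_user(program, cycle):
    """Instruction whose ALU (third) stage falls in the given cycle, if any."""
    index = cycle - 3
    return program[index] if 0 <= index < len(program) else None


program = ["lw $10, 20($1)", "sub $11, $2, $3"]

if __name__ == "__main__":
    print(total_cycles(len(program)))
    print(alu_user(program, 4))
```

For the two-instruction example the code takes 6 cycles, and in cycle 4 the ALU is working on the sub instruction while lw is in its data memory stage.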

Pipeline Control

[Figure: pipelined datapath with the control signals identified: PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, and MemtoReg. The ALU control block takes a 6-bit field from Instruction [15-0]; a mux controlled by RegDst chooses between Instruction [20-16] and Instruction [15-11].]

Pipeline Control

• We have 5 stages. What needs to be controlled in each stage?

– Instruction Fetch and PC Increment
– Instruction Decode / Register Fetch
– Execution
– Memory Stage
– Write Back

• How would control be handled in an automobile plant?

– a fancy control center telling everyone what to do?
– should we use a finite state machine?

Pipeline Control

• Pass control signals along just like the data

               Execution/address calculation     Memory access stage          Write-back stage
               stage control lines               control lines                control lines
Instruction    RegDst  ALUOp1  ALUOp0  ALUSrc    Branch  MemRead  MemWrite    RegWrite  MemtoReg
R-format       1       1       0       0         0       0        0           1         0
lw             0       0       0       1         0       1        0           1         1
sw             X       0       0       1         0       0        1           0         X
beq            X       0       1       0         1       0        0           0         X

[Figure: the control unit decodes the instruction held in IF/ID. ID/EX carries the EX, M, and WB control fields; EX/MEM carries M and WB; MEM/WB carries WB.]
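The idea of passing control signals along like data can be sketched as a lookup table whose groups are peeled off stage by stage. The values are copied from the table above; the dictionary layout and the function name are illustrative assumptions, not part of the slides.

```python
# Control values per instruction class; "X" marks a don't-care value.
CONTROL = {
    "R-format": {
        "EX": {"RegDst": 1, "ALUOp1": 1, "ALUOp0": 0, "ALUSrc": 0},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX": {"RegDst": 0, "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB": {"RegWrite": 1, "MemtoReg": 1},
    },
    "sw": {
        "EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 0, "ALUSrc": 1},
        "M": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB": {"RegWrite": 0, "MemtoReg": "X"},
    },
    "beq": {
        "EX": {"RegDst": "X", "ALUOp1": 0, "ALUOp0": 1, "ALUSrc": 0},
        "M": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB": {"RegWrite": 0, "MemtoReg": "X"},
    },
}


def control_in_pipeline_registers(instruction):
    """Show which control groups each pipeline register still holds."""
    signals = CONTROL[instruction]
    # All three groups are generated in ID and latched into ID/EX.
    id_ex = {group: signals[group] for group in ("EX", "M", "WB")}
    # The EX group is consumed in the EX stage; the rest moves on.
    ex_mem = {group: id_ex[group] for group in ("M", "WB")}
    # The M group is consumed in the MEM stage; only WB remains.
    mem_wb = {"WB": ex_mem["WB"]}
    return {"ID/EX": id_ex, "EX/MEM": ex_mem, "MEM/WB": mem_wb}


if __name__ == "__main__":
    for name, fields in control_in_pipeline_registers("lw").items():
        print(name, fields)
```

Each stage uses only its own group, so no central controller has to track which instruction is where.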

Datapath with Control
[Figure: pipelined datapath with the control unit added. Control values generated in ID travel through the EX, M, and WB fields of ID/EX, EX/MEM, and MEM/WB and drive PCSrc, RegWrite, ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, and MemtoReg.]

Dependencies

• Problem with starting next instruction before first is finished

– dependencies that “go backward in time” are data hazards

Time (in clock cycles)

                      CC 1    CC 2    CC 3    CC 4    CC 5    CC 6    CC 7    CC 8    CC 9
Value of register $2: 10      10      10      10      10/-20  -20     -20     -20     -20

Program execution order (in instructions)
sub $2, $1, $3        IM      Reg     ALU     DM      Reg
and $12, $2, $5               IM      Reg     ALU     DM      Reg
or $13, $6, $2                        IM      Reg     ALU     DM      Reg
add $14, $2, $2                               IM      Reg     ALU     DM      Reg
sw $15, 100($2)                                       IM      Reg     ALU     DM      Reg
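Which of the instructions after sub actually read a stale $2 can be worked out from the cycle numbers in the diagram. The sketch below assumes one instruction per cycle, register reads in the second pipeline stage, register writes in the fifth, and a register file in which a write is visible to a read in the same cycle; the function name is illustrative.

```python
def reads_stale_value(producer_index, consumer_index):
    """True if the consumer reads the register file before the producer writes.

    Instruction i (0-based) reads registers in cycle i + 2 and writes its
    result in cycle i + 5. A read in the same cycle as the write is assumed
    to see the new value.
    """
    write_cycle = producer_index + 5
    read_cycle = consumer_index + 2
    return read_cycle < write_cycle


program = [
    "sub $2, $1, $3",
    "and $12, $2, $5",
    "or $13, $6, $2",
    "add $14, $2, $2",
    "sw $15, 100($2)",
]

# Every instruction after sub reads $2, so only the timing matters here.
hazards = [program[i] for i in range(1, len(program)) if reads_stale_value(0, i)]

if __name__ == "__main__":
    print(hazards)
```

Under these assumptions only and and or see the old value 10; add reads in CC 5, the cycle in which sub writes, and sw reads afterwards.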

Software Solution

• Have compiler guarantee no hazards

• Where do we insert the “nops” ?

sub $2, $1, $3

and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)

• Problem: this really slows us down!
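A compiler pass that guarantees no hazards can be sketched as follows. This is a simplified illustration, assuming no forwarding and a register file that lets a read see a write made in the same cycle, so a consumer must be at least three instruction slots behind its producer. The parsing helper is hypothetical and handles only the instruction forms used on this slide.

```python
import re


def parse(instruction):
    """Return (destination, sources); sw and nop write no register."""
    op = instruction.split()[0]
    regs = re.findall(r"\$\w+", instruction)
    if op == "nop":
        return None, []
    if op == "sw":
        return None, regs
    return regs[0], regs[1:]


def insert_nops(program, min_distance=3):
    """Insert nops so every consumer is min_distance slots after its producer."""
    scheduled = []
    for instruction in program:
        _, sources = parse(instruction)
        needed = 0
        # Only the last (min_distance - 1) scheduled slots can be too close.
        recent = scheduled[-(min_distance - 1):]
        for back, earlier in enumerate(reversed(recent), start=1):
            dest, _ = parse(earlier)
            if dest is not None and dest in sources:
                needed = max(needed, min_distance - back)
        scheduled.extend(["nop"] * needed)
        scheduled.append(instruction)
    return scheduled


program = [
    "sub $2, $1, $3",
    "and $12, $2, $5",
    "or $13, $6, $2",
    "add $14, $2, $2",
    "sw $15, 100($2)",
]

if __name__ == "__main__":
    for line in insert_nops(program):
        print(line)
```

For this sequence the sketch places two nops directly after sub, which lengthens a five-instruction program to seven slots and shows why the software-only approach is costly.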

Forwarding

• Use temporary results, don’t wait for them to be written

– register file forwarding to handle read/write to same register
– ALU forwarding
Time (in clock cycles)

                      CC 1    CC 2    CC 3    CC 4    CC 5    CC 6    CC 7    CC 8    CC 9
Value of register $2: 10      10      10      10      10/-20  -20     -20     -20     -20
Value of EX/MEM:      X       X       X       -20     X       X       X       X       X
Value of MEM/WB:      X       X       X       X       -20     X       X       X       X

Program execution order (in instructions)
sub $2, $1, $3        IM      Reg     ALU     DM      Reg
and $12, $2, $5               IM      Reg     ALU     DM      Reg
or $13, $6, $2                        IM      Reg     ALU     DM      Reg
add $14, $2, $2                               IM      Reg     ALU     DM      Reg
sw $15, 100($2)                                       IM      Reg     ALU     DM      Reg

What if this $2 were $13?

Forwarding

[Figure: pipelined datapath with a forwarding unit. Multiplexers at the ALU inputs select between the register file values and results held in EX/MEM and MEM/WB. The forwarding unit receives the Rs and Rt fields (IF/ID.RegisterRs, IF/ID.RegisterRt) together with EX/MEM.RegisterRd and MEM/WB.RegisterRd; IF/ID.RegisterRt and IF/ID.RegisterRd feed the destination-register mux.]
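The figure names the forwarding unit's inputs but not its decision rule. The sketch below uses the usual textbook conditions, which are an assumption here rather than something stated on the slide: forward from EX/MEM when the instruction there writes the needed register, otherwise from MEM/WB, and never forward register 0. The dictionary keys and return labels are illustrative.

```python
def forward_select(operand_reg, ex_mem, mem_wb):
    """Choose the source for one ALU operand.

    operand_reg is the Rs or Rt register number of the instruction in EX.
    ex_mem and mem_wb are dicts with "RegWrite" and "RegisterRd".
    Returns "EX/MEM", "MEM/WB", or "REG" (use the register file value).
    """
    # The most recent result wins, so EX/MEM is checked first.
    if (ex_mem["RegWrite"] and ex_mem["RegisterRd"] != 0
            and ex_mem["RegisterRd"] == operand_reg):
        return "EX/MEM"
    if (mem_wb["RegWrite"] and mem_wb["RegisterRd"] != 0
            and mem_wb["RegisterRd"] == operand_reg):
        return "MEM/WB"
    return "REG"


if __name__ == "__main__":
    sub_result = {"RegWrite": 1, "RegisterRd": 2}
    and_result = {"RegWrite": 1, "RegisterRd": 12}
    idle = {"RegWrite": 0, "RegisterRd": 0}
    # and $12, $2, $5 in EX while sub is in MEM: Rs = $2 comes from EX/MEM.
    print(forward_select(2, sub_result, idle))
    # or $13, $6, $2 in EX one cycle later: Rt = $2 comes from MEM/WB.
    print(forward_select(2, and_result, sub_result))
```

Checking EX/MEM before MEM/WB matters when both hold a value for the same register, since the newer result must be the one forwarded.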

Can't always forward

• Load word can still cause a hazard:

– an instruction tries to read a register following a load instruction that writes to the same register.

Time (in clock cycles)

Program execution order (in instructions)
                      CC 1    CC 2    CC 3    CC 4    CC 5    CC 6    CC 7    CC 8    CC 9
lw $2, 20($1)         IM      Reg     ALU     DM      Reg
and $4, $2, $5                IM      Reg     ALU     DM      Reg
or $8, $2, $6                         IM      Reg     ALU     DM      Reg
add $9, $4, $2                                IM      Reg     ALU     DM      Reg
slt $1, $6, $7                                        IM      Reg     ALU     DM      Reg

• Thus, we need a hazard detection unit to “stall” the instruction that follows the load

Stalling

• We can stall the pipeline by keeping an instruction in the same stage

Time (in clock cycles)

Program execution order (in instructions)
                      CC 1    CC 2    CC 3    CC 4    CC 5    CC 6    CC 7    CC 8    CC 9    CC 10
lw $2, 20($1)         IM      Reg     ALU     DM      Reg
and $4, $2, $5                IM      Reg     Reg     ALU     DM      Reg
or $8, $2, $6                         IM      IM      Reg     ALU     DM      Reg
add $9, $4, $2                                        IM      Reg     ALU     DM      Reg
slt $1, $6, $7                                                IM      Reg     ALU     DM      Reg

(bubble: inserted into the pipeline behind lw in CC 4)

Hazard Detection Unit

• Stall by letting an instruction that won’t write anything go forward

[Figure: forwarding datapath with a hazard detection unit added. Its inputs are ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt; its outputs are PCWrite, IF/IDWrite, and the select of a mux that replaces the control values entering ID/EX with 0.]
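The signals named in the figure suggest the following stall test. This is a sketch under the usual textbook rule, assumed here: stall when the instruction in EX is a load whose destination (Rt) matches a source register of the instruction in ID. The output key zero_control is a hypothetical name for the select of the mux that forces the control values to 0.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID needs the value a load in EX will fetch."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def stall_outputs(stall):
    """Outputs of the unit: hold PC and IF/ID, and send a bubble into ID/EX."""
    return {
        "PCWrite": 0 if stall else 1,       # 0 keeps the PC unchanged
        "IF/IDWrite": 0 if stall else 1,    # 0 keeps the same instruction in ID
        "zero_control": 1 if stall else 0,  # 1 turns the EX-bound slot into a nop
    }


if __name__ == "__main__":
    # lw $2, 20($1) in EX (Rt = 2), and $4, $2, $5 in ID (Rs = 2, Rt = 5).
    stall = must_stall(1, 2, 2, 5)
    print(stall, stall_outputs(stall))
```

Zeroing the control values is what lets "an instruction that won't write anything go forward": the bubble moves down the pipeline while the stalled instructions stay where they are for one cycle.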

Branch Hazards

• When we decide to branch, other instructions are in the pipeline!

Time (in clock cycles)

Program execution order (in instructions)
                      CC 1    CC 2    CC 3    CC 4    CC 5    CC 6    CC 7    CC 8    CC 9
40 beq $1, $3, 7      IM      Reg     ALU     DM      Reg
44 and $12, $2, $5            IM      Reg     ALU     DM      Reg
48 or $13, $6, $2                     IM      Reg     ALU     DM      Reg
52 add $14, $2, $2                            IM      Reg     ALU     DM      Reg
72 lw $4, 50($7)                                      IM      Reg     ALU     DM      Reg

• We are predicting “branch not taken”

– need to add hardware for flushing instructions if we are wrong
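The addresses in the diagram can be reproduced with a short calculation. The sketch below assumes the MIPS convention that a beq offset counts words relative to PC + 4, and that the branch outcome is known in the fourth stage, as the diagram suggests by fetching the target in CC 5; the function names are illustrative.

```python
def branch_target(branch_pc, word_offset):
    """Target address of a beq: PC + 4 plus the offset shifted left by 2."""
    return branch_pc + 4 + (word_offset << 2)


def wrongly_fetched(branch_pc, resolve_stage=4):
    """Addresses fetched behind a taken branch before it resolves."""
    return [branch_pc + 4 * k for k in range(1, resolve_stage)]


if __name__ == "__main__":
    print(branch_target(40, 7))
    print(wrongly_fetched(40))
```

For beq at address 40 with offset 7 the target is 72, and the instructions at 44, 48, and 52 are the ones that must be flushed when the not-taken prediction is wrong. Calling wrongly_fetched(40, resolve_stage=2) shows that deciding the branch in ID would leave only one instruction to flush.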
Flushing Instructions

[Figure: datapath with the branch hardware moved to the ID stage: a shift left 2 and adder for the branch target, an equality comparator (=) on the register file outputs, and an IF.Flush signal applied to IF/ID. The hazard detection unit and forwarding unit are retained.]

Improving Performance

• Try to avoid stalls! E.g., reorder these instructions:

lw $t0, 0($t1)
lw $t2, 4($t1)
sw $t2, 0($t1)
sw $t0, 4($t1)

• Add a “branch delay slot”

– the next instruction after a branch is always executed
– rely on compiler to “fill” the slot with something useful

• Superscalar: start more than one instruction in the same cycle
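The reordering exercise in the first bullet above can be checked by counting load-use stalls before and after. This is a simplified sketch, assuming a stall occurs whenever an instruction mentions the register loaded by the lw immediately before it; the reordered sequence shown is one possible answer, not one given on the slide.

```python
import re


def load_use_stalls(program):
    """Count one-cycle stalls caused by using a register right after loading it."""
    stalls = 0
    for previous, current in zip(program, program[1:]):
        if previous.split()[0] != "lw":
            continue
        # The first register of an lw is the one being loaded.
        loaded = re.findall(r"\$\w+", previous)[0]
        if loaded in re.findall(r"\$\w+", current):
            stalls += 1
    return stalls


original = [
    "lw $t0, 0($t1)",
    "lw $t2, 4($t1)",
    "sw $t2, 0($t1)",
    "sw $t0, 4($t1)",
]

# Swapping the two stores keeps the result the same, because they write
# different addresses, and moves the use of $t2 one slot later.
reordered = [
    "lw $t0, 0($t1)",
    "lw $t2, 4($t1)",
    "sw $t0, 4($t1)",
    "sw $t2, 0($t1)",
]

if __name__ == "__main__":
    print(load_use_stalls(original), load_use_stalls(reordered))
```

The original order incurs one stall (sw $t2 directly after lw $t2), and the reordered version incurs none under this model.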

Dynamic Scheduling

• The hardware performs the “scheduling”

– hardware tries to find instructions to execute
– out-of-order execution is possible
– speculative execution and dynamic branch prediction
• All modern processors are very complicated
– DEC Alpha 21264: 9-stage pipeline, 6-instruction issue
– PowerPC and Pentium: branch history table
– Compiler technology important

• This class has given you the background you need to learn more
