Implementing MIPS
We're ready to look at an implementation of the MIPS instruction set
Simplified to contain only
arithmetic-logic instructions: add, sub, and, or, slt
memory-reference instructions: lw, sw
control-flow instructions: beq, j
6 bits
5 bits
5 bits
5 bits
5 bits
op
rs
rt
rd
6 bits
5 bits
5 bits
16 bits
op
rs
rt
offset
6 bits
shamt funct
6 bits
26 bits
op
address
R-Format
I-Format
J-Format
Implementing MIPS: the
Fetch/Execute Cycle
High-level abstract view of fetch/execute implementation
use the program counter (PC) to read instruction address
fetch the instruction from memory and increment PC
use fields of the instruction to select registers to read
execute depending on the instruction
repeat
Data
PC
Address
Instruction
memory
Instruction
Register #
Registers
Register #
ALU
Address
Data
memory
Register #
Data
Overview: Processor Implementation
Styles
Single Cycle
Multi-Cycle
perform each instruction in 1 clock cycle
clock cycle must be long enough for slowest instruction; therefore,
disadvantage: only as fast as slowest instruction
break fetch/execute cycle into multiple steps
perform 1 step in each clock cycle
advantage: each instruction uses only as many cycles as it needs
Pipelined
execute each instruction in multiple steps
perform 1 step / instruction in each clock cycle
process multiple instructions in parallel assembly line
Functional Elements
Two types of functional elements in the hardware:
elements that operate on data (called combinational elements)
elements that contain data (called state or sequential elements)
Combinational Elements
Works as an input output function, e.g., ALU
Combinational logic reads input data from one register and
writes output data to another, or same, register
read/write happens in a single cycle combinational element
cannot store data from one cycle to a future one
Combinational logic hardware units
State
element
1
Clock cycle
Combinational logic
State
element
2
State
element
Combinational logic
State Elements
State elements contain data in internal storage, e.g., registers
and memory
All state elements together define the state of the machine
Flipflops and latches are 1-bit state elements, equivalently,
they are 1-bit memories
The output(s) of a flipflop or latch always depends on the bit
value stored, i.e., its state, and can be called 1/0 or high/low
or true/false
The input to a flipflop or latch can change its state depending
on whether it is clocked or not
State Elements on the Datapath:
Register File
Registers are implemented with arrays of D-flipflops
Clock
5 bits
R ead reg ister
nu m ber 1
5 bits
R ead reg ister
nu m ber 2
5 bits
W rite
re giste r
32 bits
R ead
data 1
32 bits
R ead
data 2
32 bits
Register file
W rite
da ta
W rite
Control signal
Register file with two read ports and
one write port
State Elements on the Datapath:
Register File
Port implementation:
Clock
Clock
Write
Read register
number 1
0
Register 0
Register 1
Register n 1
Register n
M
u
x
Read data 1
Register number
C
Register 0
n-to-1
decoder
n 1
Register 1
D
Read register
number 2
M
u
x
C
Register n 1
D
Read data 2
C
Register n
Register data
Read ports are implemented
with a pair of multiplexors 5
bit multiplexors for 32 registers
Write port is implemented using
a decoder 5-to-32 decoder for
32 registers. Clock is relevant to
write as register state may change
only at clock edge
Verilog
All components that we have discussed and shall discuss
can be fabricated using Verilog
Single-cycle Implementation of MIPS
Our first implementation of MIPS will use a single long clock
cycle for every instruction
Every instruction begins on one up (or, down) clock edge
and ends on the next up (or, down) clock edge
This approach is not practical as it is much slower than a
multicycle implementation where different instruction
classes can take different numbers of cycles
in a single-cycle implementation every instruction must take
the same amount of time as the slowest instruction
in a multicycle implementation this problem is avoided by
allowing quicker instructions to use fewer cycles
Even though the single-cycle approach is not practical it is
simple and useful to understand first
Note : we shall implement jump at the very end
Datapath: Instruction Store/Fetch &
PC Increment
Instruction
address
Add
PC
Instruction
Add Sum
Instruction
memory
PC
a. Instruction memory
b. Program counter
Read
address
c. Adder
Instruction
Three elements used to store
and fetch instructions and
increment the PC
Instruction
memory
Datapath
Animating the Datapath
Instruction <- MEM[PC]
PC <- PC + 4
ADD
4
PC
ADDR
Memory
RD
Instruction
Datapath: R-Type Instruction
5
Register
numbers
5
5
Data
Read
register 1
Read
data 1
Read
register 2
Registers
Write
register
Read
data 2
Write
data
Data
ALU control
Zero
ALU ALU
result
Instruction
Read
register 2
Registers
Write
register
Write
data
RegWrite
Read
register 1
Read
data 1
Zero
ALU ALU
result
Read
data 2
RegWrite
a. Registers
b. ALU
Two elements used to implement
R-type instructions
ALU operation
Datapath
Animating the Datapath
add rd, rs, rt
Instruction
op
rs
rt
5
rd
shamt funct
Operation
R[rd] <- R[rs] + R[rt];
RN1
RN2
WN
RD1
rs
Register File
ALU
WD
RD2
RegWrite
rt
Zero
Datapath: Load/Store Instruction
MemWrite
Instruction
Address
Write
data
Read
data
Data
memory
Read
register 1
16
Sign
extend
32
MemWrite
Read
data 1
Read
register 2
Registers
Write
register
Read
data 2
Write
data
Zero
ALU ALU
result
16
Sign
extend
32
Read
data
Data
memory
MemRead
b. Sign-extension unit
Two additional elements used
To implement load/stores
Address
Write
data
RegWrite
MemRead
a. Data memory unit
ALU operation
Datapath
Animating the Datapath
lw rt, offset(rs)
R[rt] <- MEM[R[rs] + s_extend(offset)];
rs
data from mem to reg
Animating the Datapath
sw rt, offset(rs)
MEM[R[rs] + sign_extend(offset)] <- R[rt]
WD of mem
takes in reg
from which data
is taken
Datapath: Branch Instruction
PC + 4 from instruction datapath
No shift hardware required:
simply connect wires from
input to output, each shifted
left 2 bits
Instruction
Add Sum
Branch target
Shift
left 2
3
Read
register 1
ALU operation
Read
data 1
Read
register 2
Registers
Write
register
Read
data
2
Write
data
RegWrite
16
Sign
extend
32
Datapath
ALU Zero
To branch
control logic
Animating the Datapath
op
rs
rt
offset/immediate
16
PC +4 from
instruction
datapath
ADD
Operation
<<2
RN1
RN2
WN
RD1
Register File
ALU
Zero
WD
RD2
RegWrite
16
E
X
T
N
D
32
beq rs, rt, offset
if (R[rs] == R[rt]) then
PC <- PC+4 + s_extend(offset<<2)
MIPS Datapath I: Single-Cycle
Input is either register (R-type) or sign-extended
lower half of instruction (load/store)
Data is either
from ALU (R-type)
or memory (load)
Combining the datapaths for R-type instructions
and load/stores using two multiplexors
Animating the Datapath:
R-type Instruction
add rd,rs,rt
Instruction
32
16
rs
rd
rt
RN1
Operation
RN2
WN
RD1
rs
Register File
WD
ALU
Zero
rs+rt
DM remains unused.
rt
rd
RD2
RegWrite
16
E
X
T
N
D
32
M
U
X
rt
ALUSrc
MemWrite
MemtoReg
ADDR
Data
Memory
WD
MemRead
RD
M
U
X
Animating the Datapath:
Load Instruction
lw rt,offset(rs)
Instruction
32
16
RN1
RN2
Operation
rt
WN
RD1
Register File
ALU
Zero
WD
RD2
RegWrite
16
E
X
T
N
D
32
M
U
X
ALUSrc
MemWrite
MemtoReg
ADDR
Data
Memory
RD
WD
MemRead
Since for writing, in R-Type rd is used and in LW/SW rt is used, a MUX is reqd there.
M
U
X
Animating the Datapath:
Store Instruction
sw rt,offset(rs)
Instruction
32
16
Operation
RN1
RN2
WN
RD1
Register File
ALU
Zero
WD
RD2
RegWrite
16
E
X
T
N
D
32
M
U
X
ALUSrc
MemWrite
MemtoReg
ADDR
Data
Memory
WD
MemRead
RD
M
U
X
MIPS Datapath II: Single-Cycle
Separate adder as ALU operations and PC
increment occur in the same clock cycle
Add
4
PC
Read
address
Instruction
Instruction
memory
Registers
Read
register 1
Read
Read
data
1
register 2
Read
Write
data 2
register
3
ALUSrc
M
u
x
RegWrite
Separate instruction memory
as instruction and data read
occur in the same clock cycle
MemWrite
MemtoReg
Write
data
16
ALU operation
Sign 32
extend
Adding instruction fetch
Zero
ALU ALU
result
Address
Read
data
Data
memory
Write
data
MemRead
M
u
x
MIPS Datapath III: Single-Cycle
New multiplexor
PCSrc
M
u
x
Add
Add ALU
result
4
Shift
left 2
PC
Registers
Read
register 1
Read
Read
data 1
register 2
Read
address
Instruction
Instruction
memory
Write
register
Write
data
RegWrite
16
Instruction address is either
PC+4 or branch target address
ALUSrc
Read
data 2
M
u
x
Extra adder needed as both
adders operate in each cycle
3
ALU operation
Zero
ALU ALU
result
MemtoReg
Address
Write
data
Sign
extend
MemWrite
Data
memory
32
MemRead
Adding branch capability and another multiplexor
Important note: in a single-cycle implementation data cannot be stored
during an instruction it only moves through combinational logic
Read
data
M
u
x
Datapath Executing add
ADD
M
U
X
ADD
ADD
PC
<<2
Instruction
ADDR
Instruction
Memory
RD
32
RN1
RN2
16
PCSrc
Operation
WN
RD1
Register File
ALU
Zero
WD
RD2
RegWrite
16
add rd, rs, rt
E
X
T
N
D
32
M
U
X
ALUSrc
MemWrite
ADDR
Data
Memory
WD
MemRead
MemtoReg
RD
M
U
X
Datapath Executing lw
ADD
M
U
X
ADD
ADD
PC
<<2
Instruction
ADDR
Instruction
Memory
RD
32
RN1
RN2
16
PCSrc
Operation
WN
RD1
Register File
ALU
Zero
WD
RD2
RegWrite
16
lw rt,offset(rs)
E
X
T
N
D
32
M
U
X
ALUSrc
MemWrite
ADDR
Data
Memory
WD
MemRead
MemtoReg
RD
M
U
X
Datapath Executing sw
ADD
M
U
X
ADD
ADD
PC
<<2
Instruction
ADDR
Instruction
Memory
RD
32
RN1
RN2
16
PCSrc
Operation
WN
RD1
Register File
ALU
Zero
WD
RD2
RegWrite
16
sw rt,offset(rs)
E
X
T
N
D
32
M
U
X
ALUSrc
MemWrite
ADDR
Data
Memory
WD
MemRead
MemtoReg
RD
M
U
X
Datapath Executing beq
ADD
M
U
X
ADD
ADD
PC
<<2
Instruction
ADDR
Instruction
Memory
RD
32
RN1
RN2
16
PCSrc
Operation
WN
RD1
Register File
ALU
Zero
WD
RD2
RegWrite
16
beq r1,r2,offset
E
X
T
N
D
32
M
U
X
ALUSrc
MemWrite
ADDR
Data
Memory
WD
MemRead
MemtoReg
RD
M
U
X