Control

The document outlines the design of a single-cycle implementation for a subset of MIPS instructions, focusing on the ALU control unit that generates control signals for various operations such as load, store, branch, and R-type arithmetic. It details the mapping of ALU operations based on opcode and funct fields, and describes the control mechanism that simplifies the main control unit's design through multi-level decoding. Additionally, it provides insights into the enhanced datapath and control signal settings necessary for executing instructions in a single clock cycle.

Uploaded by

yoosefelbooz
COMPUTER ARCHITECTURE
Control
ALU Control
This section introduces a single-cycle implementation for a subset of MIPS instructions—load word (lw),
store word (sw), branch equal (beq), and R-type arithmetic-logical instructions
(add, sub, AND, OR, slt)—using the datapath from the previous section and adding a control function.
It focuses on designing the ALU control as the first step in building the control unit,
with plans to later add support for the jump (j) instruction.

ALU Functions

Load/Store (lw/sw)
ALU performs addition (0010) to compute the memory address (base register + sign-extended offset).

Branch Equal (beq)
ALU performs subtraction (0110) to compare two registers,
using the Zero output to detect equality.

R-type Instructions
ALU performs one of five operations (add, sub, AND, OR, slt),
determined by the 6-bit funct field in the instruction’s low-order bits.

ALU control lines    Function
0000                 AND
0001                 OR
0010                 add
0110                 subtract
0111                 set on less than
1100                 NOR
ALU Control Unit

Function
Combines ALUOp and funct field to generate the appropriate 4-bit ALU control input.
For lw/sw and beq, ALUOp alone determines the operation.
For R-type, ALUOp (10) signals the control unit to decode the funct field and select the corresponding ALU operation.

Inputs
6-bit funct field: Specifies the exact operation for R-type instructions (e.g., 100000 for add, 101010 for slt).
2-bit ALUOp: A control signal indicating the instruction class:
• 00: Add (for lw and sw address calculation).
• 01: Subtract (for beq comparison).
• 10: Use funct field (for R-type instructions to select add, sub, AND, OR, or slt).

Output
A 4-bit ALU control signal (e.g., 0010 for add,
0110 for subtract) directly controls the ALU operation.

The ALU is reused across all instruction types, with its operation tailored by the ALU control unit.
The ALUOp field simplifies control by categorizing operations,
while the funct field provides fine-grained control for R-type instructions.
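
The two-level decoding described above can be sketched in a few lines of Python. This is only an illustrative model, not a hardware description: the names alu_control and FUNCT_TO_ALU_CONTROL are made up for this sketch, while the bit patterns come from the slides.

```python
# Funct-field values and 4-bit ALU control codes as listed in the slides.
FUNCT_TO_ALU_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set on less than
}


def alu_control(alu_op, funct=0):
    """Return the 4-bit ALU control value for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:   # lw / sw: add for the address calculation
        return 0b0010
    if alu_op == 0b01:   # beq: subtract for the comparison
        return 0b0110
    if alu_op == 0b10:   # R-type: the funct field selects the operation
        return FUNCT_TO_ALU_CONTROL[funct & 0x3F]
    raise ValueError("ALUOp 11 is unused in this design")


print(format(alu_control(0b10, 0b101010), "04b"))  # slt
print(format(alu_control(0b00, 0b111111), "04b"))  # lw/sw ignore funct
```

In this sketch a funct value outside the five supported operations raises a KeyError, which stands in for the "not relevant" funct values the slides leave unspecified.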
Control Mechanism
Multi-Level Decoding
Structure
The main control unit generates the ALUOp bits, which are then decoded by a smaller ALU control unit to produce the final ALU control signals.

Advantages
Reduces the size of the main control unit by delegating detailed decoding to a secondary unit. Potentially increases control unit speed, critical for minimizing clock cycle time.

Implementation Approach
Mapping
Only a subset of the 64 possible funct field values (2⁶) is relevant, and the funct field is used only when ALUOp is 10 (R-type).

Truth Table
A logical representation showing input values and corresponding outputs.

Logic Design
A small piece of logic recognizes the relevant funct values and sets the ALU control bits accordingly.
A truth table (Figure 4.13) lists the combinations of ALUOp and funct field that require specific ALU control values.
The full truth table (256 entries, 2⁸) is simplified by showing only entries where the ALU control must be asserted, omitting “don’t care” or deasserted cases.
Control Mechanism (Cont.)
Instruction opcode   ALUOp   Instruction operation   Funct field   Desired ALU action   ALU control input
LW                   00      load word               XXXXXX        add                  0010
SW                   00      store word              XXXXXX        add                  0010
Branch equal         01      branch equal            XXXXXX        subtract             0110
R-type               10      add                     100000        add                  0010
R-type               10      subtract                100010        subtract             0110
R-type               10      AND                     100100        AND                  0000
R-type               10      OR                      100101        OR                   0001
R-type               10      set on less than        101010        set on less than     0111

Figure 4.12
Illustrates how ALU control bits are derived from ALUOp and funct field, showing the mapping for lw, sw, beq, and R-type instructions in binary.

Notes
When ALUOp is 00 or 01, the funct field is irrelevant (don’t care).
When ALUOp is 10, the funct field determines the ALU operation.

Don’t-Care Term
An input where the output is independent of its value (marked as X), simplifying logic design.
Don’t-care terms reduce complexity by allowing flexibility in unused input combinations.
Main Control Unit
Objective: Design the main control unit to manage the datapath (Figure 4.11),
generating write signals, multiplexor selectors, and ALU control inputs.
Builds on the ALU control design (previous section) by connecting instruction fields to datapath operations.

Figure 4.14

R-type instruction
Field   0       rs      rt      rd      shamt   funct
Bits    31:26   25:21   20:16   15:11   10:6    5:0

Load or store instruction
Field   35 or 43   rs      rt      address
Bits    31:26      25:21   20:16   15:0

Branch instruction
Field   4       rs      rt      address
Bits    31:26   25:21   20:16   15:0
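
As a rough illustration of the field positions in Figure 4.14, the following Python sketch slices a 32-bit instruction word into its fields. The function name decode_fields and the dictionary keys are hypothetical; the example words assume the usual MIPS register numbering ($t1 = 9, $t2 = 10, $t3 = 11).

```python
def decode_fields(instr):
    """Split a 32-bit MIPS instruction word into the fields of Figure 4.14."""
    return {
        "opcode": (instr >> 26) & 0x3F,   # bits 31:26
        "rs": (instr >> 21) & 0x1F,       # bits 25:21
        "rt": (instr >> 16) & 0x1F,       # bits 20:16
        "rd": (instr >> 11) & 0x1F,       # bits 15:11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,     # bits 10:6  (R-type only)
        "funct": instr & 0x3F,            # bits 5:0   (R-type only)
        "address": instr & 0xFFFF,        # bits 15:0  (lw, sw, beq)
    }


# add $t1, $t2, $t3 under the assumed register numbering
print(decode_fields(0x014B4820))
# lw $t1, 4($t2) under the same assumption
print(decode_fields(0x8D490004))
```

All fields are extracted unconditionally, much as the hardware wires every field to the datapath and lets the control signals decide which ones matter.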

Datapath Control Requirements (Figure 4.11)


Write Signals
For state elements (PC, register file, data memory).

Multiplexor Selectors
ALU input (register vs. offset). PC source (PC + 4 vs. branch target). Register write data (ALU vs. memory). Register destination (rt vs. rd).

ALU Control
4-bit signal based on ALUOp and funct field.
Main Control Unit (Cont.)
ALUOp1   ALUOp0   F5   F4   F3   F2   F1   F0   Operation
0        0        X    X    X    X    X    X    0010
X        1        X    X    X    X    X    X    0110
1        X        X    X    0    0    0    0    0010
1        X        X    X    0    0    1    0    0110
1        X        X    X    0    1    0    0    0000
1        X        X    X    0    1    0    1    0001
1        X        X    X    1    0    1    0    0111

Figure 4.13
Truth table for ALU control, showing only asserted outputs with don’t-care terms (X) for unused or irrelevant inputs.

Notes
X in ALUOp (e.g., 1X, X1) where 11 is unused.
XX in F5–F4 (funct field) for R-type, as they are always 10.
XXXXXX in funct for non-R-type, as ALUOp alone determines the operation.
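
To see how the don’t-care entries of Figure 4.13 behave, here is a small Python sketch that stores each row as a pattern string and matches an input against it. The names ROWS, matches, and alu_operation are invented for this illustration, and real hardware would evaluate all rows in parallel rather than scanning them in order.

```python
# Rows of Figure 4.13 as (pattern, operation). Pattern bit order is
# ALUOp1 ALUOp0 F5 F4 F3 F2 F1 F0, with X marking a don't-care input.
ROWS = [
    ("00XXXXXX", "0010"),
    ("X1XXXXXX", "0110"),
    ("1XXX0000", "0010"),
    ("1XXX0010", "0110"),
    ("1XXX0100", "0000"),
    ("1XXX0101", "0001"),
    ("1XXX1010", "0111"),
]


def matches(pattern, bits):
    """True when every non-X position of the pattern equals the input bit."""
    return all(p in ("X", b) for p, b in zip(pattern, bits))


def alu_operation(alu_op, funct):
    """Look up the ALU control bits for a 2-bit ALUOp and 6-bit funct."""
    bits = format(alu_op, "02b") + format(funct, "06b")
    for pattern, operation in ROWS:
        if matches(pattern, bits):
            return operation
    return None  # no row applies (funct value outside the supported subset)


print(alu_operation(0b10, 0b101010))  # R-type slt
print(alu_operation(0b01, 0b000000))  # beq, funct ignored
```

Because F5 and F4 are don’t-cares, this model accepts any upper funct bits for R-type inputs, which mirrors the simplification the table makes.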

Design Process
The opcode (Op[5:0]) drives the main control unit to set ALUOp and other signals.
The ALUOp and funct field feed into the ALU control unit (previously designed) to generate the 4-bit ALU control.
A multiplexor is added to select the destination register (rt for lw, rd for R-type) based on instruction type.
Enhanced Datapath
Figure 4.15: Enhances datapath from Figure 4.11.
[Figure: datapath with PC, instruction memory, register file, sign-extend unit, ALU, data memory, the PC + 4 and branch adders, shift left 2, the ALU control block, and the control lines RegDst, RegWrite, ALUSrc, PCSrc, MemRead, MemWrite, MemtoReg, and ALUOp.]
Enhanced Datapath (Cont.)
Additions to Figure 4.11
• Instruction labels: Indicate fields like opcode, rs, rt, rd, and offset.
• Multiplexor: Selects the write register number for the register file (rt [20:16] for lw, rd [15:11] for R-type).
• ALU control block: Generates the 4-bit ALU control signal (from prior design).

Control lines (shown in color):
• Write signals: For register file (RegWrite) and data memory (MemWrite).
• Read signal: For data memory (MemRead).
• Multiplexor controls: For ALU input, register write data, PC source, and write register number.

Control Signals:
• Seven single-bit signals: RegDst, RegWrite, ALUSrc, PCSrc, MemRead, MemWrite, MemtoReg.
• 2-bit ALUOp: Determines ALU operation category (e.g., 00 for add, 10 for R-type).

Figure 4.16
Describes the effect of seven control signals on the datapath, detailing asserted (1) and deasserted (0) states.

RegDst
  Deasserted (0): Write register is rt (bits 20:16, for lw).
  Asserted (1): Write register is rd (bits 15:11, for R-type).
RegWrite
  Deasserted (0): No register write.
  Asserted (1): Write register with data on the write input.
ALUSrc
  Deasserted (0): Second ALU operand is register data (Read data 2, for R-type/beq).
  Asserted (1): Second ALU operand is sign-extended 16-bit offset (for lw/sw).
PCSrc
  Deasserted (0): PC = PC + 4 (sequential instruction).
  Asserted (1): PC = branch target address (for beq if taken).
MemRead
  Deasserted (0): No memory read.
  Asserted (1): Memory outputs data to read data line (for lw).
MemWrite
  Deasserted (0): No memory write.
  Asserted (1): Memory writes data from write input (for sw).
MemtoReg
  Deasserted (0): Register write data from ALU (for R-type).
  Asserted (1): Register write data from memory (for lw).
Setting Control Signals
Inputs to Control Unit: Six opcode bits (Op[5:0], bits 31:26).
Control Signal Generation:
Most signals (RegDst, RegWrite, ALUSrc, MemRead, MemWrite,
MemtoReg, ALUOp) are set based solely on the opcode.
PCSrc Exception: Requires both:
• A Branch signal from the control unit (asserted for beq).
• The ALU’s Zero output (asserted if beq registers are equal).
• PCSrc = Branch AND Zero: Asserted only when beq is taken.
Process: The control unit decodes the opcode to set the seven signals and ALUOp;
PCSrc combines opcode-derived Branch with ALU feedback.
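
The PCSrc rule above amounts to a single AND gate feeding a two-way multiplexor. A minimal Python sketch, with the hypothetical function name next_pc and 0/1 integers standing in for the signal lines:

```python
def next_pc(pc, branch, zero, branch_target):
    """Model the PC source multiplexor of the single-cycle datapath."""
    pc_src = branch & zero          # PCSrc = Branch AND Zero
    return branch_target if pc_src else pc + 4


# beq with equal registers: the branch is taken
print(hex(next_pc(0x00400000, branch=1, zero=1, branch_target=0x00400020)))
# beq with unequal registers, or any non-branch instruction: fall through
print(hex(next_pc(0x00400000, branch=1, zero=0, branch_target=0x00400020)))
```

Note that Zero may also be 1 for a non-branch instruction (for example an add whose result is 0), which is why Branch must gate it.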

Next Steps
Figure 4.17
Shows the datapath with the control unit and all nine control signals (7 single-bit + 2-bit ALUOp).

Figure 4.18
Informally defines control signal values (0, 1, or X [don’t care]) for each opcode, derived from Figures 4.12 (ALU control), 4.16 (signal effects), and 4.17 (datapath).
Operation of the Datapath
[Figure 4.17: datapath with the Control unit, driven by Instruction [31:26], generating RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite.]
Purpose: Illustrates how the datapath (Figure 4.17) executes instructions in one clock cycle, using control signals from Figure 4.16 and their settings from Figure 4.18.
Control Unit Design
Inputs
6-bit opcode (bits 31:26).

Outputs
Nine control signals (7 single-bit + 2-bit ALUOp): RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite.

PCSrc
Derived from Branch (opcode-based) ANDed with ALU Zero output.

Signal Settings

Instruction   RegDst   ALUSrc   MemtoReg   RegWrite   MemRead   MemWrite   Branch   ALUOp1   ALUOp0
R-format      1        0        0          1          0         0          0        1        0
lw            0        1        1          1          1         0          0        0        0
sw            X        1        X          0          0         1          0        0        0
beq           X        0        X          0          0         0          1        0        1

Figure 4.18 Control signal settings per opcode, with X for don’t cares.
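
The settings in Figure 4.18 can be modeled as a lookup table keyed by opcode. The sketch below is an assumption-laden illustration: SIGNALS, CONTROL_TABLE, and main_control are made-up names, the opcode values (0, 35, 43, 4) are the ones shown in Figure 4.14, and None stands for a don’t-care X.

```python
X = None  # don't care

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp1", "ALUOp0")

# One row per opcode, in the column order of Figure 4.18.
CONTROL_TABLE = {
    0:  (1, 0, 0, 1, 0, 0, 0, 1, 0),  # R-format
    35: (0, 1, 1, 1, 1, 0, 0, 0, 0),  # lw
    43: (X, 1, X, 0, 0, 1, 0, 0, 0),  # sw
    4:  (X, 0, X, 0, 0, 0, 1, 0, 1),  # beq
}


def main_control(opcode):
    """Return the control signal settings for a 6-bit opcode."""
    return dict(zip(SIGNALS, CONTROL_TABLE[opcode]))


lw_signals = main_control(35)
print(lw_signals["MemRead"], lw_signals["RegWrite"], lw_signals["MemtoReg"])
```

The X entries are safe because RegWrite is 0 for sw and beq, so the values on the RegDst and MemtoReg multiplexors are never written anywhere.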
R-type Execution
Example add $t1, $t2, $t3
[Figure 4.19: datapath in operation for an R-type instruction.]
R-type Execution (Cont.)
Example add $t1, $t2, $t3

Steps
01 Fetch: Instruction fetched from instruction memory; PC incremented by 4 (adder active).
02 Register Read: Register file reads $t2 (rs) and $t3 (rt); control unit sets signals (e.g., RegDst = 1, RegWrite = 1, ALUOp = 10).
03 ALU Operation: ALU performs addition (funct field 100000 → ALU control 0010) on register data.
04 Register Write: ALU result written to $t1 (rd, bits 15:11) in register file.

Signals
RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp1 ALUOp0
1 0 0 1 0 0 0 1 0
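
Steps 02 to 04 can be traced with a toy register file. This is a behavioral sketch only: execute_add is a hypothetical helper, the register file is a plain dictionary, and the register numbers assume the usual MIPS convention ($t1 = 9, $t2 = 10, $t3 = 11).

```python
# Assumed register numbering: $t1 = 9, $t2 = 10, $t3 = 11.
T1, T2, T3 = 9, 10, 11


def execute_add(regs, rs, rt, rd):
    """Model register read, ALU add, and register write for an R-type add."""
    read_data_1 = regs[rs]                                  # step 02
    read_data_2 = regs[rt]                                  # ALUSrc = 0: second operand is a register
    alu_result = (read_data_1 + read_data_2) & 0xFFFFFFFF   # step 03: ALU control 0010, 32-bit wrap
    new_regs = dict(regs)
    new_regs[rd] = alu_result                               # step 04: RegDst = 1 picks rd, RegWrite = 1
    return new_regs


regs = {T1: 0, T2: 5, T3: 7}
print(execute_add(regs, rs=T2, rt=T3, rd=T1)[T1])  # 12
```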
Load Word Execution
Example lw $t1, offset($t2)
[Figure 4.20: datapath in operation for a load instruction.]
Load Word Execution (Cont.)
Example lw $t1, offset($t2)

Steps
01 Fetch: Instruction fetched; PC incremented by 4.
02 Register Read: Register file reads $t2 (rs, base register).
03 Address Calculation: ALU adds $t2 and sign-extended offset (bits 15:0) (ALUOp = 00, ALU control 0010).
04 Memory Read: Data memory uses ALU result as address; outputs data (MemRead = 1).
05 Register Write: Memory data written to $t1 (rt, bits 20:16) (RegWrite = 1, MemtoReg = 1).

Signals
RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp1 ALUOp0
0 1 1 1 1 0 0 0 0
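
The load path can be traced the same way. In this sketch, execute_lw and sign_extend_16 are hypothetical helpers, data memory is a dictionary keyed by byte address, and the register numbers again assume $t1 = 9 and $t2 = 10.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_lw(regs, memory, rs, rt, offset):
    """Model address calculation, memory read, and register write for lw."""
    address = (regs[rs] + sign_extend_16(offset)) & 0xFFFFFFFF  # step 03: ALUSrc = 1, ALU adds
    data = memory[address]                                      # step 04: MemRead = 1
    new_regs = dict(regs)
    new_regs[rt] = data                                         # step 05: RegDst = 0 picks rt, MemtoReg = 1
    return new_regs


regs = {9: 0, 10: 0x1000}          # $t1, $t2 under the assumed numbering
memory = {0x0FFC: 42, 0x1004: 99}
print(execute_lw(regs, memory, rs=10, rt=9, offset=4)[9])        # lw $t1, 4($t2)
print(execute_lw(regs, memory, rs=10, rt=9, offset=0xFFFC)[9])   # lw $t1, -4($t2)
```

The second call shows why the offset must be sign-extended: the 16-bit field 0xFFFC means -4, not 65532.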
Branch Equal Execution
Example beq $t1, $t2, offset
[Figure 4.21: datapath in operation for a branch-on-equal instruction.]
Branch Equal Execution (Cont.)
Example beq $t1, $t2, offset

Steps
01 Fetch: Instruction fetched; PC incremented by 4.
02 Register Read: Register file reads $t1 (rs) and $t2 (rt).
03 Comparison and Address:
• ALU subtracts $t1 - $t2 (ALUOp = 01, ALU control 0110); Zero output indicates equality.
• Separate adder computes branch target: PC + 4 + (sign-extended offset << 2).
04 PC Update: If Zero = 1 (equal), PC = branch target (PCSrc = 1); else PC = PC + 4 (PCSrc = 0).

Signals
RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp1 ALUOp0
X 0 X 0 0 0 1 0 1
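
A short sketch of the branch path, showing the comparison and the branch target arithmetic side by side. The helper names execute_beq and sign_extend_16 are hypothetical, and the register numbers assume $t1 = 9 and $t2 = 10.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_beq(regs, pc, rs, rt, offset):
    """Return the next PC for beq: branch target if the registers are equal."""
    zero = ((regs[rs] - regs[rt]) & 0xFFFFFFFF) == 0            # ALU subtract, Zero output
    branch_target = (pc + 4 + (sign_extend_16(offset) << 2)) & 0xFFFFFFFF
    return branch_target if zero else pc + 4                    # PCSrc = Branch AND Zero


equal = {9: 7, 10: 7}
differ = {9: 7, 10: 8}
print(hex(execute_beq(equal, 0x00400000, rs=9, rt=10, offset=3)))   # taken
print(hex(execute_beq(differ, 0x00400000, rs=9, rt=10, offset=3)))  # not taken
```

The offset counts instructions relative to PC + 4, so an offset of 3 lands 12 bytes past the following instruction.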
Finalizing Control
Purpose: Precisely define the control function for the single-cycle datapath to execute
R-format, load word (lw), store word (sw), and branch equal (beq) instructions.
Method: Use a truth table (Figure 4.22) to map opcode values to control signal settings, derived from Figure 4.18.

Figure 4.22
Truth table for the control unit.

Input or output   Signal name   R-format   lw   sw   beq
Inputs            Op5           0          1    1    0
                  Op4           0          0    0    0
                  Op3           0          0    1    0
                  Op2           0          0    0    1
                  Op1           0          1    1    0
                  Op0           0          1    1    0
Outputs           RegDst        1          0    X    X
                  ALUSrc        0          1    1    0
                  MemtoReg      0          1    X    X
                  RegWrite      1          1    0    0
                  MemRead       0          1    0    0
                  MemWrite      0          0    1    0
                  Branch        0          0    0    1
                  ALUOp1        1          0    0    0
                  ALUOp0        0          0    0    1

The truth table fully specifies the control function and can be directly implemented in gates using automated tools.

Single-Cycle Implementation
An approach where each instruction executes in one clock cycle, simple but impractical due to speed limitations.
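
One plausible gate-level reading of Figure 4.22 is a two-stage structure: AND terms that recognize each opcode, followed by OR terms that form each output. The Python sketch below models that with bitwise operators. It is an illustration under assumptions, not the output of any synthesis tool: the function name control_from_opcode_bits is invented, and every X in the table is resolved to 0.

```python
def control_from_opcode_bits(opcode):
    """Compute the control outputs of Figure 4.22 from the six opcode bits."""
    b = [(opcode >> i) & 1 for i in range(6)]   # b[i] is Op_i
    n = [1 - bit for bit in b]                  # n[i] is NOT Op_i

    # AND stage: one product term per instruction class (opcode columns of the table).
    r_format = n[5] & n[4] & n[3] & n[2] & n[1] & n[0]   # 000000
    lw = b[5] & n[4] & n[3] & n[2] & b[1] & b[0]         # 100011
    sw = b[5] & n[4] & b[3] & n[2] & b[1] & b[0]         # 101011
    beq = n[5] & n[4] & n[3] & b[2] & n[1] & n[0]        # 000100

    # OR stage: each output is the OR of the classes that assert it.
    return {
        "RegDst": r_format,
        "ALUSrc": lw | sw,
        "MemtoReg": lw,
        "RegWrite": r_format | lw,
        "MemRead": lw,
        "MemWrite": sw,
        "Branch": beq,
        "ALUOp1": r_format,
        "ALUOp0": beq,
    }


print(control_from_opcode_bits(0b101011))  # sw
```

With this choice, any opcode outside the four supported classes drives every output to 0, so nothing is written.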
Jump Instruction
[Figure 4.24: datapath and control extended with a Jump control signal. Instruction [25:0] is shifted left 2 (26 bits to 28 bits) and combined with PC + 4 [31:28] to form Jump address [31:0].]
Jump Instruction (Cont.)

Field   opcode   address
Bits    31:26    25:0

New Components
Multiplexor: Added to select the PC source from three options:
• PC + 4: Sequential instruction (for non-branch, non-jump).
• Branch target address: For beq when taken (from branch adder).
• Jump target address: For jump instructions.
Jump Target Address Calculation:
• Shift Left 2: The 26-bit immediate is shifted left by 2 bits (append 00),
implemented by wiring (no hardware shift needed, as it’s a fixed operation).
• Concatenation: Combines the shifted 26 bits with the upper 4 bits of PC + 4
(bits 31:28) to form a 32-bit address.
Integration: The new multiplexor replaces the previous two-way PC source multiplexor
(PC + 4 vs. branch target) with a three-way multiplexor to include the jump target.

Control Modifications
New Control Signal: Jump (single-bit):
• Asserted (1): When the instruction is a jump (opcode = 000010, decimal 2).
• Deasserted (0): For all other instructions (R-type, lw, sw, beq).
Function: Controls the new PC source multiplexor to select the jump target address when asserted.
Jump = 1: Selects jump target address for PC.

Signals
RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp1 ALUOp0
X X X 0 0 0 0 X X
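
The jump target calculation and the three-way PC selection can be sketched as follows. The function names jump_target and select_next_pc are hypothetical, and giving Jump priority over the branch path is an assumption of this model.

```python
def jump_target(pc, instr):
    """Form the 32-bit jump address from PC + 4 [31:28] and Instruction [25:0]."""
    address26 = instr & 0x03FFFFFF           # 26-bit immediate, bits 25:0
    upper4 = (pc + 4) & 0xF0000000           # upper 4 bits of PC + 4
    return upper4 | (address26 << 2)         # shift left 2 appends 00


def select_next_pc(pc, jump, branch, zero, branch_target, jump_addr):
    """Three-way PC source selection: jump, taken branch, or PC + 4."""
    if jump:
        return jump_addr
    if branch & zero:
        return branch_target
    return pc + 4


instr = (0b000010 << 26) | 0x0100000         # j with 26-bit address 0x0100000
print(hex(jump_target(0x00400008, instr)))
```

Because only the low 28 bits come from the instruction, a jump can reach targets only within the current 256 MB region selected by PC + 4 [31:28].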
Inefficiency
Single-Cycle Implementation Overview
Functionality: Correctly executes instructions (e.g., R-type, lw, sw, beq, j) in one clock cycle using the datapath and control described previously.
Design: Each instruction uses the entire datapath, with resources (e.g., ALU, memory) allocated for the full cycle, even if not needed.

Inefficiency of Single-Cycle Design
Fixed Clock Cycle: The clock cycle length is determined by the longest possible path in the processor, typically the load instruction (lw), which uses five functional units in series.
Consequence: All instructions, even simpler ones (e.g., R-type or beq), must use the same long clock cycle, leading to inefficiency.
Performance: Despite a CPI (Cycles Per Instruction) of 1, the long clock cycle results in poor overall performance due to low clock frequency.

Modern Challenges
Floating-point units: Require longer computation times.
Complex instructions: Increase the worst-case delay, further lengthening the clock cycle.

Violation of Design Principle


Making the Common Case Fast: A key idea from Chapter 1 emphasizes optimizing for frequently executed operations.
Single-Cycle Violation: The fixed clock cycle is dictated by the worst-case delay (e.g., lw),
slowing down common-case instructions (e.g., R-type) that could execute faster.
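
To make the worst-case argument concrete, the sketch below adds up unit delays along each instruction’s path. The delay values are hypothetical numbers chosen only for illustration (they do not come from the slides), and the unit lists for R-type, sw, and beq are assumptions consistent with the five-unit lw path described above.

```python
# Hypothetical unit delays in picoseconds, for illustration only.
DELAY_PS = {
    "instr_mem": 200,
    "reg_read": 100,
    "alu": 200,
    "data_mem": 200,
    "reg_write": 100,
}

# Functional units each instruction class uses in series (assumed paths).
PATHS = {
    "R-type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "lw": ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw": ["instr_mem", "reg_read", "alu", "data_mem"],
    "beq": ["instr_mem", "reg_read", "alu"],
}


def path_delay(instruction):
    """Total delay along the units an instruction class uses."""
    return sum(DELAY_PS[unit] for unit in PATHS[instruction])


# The single clock period must cover the slowest instruction.
clock_cycle = max(path_delay(name) for name in PATHS)

for name in PATHS:
    print(name, path_delay(name), "ps needed,", clock_cycle, "ps spent")
```

Under these made-up delays, every instruction is charged the lw path length even though beq needs far less, which is the inefficiency the slide describes.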
Pipelining

Alternative Approach
The next section introduces pipelining,
which uses a similar datapath but achieves
higher throughput.

How It Works
Pipelining executes multiple instructions simultaneously
by dividing the datapath into stages,
allowing each stage to process a different instruction in parallel.

Benefit
Significantly improves efficiency by reducing the effective
cycle time and increasing instruction throughput.
