LIST OF INSTRUCTIONS:
R-type: 0
Opcode Rs1 Rs2 Rd Funct3
15:12 11:9 8:6 5:3 2:0
Order Command Opcode Funct3 Describe
1 ADD 0000 000 rd ← rs1 + rs2 ☑
2 SUB 0000 001 rd ← rs1 - rs2 ☑
3 AND 0000 011 rd ← rs1 & rs2 ☑
4 OR 0000 100 rd ← rs1 | rs2 ☑
5 XOR 0000 101 rd ← rs1 ^ rs2 ☑
6 SLT (set on less than) 0000 110 rd = (a < b) ? 1 : 0 ☑
7 SGT (set on greater than) 0000 111 rd = (a > b) ? 1 : 0 ☑
8 SETE (set on equal) 0000 010 rd = (a == b) ? 1 : 0 ☑
I-type: 1
Opcode Rs1 Rd imm
15:12 11:9 8:6 5:0
Order Command Opcode Describe
1 ADDI 0001 rd ← rs1 + imm ☑
2 SUBI 0010 rd ← rs1 - imm ☑
3 ANDI 0101 rd ← rs1 & imm ☑
4 ORI 0110 rd ← rs1 | imm ☑
5 XORI 0100 rd ← rs1 ^ imm ☑
6 LSLI 0011 rd ← rs1 << imm ☑
7 LSRI 1111 rd ← rs1 >> imm ☑
L-type:
LD
Opcode Rs1 Rd imm
15:12 11:9 8:6 5:0
1 LD 0111 rd ← Mem[rs1 + offset] ☑
LI: 2
Opcode Rd imm
15:12 11:9 8:0
1 LI 1000 rd ← imm ☑
S-type: 3
Opcode Rs1 Rs2 Imm
15:12 11:9 8:6 5:0
Order Command Opcode Describe
1 ST 1001 Mem[rs1 + offset] ← rs2 ☑
B-type: 4
Opcode Rs1 Imm
15:12 11:9 8:0
Order Command Opcode Describe
1 BEQZ 1010 if(rs1 == 0) PC += imm ☑
2 BNQZ 1011 if(rs1 ≠ 0) PC += imm ☑
J-type: 5
Opcode Imm
15:12 11:0
Order Command Opcode Describe
1 JMP 1100 PC ← PC + imm ☑
2 CALL 1101 Mem[sp] ← PC; PC ← PC + offset ☑
SYS-type: 6
Opcode Imm Funct3
15:12 11:3 2:0
Order Command Opcode Funct3 Describe
1 NOP 1110 000 DO NOTHING ☑
2 RET 1110 001 COME BACK FROM INTERRUPT ☑
3 EI 1110 010 ENA INTERRUPT ☑
4 DI 1110 011 DIS INTERRUPT ☑
A RISC CPU basically has 5 stages: IF fetches the instruction from memory, ID
decodes the instruction and reads the registers, EX performs the ALU
operation, MEM accesses memory if needed (used by load, store, call and
return), and WB writes the result to a register (mostly used by R-type and
I-type instructions).
I/ CONSTRUCTION:
1.1. INSTRUCTION MEMORY:
As shown in the diagram above, the instruction memory holds 2^12 / 2 = 2048
16-bit instructions (the address is 12 bits wide and each instruction
occupies two bytes). My CPU follows the Harvard architecture, in which memory
is divided into 2 parts: data memory and instruction memory. This separation
reduces bus conflicts, because an instruction fetch and a data access do not
compete for the same memory, and it decreases power consumption. I combine
the instruction memory block with the instruction decoder block, which
simplifies the overall design. Ins_Mem is initialised with the instructions
that we want the CPU to execute, and each instruction is stored in 2 parts,
the 8 MSBs and the 8 LSBs, for ease of control.
In a RISC CPU, instructions are designed to be simple and uniform so that
processing can be optimised. Just as every MIPS instruction has a fixed
32-bit length, every instruction here has a fixed 16-bit length, and the
instructions are divided into 6 basic forms: R-type, I-type, S-type, B-type,
J-type and SYS-type.
The instruction set consists of:
+ R-type: These instructions are executed in the ALU: addition, subtraction,
logic operations, and left and right shifts. The format has the following
fields:
Opcode Rs1 Rs2 Rd Funct3
15:12 11:9 8:6 5:3 2:0
The opcode, which occupies the 4 most significant bits, tells the ALU what
kind of operation to perform. Rs1 and Rs2 are the source registers and Rd is
the destination register; the source registers hold the values used in the
calculation. Because the instruction length is limited, we also add funct3 to
extend the opcode, so we get more operations without increasing the opcode
length. All the operations that my CPU can process are shown below:
Order Command Opcode Funct3 Describe
1 ADD 0000 000 rd ← rs1 + rs2
2 SUB 0000 001 rd ← rs1 – rs2
3 AND 0000 011 rd ← rs1 & rs2
4 OR 0000 100 rd ← rs1 | rs2
5 XOR 0000 101 rd ← rs1 ^ rs2
6 LSL (logical shift left) 0000 110 rd ← rs1 << rs2
7 LSR (logical shift right) 0000 111 rd ← rs1 >> rs2
8 Set on less than 0000 010 rd = (a < b) ? 1 : 0
+ I-type: This type is almost the same as R-type, but it operates on a
constant instead of a second register. In addition, the I-type has 2 commands
that load a value directly into a register. The format has the following
fields:
Opcode Rs1 Rd imm
15:12 11:9 8:6 5:0
Opcode, Rs1 and Rd are the same as in R-type; the difference is the immediate
field, which holds a constant for the calculation.
Order Command Opcode Describe
1 ADDI 0001 rd ← rs1 + imm
2 SUBI 0010 rd ← rs1 - imm
3 ANDI 0101 rd ← rs1 & imm
4 ORI 0110 rd ← rs1 | imm
5 XORI 0100 rd ← rs1 ^ imm
6 LSLI 0011 rd ← rs1 << imm
7 LD 0111 rd ← Mem[rs1 + offset]
Opcode Rd imm
15:12 11:9 8:0
1 LI 1000 rd ← imm
+ S-type: Fundamentally, this command writes a value back to data memory
after calculation; it is the exact reverse of the load command. Here are its
format and details:
Opcode Rs1 Rs2 Imm
15:12 11:9 8:6 5:0
Order Command Opcode Describe
1 ST 1001 Mem[rs1 + offset] ← rs2
With the opcode 1001, the CPU knows that it has to take the address
rs1 + offset from the ALU’s output, where the offset equals the immediate,
and feed it to data memory together with the value of rs2. Because the
control signals from the control unit differ for each kind of instruction, we
separate the instructions into several types so that they are easier to
control later.
+ B-type: Almost all high-level programming languages have conditional
constructs such as if, while, for, and so on. To represent them we have the
B-type:
Opcode Rs1 Rs2 Imm
15:12 11:9 8:6 5:0
Order Command Opcode Describe
1 BEQZ 1010 if(rs1 == rs2) PC += imm
2 BNQZ 1011 if(rs1 ≠ rs2) PC += imm
There are 2 commands in total: BEQZ (branch if equal to zero) and BNQZ
(branch if not equal to zero). Inside the ALU, a zero flag checks whether the
2 registers are equal and generates a control signal. If the condition is
satisfied, the PC is increased by the immediate.
+ J-type: These are like the branch commands above, but J-type commands need
no condition to execute; instead, they jump to any position that we want,
like a goto with a label in a C program.
Opcode Imm
15:12 11:0
Order Command Opcode Describe
1 JMP 1100 PC ← PC + imm
2 JAL 1101 x2 (RA) ← PC + 2 (saved directly into register x2); PC ← PC + imm
+ SYS-type: These are system control instructions, which typically do not
directly manipulate data but instead affect CPU control flow (such as
interrupt handling, function calls, returns, etc.).
Opcode Imm Funct3
15:12 11:3 2:0
Order Command Opcode Funct3 Describe
1 NOP 1110 000 DO NOTHING
2
3 RET 1110 001 COME BACK FROM INTERRUPT
4 EI 1110 010 ENA INTERRUPT
5 DI 1110 011 DIS INTERRUPT
6 CALL 1110 100 Jump to the address and save the current PC onto the stack
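The field layouts above can be checked by packing the fields into a 16-bit word. The following is only a minimal sketch: the module name encode_demo and the functions enc_r and enc_s are hypothetical helpers, not part of the CPU itself. They simply concatenate the fields in the bit order given in the R-type and S-type tables.

```verilog
// Sketch only: encode_demo, enc_r and enc_s are hypothetical helper names.
module encode_demo;
  // R-type: opcode[15:12] rs1[11:9] rs2[8:6] rd[5:3] funct3[2:0]
  function [15:0] enc_r;
    input [3:0] opcode;
    input [2:0] rs1, rs2, rd, funct3;
    begin
      enc_r = {opcode, rs1, rs2, rd, funct3};
    end
  endfunction

  // S-type: opcode[15:12] rs1[11:9] rs2[8:6] imm[5:0]
  function [15:0] enc_s;
    input [3:0] opcode;
    input [2:0] rs1, rs2;
    input [5:0] imm;
    begin
      enc_s = {opcode, rs1, rs2, imm};
    end
  endfunction

  initial begin
    // ADD: x3 <- x4 + x5
    $display("ADD = %h", enc_r(4'b0000, 3'd4, 3'd5, 3'd3, 3'b000)); // 0958
    // ST: Mem[x0 + 5] <- x3
    $display("ST  = %h", enc_s(4'b1001, 3'd0, 3'd3, 6'd5));         // 90c5
  end
endmodule
```

Split into bytes, these two words are 09, 58, 90 and C5, which are the values stored in imem[0] to imem[3] of Ins_Mem.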
The input comes from the PC, and the outputs are the instruction, the opcode,
the register addresses, and funct3, which extends the opcode. Here is my code:
module Ins_Mem(address,opcode,rd,rs1,rs2,funct3,instruction);
input [11:0]address;
output [3:0] opcode;
output [2:0] rd;
output [2:0] rs1;
output [2:0] rs2;
output [2:0] funct3;
wire [2:0] instr_type;
reg [7:0] imem[0:17];
output reg [15:0] instruction;
initial begin
imem[0]=8'b0000_1001; // perform the addition
imem[1]=8'b0101_1000;
imem[2]=8'b1001_0000; // store x3 at address x0 + 5
imem[3]=8'b1100_0101;
imem[4]=8'b0011_0100;
imem[5]=8'b0010_0011;
imem[6]=8'b0100_0000;
imem[7]=8'b0001_0010;
imem[8]=8'b0101_0111;
imem[9]=8'b0010_0010;
imem[10]=8'b0110_0010;
imem[11]=8'b0001_0010;
imem[12]=8'b0111_0001;
imem[13]=8'b0001_0011;
imem[14]=8'b1000_0110;
imem[15]=8'b0001_0011;
imem[16]=8'b1001_0001;
imem[17]=8'b0011_0001;
end
always @(*)begin
instruction = {imem[address],imem[address+1]};
end
// Split the instruction into its fields
assign opcode = instruction[15:12];
assign instr_type =
(opcode == 4'b0000) ? 3'd0 : // R-type
(opcode >= 4'b0001 && opcode <= 4'b1000) ? 3'd1 : // I-type
(opcode == 4'b1001) ? 3'd2 : // S-type
(opcode == 4'b1010 || opcode == 4'b1011) ? 3'd3 : // B-type
(opcode == 4'b1100 || opcode == 4'b1101) ? 3'd4 : // J-type
(opcode == 4'b1110) ? 3'd5 : // SYS-type
3'd7; // unknown
// Register and funct3 fields, selected by instruction type
assign rd = (instr_type == 3'd0) ? instruction[5:3] :
(instr_type == 3'd1) ? instruction[8:6] : 3'b000;
assign rs1 = (instr_type <= 3'd3) ? instruction[11:9] : 3'b000;
assign rs2 = (instr_type == 3'd0 || instr_type == 3'd2 || instr_type == 3'd3) ?
instruction[8:6] : 3'b000;
assign funct3 =
(instr_type == 3'd0 || instr_type == 3'd5) ? instruction[2:0] : 3'b000;
endmodule
1.2. REGISTER FILE:
To access particular registers that hold data or addresses for later use, we
need a block that contains several registers. Now we will build a register
file with 8 registers (x0 to x7, where x0 holds zero), the same as in the
figure shown earlier. I set an initial value for every register to prevent
faults. Here is my code:
module reg_file(rs1,rs2,rd,data,reg_wrt,readA_out,readB_out,r3,clk);
input reg_wrt, clk;
input [2:0]rs1,rs2,rd;
input [15:0]data;
output [15:0]readA_out,readB_out;
output [15:0] r3;
reg [15:0] x [0:7];
initial begin
x[0]=0;//R0 contains zero
x[1]=2; // Stack pointer
x[2]=4; // Return address
x[3]=6; // Function argument/ result
x[4]=8; // General purpose
x[5]=10; // General purpose
x[6]=12; // General purpose
x[7]=14; // Link register/temp/loop/ var
end
always @(posedge clk)
begin
if(reg_wrt==1)
x[rd]<=data;
end
assign readA_out = x[rs1];
assign readB_out = x[rs2];
assign r3 = x[rd];
endmodule
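The comment on x[0] says that R0 contains zero, but in the module above a write with rd = 0 would still overwrite it. If x0 is meant to stay at zero, one option is to ignore such writes. The following is a minimal sketch under that assumption; reg_file_x0 is a hypothetical name, the r3 debug port is left out, and all registers start at zero here rather than at the initial values used above.

```verilog
// Sketch only: a register file variant in which x0 cannot be overwritten.
module reg_file_x0(
  input         clk, reg_wrt,
  input  [2:0]  rs1, rs2, rd,
  input  [15:0] data,
  output [15:0] readA_out, readB_out
);
  reg [15:0] x [0:7];
  integer i;

  initial begin
    for (i = 0; i < 8; i = i + 1)
      x[i] = 16'd0;            // assumption: every register starts at zero
  end

  always @(posedge clk) begin
    // Ignore writes whose destination is x0, so x0 keeps its zero value.
    if (reg_wrt && rd != 3'd0)
      x[rd] <= data;
  end

  assign readA_out = x[rs1];
  assign readB_out = x[rs2];
endmodule
```

With this guard, an instruction that names x0 as its destination leaves the register file unchanged.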
1.3. MUX:
This multiplexer selects which value is allowed through: the data from the
register file or the value from the immediate block. The selected value is
then used for different purposes later in the datapath. This block is reused
several times, because the diagram uses four 2-input muxes.
module MUX_alu_2_1(input[15:0] B, imm,input alu_src, output [15:0] outmux
);
assign outmux = (alu_src == 0) ? B : imm;
endmodule
1.4. ALU CONTROL:
To control the operation of the ALU, we use a block named ALU_control. This
block combines the alu_op and funct3 signals to create control_sig. This
signal is then decoded into the ALU_control signal, which selects an
operation such as ADD, SUB, AND, and so on.
module ALU_control( input [15:0] instr, input [1:0] alu_op, output reg [3:0]
ALU_control);
wire [2:0] funct3;
wire [4:0] control_sig;
assign funct3 = instr[2:0];
assign control_sig = {alu_op,funct3};
always@(control_sig)
begin
case(control_sig)
5'b00_000: ALU_control = 4'b0000; // ADD
5'b00_001: ALU_control = 4'b0001; // SUB
5'b00_010: ALU_control = 4'b0010; // Set on less than
5'b00_011: ALU_control = 4'b0011; // AND
5'b00_100: ALU_control = 4'b0100; // OR
5'b00_101: ALU_control = 4'b0101; // XOR
5'b00_110: ALU_control = 4'b0110; // LSL
5'b00_111: ALU_control = 4'b0111; // LSR
default: ALU_control = 4'b0000;
endcase
end
endmodule
1.5 ALU:
This is one of the most crucial blocks in the CPU. It is responsible for
arithmetic calculations, logic operations, and so on. Here is my code:
module ALU( input [15:0] A,B,input [3:0]alu_op, output reg [15:0] ALU_out,
output zero );
assign zero = (ALU_out==16'd0) ? 1'b1: 1'b0;
always@(*)
begin
case(alu_op)
4'b0000: ALU_out = A + B;
4'b0001: ALU_out = A - B;
4'b0010: begin if (A<B)
ALU_out = 1;
else ALU_out = 0;
end
4'b0011: ALU_out = A & B;
4'b0100: ALU_out = A | B;
4'b0101: ALU_out = A ^ B;
4'b0110: ALU_out = A << B;
4'b0111: ALU_out = A >> B;
default: ALU_out = A + B;
endcase
end
endmodule
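In the ALU above, A and B are plain (unsigned) vectors, so the comparison A<B in the set-on-less-than case treats both operands as unsigned numbers. If a signed comparison is wanted instead, Verilog's $signed system function can be used. The following is a minimal sketch that shows the difference; slt_demo and its signal names are hypothetical.

```verilog
// Sketch only: unsigned versus signed "set on less than".
module slt_demo;
  reg  [15:0] A, B;
  wire slt_unsigned = (A < B);                    // same comparison as in the ALU
  wire slt_signed   = ($signed(A) < $signed(B));  // two's-complement comparison

  initial begin
    A = 16'hFFFF;  // 65535 as unsigned, -1 as signed
    B = 16'd1;
    #1 $display("unsigned=%b signed=%b", slt_unsigned, slt_signed);
    // prints: unsigned=0 signed=1
  end
endmodule
```

Which of the two behaviours is correct depends on whether the registers are meant to hold signed values.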
1.6. DATA MEMORY
Because we decided to build a Harvard architecture CPU, we have 2 separate
memories, and this one holds the data. It stores data from the output of the
ALU or directly from the register file. The output of this block goes back to
the register file, where it is stored for future access.
module data_mem (
input mem_write_en, clk,
input mem_read_en,
input [15:0] addr,
input [15:0] write_data,
output [15:0] read_data
);
reg [15:0] memory [0:2047];
wire [10:0] addr_data;
// Only the low 11 bits index the memory, since it has 2048 words
assign addr_data = addr[10:0];
always @(posedge clk) begin
if (mem_write_en) begin
memory[addr_data] <= write_data;
end
end
assign read_data = (mem_read_en) ? memory[addr_data] : 16'd0;
endmodule
1.7. TOP MODULE
module TOP( input reg_wrt, clk, mem_write_en, mem_read_en, memtoreg, alu_src,
input [2:0] immtype, input [1:0] alu_op, input [11:0] address,
output [15:0] data_reg, readA_out, readB_out, instruction, r3,
output [2:0] rs1, rs2, rd );
wire zero;
wire [3:0] opcode, alu_sel;
wire [2:0] funct3;
wire [15:0] ALU_out, imm_out, outmux, read_data;
Ins_Mem ic1 (.address(address), .opcode(opcode), .rd(rd), .rs1(rs1), .rs2(rs2),
.funct3(funct3), .instruction(instruction));
reg_file ic2 (.reg_wrt(reg_wrt), .rs1(rs1), .rs2(rs2), .rd(rd),
.readA_out(readA_out), .readB_out(readB_out), .data(data_reg), .r3(r3),
.clk(clk));
MUX_alu_2_1 ic3 (.B(readB_out), .imm(imm_out), .alu_src(alu_src), .outmux(outmux));
ALU ic4 (.A(readA_out), .B(outmux), .alu_op(alu_sel), .ALU_out(ALU_out), .zero(zero));
ALU_control ic5 (.instr(instruction), .alu_op(alu_op), .ALU_control(alu_sel));
Imm_gen ic6 (.instruction(instruction), .imm_type(immtype), .imm_out(imm_out));
data_mem ic7 (.mem_write_en(mem_write_en), .clk(clk), .mem_read_en(mem_read_en),
.addr(ALU_out), .write_data(readB_out), .read_data(read_data));
// Write-back mux: memtoreg = 0 selects read_data, memtoreg = 1 selects ALU_out
MUX_alu_2_1 ic8 (.B(read_data), .imm(ALU_out), .alu_src(memtoreg), .outmux(data_reg));
endmodule
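TOP instantiates an Imm_gen block whose code is not listed in this section. The following is only a sketch of what such a block might look like, not the actual Imm_gen. The module name Imm_gen_sketch, the meaning of each imm_type value, and the choice of sign extension are all assumptions; only the port names and the immediate widths (6, 9 and 12 bits) come from the instantiation and the format tables above.

```verilog
// Sketch only: the imm_type encoding and the sign extension are assumptions.
module Imm_gen_sketch(
  input      [15:0] instruction,
  input      [2:0]  imm_type,
  output reg [15:0] imm_out
);
  always @(*) begin
    case (imm_type)
      // assumed: 0 = 6-bit immediate in bits 5:0 (I-type, LD, S-type)
      3'd0:    imm_out = {{10{instruction[5]}},  instruction[5:0]};
      // assumed: 1 = 9-bit immediate in bits 8:0 (LI)
      3'd1:    imm_out = {{7{instruction[8]}},   instruction[8:0]};
      // assumed: 2 = 12-bit immediate in bits 11:0 (J-type)
      3'd2:    imm_out = {{4{instruction[11]}},  instruction[11:0]};
      default: imm_out = 16'd0;
    endcase
  end
endmodule
```

For example, with the store word 1001_000_011_000101 and imm_type = 0, this sketch outputs 5, the offset of that store.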
SIMULATION
1)
Here is the schematic of everything I built. I will observe outputs such as
instruction, rd, rs1, rs2, and so on to check whether my circuit works
properly. Here is my testbench for adding the two registers x4 and x5, with
initial values 8 and 10 respectively, and then storing the result in x3,
whose initial value is 6.
module HA_tb();
reg reg_wrt, mem_write_en, mem_read_en, memtoreg, alu_src;
reg [2:0] immtype;
reg [1:0] alu_op;
reg [11:0] address;
reg clk;
wire [15:0] data_reg;
wire [15:0] instruction;
wire [2:0] rs1;
wire [2:0] rs2;
wire [2:0] rd;
wire [15:0] readA_out;
wire [15:0] readB_out;
wire [15:0] r3;
// Instantiate the top module
TOP uut (
.clk(clk),
.reg_wrt(reg_wrt),
.alu_src(alu_src),
.alu_op(alu_op),
.mem_write_en(mem_write_en),
.mem_read_en(mem_read_en),
.memtoreg(memtoreg),
.immtype(immtype),
.address(address),
.data_reg(data_reg),
.instruction(instruction),
.rs1(rs1),
.rs2(rs2),
.rd(rd),
.r3(r3),
.readA_out(readA_out),
.readB_out(readB_out)
);
initial begin
clk = 0;
forever #5 clk = ~clk;
end
initial begin
address = 12'b0000_0000_0000; // PC address = 0
reg_wrt = 0; // disable writes to reg_file
alu_src = 0; // select the reg_file value as the ALU input
alu_op = 2'b00; // signal from the control unit; 2'b00 corresponds to addition
mem_write_en = 0; // disable writes to data memory
immtype = 0; // immediate not used
mem_read_en = 0; // disable reads from data memory
memtoreg = 1; // the ALU result, not data memory, is written to reg_file
#20;
reg_wrt = 1; // enable writing the ALU result to reg_file
// #5;
// reg_wrt = 0; // disable writes to reg_file
// mem_write_en = 1; //
end
endmodule
We can see that when address = 0, it points to the first instruction inside
the instruction memory, which equals 0000_100_101_011_000: 0000 is the opcode
for adding, 100 is the address of register x4, 101 is the address of register
x5, 011 is the address of register x3, and 000 is the funct3 that complements
the opcode. Once the instruction is decoded into these fields, the register
addresses are fed into reg_file to read the values of those registers. As a
result, readA_out and readB_out show the values of registers x4 and x5.
These values are then calculated in the ALU, and data_reg[15:0], which
carries the output of the ALU, equals 18. Then, when the reg_wrt signal is
activated, the result register r3[15:0] stores the selected value (the output
of the ALU or of the data memory; here, the ALU) at the next rising edge of
clk, so it also equals 18. This process works properly.
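The decoding described above can be reproduced by hand. The following is a minimal sketch, separate from the CPU modules (decode_check is a hypothetical name), that slices the same instruction word into its fields and adds the initial values of x4 and x5 taken from reg_file.

```verilog
// Sketch only: slice the first instruction word and check the expected sum.
module decode_check;
  wire [15:0] instruction = 16'b0000_100_101_011_000;
  wire [3:0]  opcode = instruction[15:12];
  wire [2:0]  rs1    = instruction[11:9];
  wire [2:0]  rs2    = instruction[8:6];
  wire [2:0]  rd     = instruction[5:3];
  wire [2:0]  funct3 = instruction[2:0];
  // Initial register values from reg_file: x4 = 8, x5 = 10
  wire [15:0] sum    = 16'd8 + 16'd10;

  initial begin
    #1 $display("opcode=%b rs1=%0d rs2=%0d rd=%0d funct3=%b sum=%0d",
                opcode, rs1, rs2, rd, funct3, sum);
    // prints: opcode=0000 rs1=4 rs2=5 rd=3 funct3=000 sum=18
  end
endmodule
```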
2)
Here I will test the I-type: load values from data memory into the register
file and then subtract two registers. The steps are summarised below:
Load: x4 ← Mem[0], where Mem[0] = 20
Load: x5 ← Mem[1], where Mem[1] = 6
Sub: x3 ← x4 - x5
Testbench:
module HA_tb();
reg reg_wrt, mem_write_en, mem_read_en, memtoreg, alu_src;
reg [2:0] immtype;
reg [1:0] alu_op;
reg [11:0] address;
reg clk;
wire [15:0] data_reg;
wire [15:0] instruction;
wire [2:0] rs1;
wire [2:0] rs2;
wire [2:0] rd;
wire [15:0] readA_out;
wire [15:0] readB_out;
wire [15:0] r3;
// Instantiate the top module
TOP uut (
.clk(clk),
.reg_wrt(reg_wrt),
.alu_src(alu_src),
.alu_op(alu_op),
.mem_write_en(mem_write_en),
.mem_read_en(mem_read_en),
.memtoreg(memtoreg),
.immtype(immtype),
.address(address),
.data_reg(data_reg),
.instruction(instruction),
.rs1(rs1),
.rs2(rs2),
.rd(rd),
.r3(r3),
.readA_out(readA_out),
.readB_out(readB_out)
);
initial begin
clk = 0;
forever #5 clk = ~clk;
end
initial begin
// load mem[addr + 0] -> x4
address = 12'b0000_0000_0000; // PC address = 0
reg_wrt = 1; // enable writes to reg_file
alu_src = 1; // select the imm_gen value as the ALU input
alu_op = 2'b00; // signal from the control unit; 2'b00 corresponds to addition
mem_write_en = 0; // disable writes to data memory
immtype = 0;
mem_read_en = 1; // enable reads from data memory
memtoreg = 0; // data is taken from data memory
#10;
// load mem[addr + 1] -> x5
address = 12'b0000_0000_0010;
#10;
// Sub, x3 <- x4 - x5
address = 12'b0000_0000_0100;
alu_src = 0;
memtoreg = 1;
#10;
address = 12'b0000_0000_0110;
end
endmodule