Sample code: implementation of a single-cycle CPU
· The Verilog HDL Codes of the CPU
· The following Verilog HDL code implements the single-cycle computer
with an interrupt/exception mechanism. It instantiates sccpu_intr (single-cycle
CPU), sci_intr (instruction memory), and scd_intr (data memory).
module sc_interrupt (clk,clrn,inst,pc,aluout,memout,memclk,intr,inta);
input clk, clrn; // clock and reset
input memclk; // synch ram clock
input intr; // interrupt request
output inta; // interrupt acknowledge
output [31:0] pc; // program counter
output [31:0] inst; // instruction
output [31:0] aluout; // alu output
output [31:0] memout; // data memory output
wire [31:0] data; // data to data memory
wire wmem; // write data memory
sccpu_intr cpu (clk,clrn,inst,memout,pc,wmem,aluout,data,intr,inta);
sci_intr im (pc,inst); // inst memory
scd_intr dm (memout,data,aluout,wmem,memclk); // data memory
endmodule
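· A minimal simulation testbench for the top-level module might look like the sketch below. It is not part of the original listing: the testbench name, the clock ratio (memclk at twice the rate of clk), the active-low reset, and the time at which intr is raised are all assumptions made for illustration. Simulating it also requires a simulation model of the Altera LPM memories used further down.

```verilog
`timescale 1ns/1ns
module sc_interrupt_tb;                      // hypothetical testbench name
    reg         clk, clrn, memclk, intr;
    wire        inta;
    wire [31:0] pc, inst, aluout, memout;

    // port order follows the sc_interrupt module header above
    sc_interrupt dut (clk,clrn,inst,pc,aluout,memout,memclk,intr,inta);

    always #1 memclk = ~memclk;              // assumed: memclk at twice the clk rate
    always #2 clk    = ~clk;

    initial begin
        clk = 1; memclk = 0; clrn = 0; intr = 0;
        #1   clrn = 1;                       // release reset (assumed active low)
        #99  intr = 1;                       // request an interrupt at an arbitrary time
        #400 $finish;
    end

    always @(posedge clk)
        if (inta) intr <= 0;                 // withdraw the request once acknowledged

    always @(posedge clk)
        $display("t=%0t pc=%h inst=%h inta=%b", $time, pc, inst, inta);
endmodule
```

Because inta is sta[0] & intr, the request is only acknowledged after the program in instruction memory has set the enable bit in the status register.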
· The following Verilog HDL code implements a single-cycle CPU with
an interrupt/exception mechanism.
module sccpu_intr (clk,clrn,inst,mem,pc,wmem,alu,data,intr,inta);
input [31:0] inst; // inst from inst memory
input [31:0] mem; // data from data memory
input intr; // interrupt request
input clk, clrn; // clock and reset
output [31:0] pc; // program counter
output [31:0] alu; // alu output
output [31:0] data; // data to data memory
output wmem; // write data memory
output inta; // interrupt acknowledge
parameter BASE = 32'h00000008; // exc/int handler entry
parameter ZERO = 32'h00000000; // zero
// instruction fields
wire [5:0] op = inst[31:26]; // op
wire [4:0] rs = inst[25:21]; // rs
wire [4:0] rt = inst[20:16]; // rt
wire [4:0] rd = inst[15:11]; // rd
wire [5:0] func = inst[05:00]; // func
wire [15:0] imm = inst[15:00]; // immediate
wire [25:0] addr = inst[25:00]; // address
// control signals
wire [3:0] aluc; // alu operation control
wire [1:0] pcsrc; // select pc source
wire wreg; // write regfile
wire regrt; // dest reg number is rt
wire m2reg; // instruction is an lw
wire shift; // instruction is a shift
wire aluimm; // alu input b is an i32
wire jal; // instruction is a jal
wire sext; // is sign extension
wire [1:0] mfc0; // move from c0 regs
wire [1:0] selpc; // select for pc
wire v; // overflow
wire exc; // exc or int occurs
wire wsta; // write status reg
wire wcau; // write cause reg
wire wepc; // write epc reg
wire mtc0; // move to c0 regs
// datapath wires
wire [31:0] p4; // pc+4
wire [31:0] bpc; // branch target address
wire [31:0] npc; // next pc, not exc/int
wire [31:0] qa; // regfile output port a
wire [31:0] qb; // regfile output port b
wire [31:0] alua; // alu input a
wire [31:0] alub; // alu input b
wire [31:0] wd; // regfile write port data
wire [31:0] r; // alu out or mem
wire [31:0] sa = {27'b0,inst[10:6]}; // shift amount
wire [15:0] s16 = {16{sext & inst[15]}}; // 16-bit signs
wire [31:0] i32 = {s16,imm}; // 32-bit immediate
wire [31:0] dis = {s16[13:0],imm,2'b00}; // word distance
wire [31:0] jpc = {p4[31:28],addr,2'b00}; // jump target address
wire [4:0] reg_dest; // rd or rt
wire [4:0] wn = reg_dest | {5{jal}}; // regfile write reg #
wire z; // alu zero tag
wire [31:0] sta; // output of status reg
wire [31:0] cau; // output of cause reg
wire [31:0] epc; // output of epc reg
wire [31:0] sta_in; // data in for status reg
wire [31:0] cau_in; // data in for cause reg
wire [31:0] epc_in; // data in for epc reg
wire [31:0] sta_lr; // status left/right shift
wire [31:0] pc_npc; // pc or npc
wire [31:0] cause; // exc/int cause
wire [31:0] res_c0; // r or c0 regs
wire [31:0] n_pc; // next pc
wire [31:0] sta_r = {4'h0,sta[31:4]}; // status >> 4
wire [31:0] sta_l = {sta[27:0],4'h0}; // status << 4
// control unit
sccu_intr cu (op,rs,rd,func,z,wmem,wreg, // control unit
regrt,m2reg,aluc,shift,
aluimm,pcsrc,jal,sext,
intr,inta,v,sta, // exc/int signals
cause,exc,wsta,wcau,
wepc,mtc0,mfc0,selpc);
// datapath
dff32 i_point (n_pc,clk,clrn,pc); // pc register
cla32 pcplus4 (pc,32'h4,1'b0,p4); // pc + 4
cla32 br_addr (p4,dis,1'b0,bpc); // branch target address
mux2x32 alu_a (qa,sa,shift,alua); // alu input a
mux2x32 alu_b (qb,i32,aluimm,alub); // alu input b
mux2x32 alu_m (alu,mem,m2reg,r); // alu out or mem
mux2x32 link (res_c0,p4,jal,wd); // res_c0 or p4
mux2x5 reg_wn (rd,rt,regrt,reg_dest); // rd or rt
mux4x32 nextpc(p4,bpc,qa,jpc,pcsrc,npc); // next pc, not exc/int
regfile rf (rs,rt,wd,wn,wreg,clk,clrn,qa,qb); // register file
alu_ov alunit (alua,alub,aluc,alu,z,v); // alu_ov, z and v tags
dffe32 c0sta (sta_in,clk,clrn,wsta,sta); // c0 status register
dffe32 c0cau (cau_in,clk,clrn,wcau,cau); // c0 cause register
dffe32 c0epc (epc_in,clk,clrn,wepc,epc); // c0 epc register
mux2x32 cau_x (cause,qb,mtc0,cau_in); // mux for cause reg
mux2x32 sta_1 (sta_r,sta_l,exc,sta_lr); // mux1 for status reg
mux2x32 sta_2 (sta_lr,qb,mtc0,sta_in); // mux2 for status reg
mux2x32 epc_1 (pc,npc,inta,pc_npc); // mux1 for epc reg
mux2x32 epc_2 (pc_npc,qb,mtc0,epc_in); // mux2 for epc reg
mux4x32 nxtpc (npc,epc,BASE,ZERO,selpc,n_pc); // mux for pc
mux4x32 fr_c0 (r,sta,cau,epc,mfc0,res_c0); // r or c0 regs
assign data = qb; // regfile output port b
endmodule
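· sccpu_intr instantiates several small building blocks that are not listed here. As a hedged sketch, two of them (dffe32 and mux4x32) could be written as follows. The port order is inferred from the instantiations above; the asynchronous active-low clear is an assumption suggested by the name clrn, not something the listing confirms.

```verilog
// 32-bit D flip-flop with write enable (assumed behavior)
module dffe32 (d,clk,clrn,e,q);
    input  [31:0] d;                         // data in
    input         clk, clrn;                 // clock, clear (assumed active low)
    input         e;                         // write enable
    output [31:0] q;                         // data out
    reg    [31:0] q;
    always @(posedge clk or negedge clrn) begin
        if (!clrn)  q <= 32'h0;              // clear the register
        else if (e) q <= d;                  // update only when enabled
    end
endmodule

// 4-to-1 multiplexer, 32 bits wide
module mux4x32 (a0,a1,a2,a3,s,y);
    input  [31:0] a0, a1, a2, a3;            // data inputs
    input  [1:0]  s;                         // select
    output [31:0] y;                         // selected input
    assign y = s[1] ? (s[0] ? a3 : a2)
                    : (s[0] ? a1 : a0);
endmodule
```

With this enable behavior, the status, cause, and epc registers keep their values unless wsta, wcau, or wepc is asserted by the control unit.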
· The following Verilog HDL code implements the control unit of the single-
cycle CPU. Control signals related to interrupts and exceptions are added.
module sccu_intr (op,op1,rd,func,z,wmem,wreg,regrt,m2reg,aluc,shift,aluimm,
pcsrc,jal,sext,intr,inta,v,sta,cause,exc,wsta,wcau,wepc,
mtc0,mfc0,selpc); // control unit
input [31:0] sta; // c0 status
input [5:0] op, func; // op, func
input [4:0] op1, rd; // op1, rd
input z, v; // z, v flags
input intr; // interrupt request
output [31:0] cause; // c0 cause
output [3:0] aluc; // alu control
output [1:0] mfc0; // move from c0 regs
output [1:0] selpc; // select for pc
output [1:0] pcsrc; // select pc source
output wreg,regrt,jal,m2reg,shift,aluimm,sext,wmem;
output inta; // interrupt ack
output exc; // exc or int occurs
output wsta; // write status reg
output wcau; // write cause reg
output wepc; // write epc reg
output mtc0; // move to c0 regs
wire rtype = ~|op; // r format
wire i_add = rtype &  func[5] & ~func[4] & ~func[3] & ~func[2] & ~func[1] & ~func[0];
wire i_sub = rtype &  func[5] & ~func[4] & ~func[3] & ~func[2] &  func[1] & ~func[0];
wire i_and = rtype &  func[5] & ~func[4] & ~func[3] &  func[2] & ~func[1] & ~func[0];
wire i_or  = rtype &  func[5] & ~func[4] & ~func[3] &  func[2] & ~func[1] &  func[0];
wire i_xor = rtype &  func[5] & ~func[4] & ~func[3] &  func[2] &  func[1] & ~func[0];
wire i_sll = rtype & ~func[5] & ~func[4] & ~func[3] & ~func[2] & ~func[1] & ~func[0];
wire i_srl = rtype & ~func[5] & ~func[4] & ~func[3] & ~func[2] &  func[1] & ~func[0];
wire i_sra = rtype & ~func[5] & ~func[4] & ~func[3] & ~func[2] &  func[1] &  func[0];
wire i_jr  = rtype & ~func[5] & ~func[4] &  func[3] & ~func[2] & ~func[1] & ~func[0];
wire i_addi = ~op[5] & ~op[4] &  op[3] & ~op[2] & ~op[1] & ~op[0]; // i format
wire i_andi = ~op[5] & ~op[4] &  op[3] &  op[2] & ~op[1] & ~op[0];
wire i_ori  = ~op[5] & ~op[4] &  op[3] &  op[2] & ~op[1] &  op[0];
wire i_xori = ~op[5] & ~op[4] &  op[3] &  op[2] &  op[1] & ~op[0];
wire i_lw   =  op[5] & ~op[4] & ~op[3] & ~op[2] &  op[1] &  op[0];
wire i_sw   =  op[5] & ~op[4] &  op[3] & ~op[2] &  op[1] &  op[0];
wire i_beq  = ~op[5] & ~op[4] & ~op[3] &  op[2] & ~op[1] & ~op[0];
wire i_bne  = ~op[5] & ~op[4] & ~op[3] &  op[2] & ~op[1] &  op[0];
wire i_lui  = ~op[5] & ~op[4] &  op[3] &  op[2] &  op[1] &  op[0];
wire i_j    = ~op[5] & ~op[4] & ~op[3] & ~op[2] &  op[1] & ~op[0]; // j format
wire i_jal  = ~op[5] & ~op[4] & ~op[3] & ~op[2] &  op[1] &  op[0];
wire c0type = ~op[5] &  op[4] & ~op[3] & ~op[2] & ~op[1] & ~op[0];
wire i_mfc0 = c0type & ~op1[4] & ~op1[3] & ~op1[2] & ~op1[1] & ~op1[0];
wire i_mtc0 = c0type & ~op1[4] & ~op1[3] &  op1[2] & ~op1[1] & ~op1[0];
wire i_eret = c0type &  op1[4] & ~op1[3] & ~op1[2] & ~op1[1] & ~op1[0] &
 ~func[5] &  func[4] &  func[3] & ~func[2] & ~func[1] & ~func[0];
wire i_syscall = rtype &
 ~func[5] & ~func[4] &  func[3] &  func[2] & ~func[1] & ~func[0];
wire unimplemented_inst = ~(i_mfc0 | i_mtc0 | i_eret | i_syscall |
 i_add | i_sub | i_and | i_or | i_xor | i_sll | i_srl | i_sra |
 i_jr | i_addi | i_andi | i_ori | i_xori | i_lw | i_sw | i_beq |
 i_bne | i_lui | i_j | i_jal);
wire rd_is_status = (rd == 5'd12); // is cp0 status reg
wire rd_is_cause = (rd == 5'd13); // is cp0 cause reg
wire rd_is_epc = (rd == 5'd14); // is cp0 epc reg
wire overflow = v & (i_add | i_sub | i_addi); // overflow
wire int_int = sta[0] & intr; // sta[0]: enable
wire exc_sys = sta[1] & i_syscall; // sta[1]: enable
wire exc_uni = sta[2] & unimplemented_inst; // sta[2]: enable
wire exc_ovr = sta[3] & overflow; // sta[3]: enable
assign inta = int_int; // interrupt ack
// exccode: 0 0 : intr               // generate exccode
//          0 1 : i_syscall
//          1 0 : unimplemented_inst
//          1 1 : overflow
wire exccode0 = i_syscall | overflow;
wire exccode1 = unimplemented_inst | overflow;
// mfc0: 0 0 : alu_mem               // generate mux sel
//       0 1 : sta
//       1 0 : cau
//       1 1 : epc
assign mfc0[0] = i_mfc0 & rd_is_status | i_mfc0 & rd_is_epc;
assign mfc0[1] = i_mfc0 & rd_is_cause | i_mfc0 & rd_is_epc;
// selpc: 0 0 : npc                  // generate mux sel
//        0 1 : epc
//        1 0 : exc_base
//        1 1 : x
assign selpc[0] = i_eret;
assign selpc[1] = exc;
assign cause = {28'h0,exccode1,exccode0,2'b00}; // cause
assign exc = int_int | exc_sys | exc_uni | exc_ovr; // exc or int occurs
assign mtc0 = i_mtc0; // highest priority
assign wsta = exc | mtc0 & rd_is_status | i_eret; // write status reg
assign wcau = exc | mtc0 & rd_is_cause; // write cause reg
assign wepc = exc | mtc0 & rd_is_epc; // write epc reg
assign regrt = i_addi| i_andi| i_ori| i_xori| i_lw | i_lui| i_mfc0;
assign jal = i_jal;
assign m2reg = i_lw;
assign wmem = i_sw;
assign aluc[3] = i_sra; // refer to alu_ov.v
assign aluc[2] = i_sub| i_or| i_srl| i_sra| i_ori| i_lui;
assign aluc[1] = i_xor| i_sll| i_srl| i_sra| i_xori| i_beq| i_bne| i_lui;
assign aluc[0] = i_and| i_or| i_sll| i_srl| i_sra| i_andi| i_ori;
assign shift = i_sll | i_srl | i_sra;
assign aluimm = i_addi| i_andi| i_ori| i_xori| i_lw | i_lui| i_sw;
assign sext = i_addi| i_lw | i_sw | i_beq | i_bne;
assign pcsrc[1] = i_jr | i_j | i_jal;
assign pcsrc[0] = i_beq & z | i_bne & ~z | i_j | i_jal;
assign wreg = i_add | i_sub | i_and| i_or | i_xor| i_sll| i_srl| i_sra|
i_addi| i_andi| i_ori| i_xori| i_lw | i_lui| i_jal| i_mfc0;
endmodule
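· As a quick check of the exception signals, the sketch below drives the control unit with a syscall encoding while all four enable bits of the status register are set. The testbench name and the chosen stimulus are assumptions for illustration; the expected values in the comment follow from the equations in sccu_intr above.

```verilog
module sccu_intr_tb;                         // hypothetical testbench name
    reg  [5:0]  op, func;
    reg  [4:0]  op1, rd;
    reg         z, v, intr;
    reg  [31:0] sta;
    wire [31:0] cause;
    wire [3:0]  aluc;
    wire [1:0]  pcsrc, mfc0, selpc;
    wire        wmem, wreg, regrt, m2reg, shift, aluimm, jal, sext;
    wire        inta, exc, wsta, wcau, wepc, mtc0;

    // port order follows the sccu_intr module header above
    sccu_intr cu (op,op1,rd,func,z,wmem,wreg,regrt,m2reg,aluc,shift,aluimm,
                  pcsrc,jal,sext,intr,inta,v,sta,cause,exc,wsta,wcau,wepc,
                  mtc0,mfc0,selpc);

    initial begin
        op  = 6'b000000; func = 6'b001100;   // syscall: r format, func = 001100
        op1 = 5'd0; rd = 5'd0;
        z = 0; v = 0; intr = 0;
        sta = 32'h0000000f;                  // sta[3:0]: all four enables set
        #1 $display("exc=%b cause=%h selpc=%b", exc, cause, selpc);
        // from the equations: exc=1 cause=00000004 selpc=10
        $finish;
    end
endmodule
```

Here exccode is 01 (syscall), which sits in cause[3:2], and selpc = 10 steers the next pc to the handler entry BASE in sccpu_intr.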
· The following Verilog HDL code implements the ALU. An overflow flag output,
v, is added to the ALU.
module alu_ov (a,b,aluc,r,z,v); // 32-bit alu with zero and overflow flags
input [31:0] a, b; // inputs: a, b
input [3:0] aluc; // input: alu control: // aluc[3:0]:
output [31:0] r; // output: alu result // x 0 0 0 ADD
output z, v; // outputs: zero, overflow // x 1 0 0 SUB
wire [31:0] d_and = a & b; // x 0 0 1 AND
wire [31:0] d_or = a | b; // x 1 0 1 OR
wire [31:0] d_xor = a ^ b; // x 0 1 0 XOR
wire [31:0] d_lui = {b[15:0],16'h0}; // x 1 1 0 LUI
wire [31:0] d_and_or = aluc[2]? d_or : d_and; // 0 0 1 1 SLL
wire [31:0] d_xor_lui = aluc[2]? d_lui : d_xor; // 0 1 1 1 SRL
wire [31:0] d_as,d_sh; // 1 1 1 1 SRA
// addsub32 (a,b,sub, s);
addsub32 as32 (a,b,aluc[2],d_as); // add/sub
// shift (d,sa, right, arith, sh);
shift shifter (b,a[4:0],aluc[2],aluc[3],d_sh); // shift
// mux4x32 (a0, a1, a2, a3, s, y);
mux4x32 res (d_as,d_and_or,d_xor_lui,d_sh,aluc[1:0],r); // alu result
assign z = ~|r; // z = (r == 0)
assign v = ~aluc[2] & ~a[31] & ~b[31] &  r[31] & ~aluc[1] & ~aluc[0] |
           ~aluc[2] &  a[31] &  b[31] & ~r[31] & ~aluc[1] & ~aluc[0] |
            aluc[2] & ~a[31] &  b[31] &  r[31] & ~aluc[1] & ~aluc[0] |
            aluc[2] &  a[31] & ~b[31] & ~r[31] & ~aluc[1] & ~aluc[0];
endmodule
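· The four product terms that generate v can be read as one rule: for ADD or SUB, overflow occurs when the two operands effectively have the same sign and the result has the opposite sign. The sketch below restates that rule in a more compact form. The module ov_flag and its port names are hypothetical and are not part of the original design.

```verilog
module ov_flag (a31,b31,r31,aluc,v);         // hypothetical helper module
    input        a31, b31, r31;              // sign bits of a, b, and result r
    input  [3:0] aluc;                       // alu control, as in alu_ov
    output       v;                          // overflow
    wire is_addsub = ~aluc[1] & ~aluc[0];    // aluc = x000 (ADD) or x100 (SUB)
    wire b_eff     = b31 ^ aluc[2];          // sign of b, inverted for SUB
    // overflow: operand signs agree, result sign differs
    assign v = is_addsub & (a31 == b_eff) & (r31 != a31);
endmodule
```

For example, 32'h7fffffff + 32'h00000001 gives 32'h80000000: both operands are non-negative and the result is negative, so v = 1. For the other ALU operations is_addsub is 0 and v stays 0.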
· The following Verilog HDL code implements the instruction memory. Instead of
using general Verilog HDL statements, here we show how to use an LPM (library of
parameterized modules) module, provided by Altera, to implement memories. lpm_rom is a
read-only memory module that can be initialized with a memory initialization file.
module sci_intr (a,inst); // inst mem (rom)
input [31:0] a; // mem address
output [31:0] inst; // mem data output
lpm_rom rom (.address(a[7:2]), // word address
.q(inst), // mem data output
.inclock(), // no clock
.outclock(), // no clock
.memenab()); // no write enable
defparam rom.lpm_width = 32, // data: 32 bits
rom.lpm_widthad = 6, // 2 ^ 6 = 64 words
rom.lpm_file = "sci_intr.hex", // mem init file
rom.lpm_outdata = "UNREGISTERED", // no reg (data)
rom.lpm_address_control = "UNREGISTERED"; // no reg (addr)
endmodule
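· For simulators without the Altera LPM library, a roughly equivalent instruction memory can be written with general Verilog HDL statements. The sketch below is an assumption, not the original design: the module name sci_intr_beh and the file name sci_intr_words.txt are hypothetical, and the file is assumed to hold one 32-bit hexadecimal word per line, as $readmemh expects, which may differ from the format of sci_intr.hex.

```verilog
module sci_intr_beh (a,inst);                // hypothetical behavioral inst mem
    input  [31:0] a;                         // mem address
    output [31:0] inst;                      // mem data output
    reg    [31:0] rom [0:63];                // 64 words, as in lpm_widthad = 6
    initial $readmemh("sci_intr_words.txt", rom); // hypothetical init file
    assign inst = rom[a[7:2]];               // unregistered read, word address
endmodule
```

The read is purely combinational, mirroring the "UNREGISTERED" settings of the lpm_rom instance.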
· The following Verilog HDL code implements the data memory, again using LPM. lpm_ram_dq is a
synchronous random access memory module that needs a clock. The input signals of the
memory, including address, data in, and write control, must be registered using inclock. The output
signal can be either unregistered or registered using outclock.
module scd_intr (dataout,datain,addr,we,memclk); // data mem (sram)
input [31:0] datain; // mem data input
input [31:0] addr; // mem address
input we; // write enable
input memclk; // sync ram clock
output [31:0] dataout; // mem data output
wire inclk = memclk; // in reg clock
wire outclk = memclk; // out reg clock
lpm_ram_dq ram (.data(datain), // data in
.address(addr[6:2]), // word address
.we(we), // write enable
.inclock(inclk), // in reg clock
.outclock(outclk), // out reg clock
.q(dataout)); // mem data out
defparam ram.lpm_width = 32; // data: 32 bits
defparam ram.lpm_widthad = 5; // 2 ^ 5 = 32 words
defparam ram.lpm_file = "scd_intr.hex"; // mem init file
defparam ram.lpm_indata = "REGISTERED"; // in reg (data)
defparam ram.lpm_outdata = "REGISTERED"; // out reg (data)
defparam ram.lpm_address_control = "REGISTERED"; // in reg (a, we)
endmodule
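· To illustrate what registering the inputs and the output means, the sketch below gives a rough behavioral approximation of the data memory. It is an assumption for illustration only: the module name scd_intr_beh is hypothetical, memory initialization is omitted, and the exact write timing and read-during-write behavior of lpm_ram_dq are not modeled.

```verilog
module scd_intr_beh (dataout,datain,addr,we,memclk); // hypothetical model
    input  [31:0] datain;                    // mem data input
    input  [31:0] addr;                      // mem address
    input         we;                        // write enable
    input         memclk;                    // sync ram clock
    output [31:0] dataout;                   // mem data output
    reg    [31:0] dataout;                   // output register
    reg    [31:0] ram [0:31];                // 32 words, as in lpm_widthad = 5
    reg    [31:0] d_r;                       // registered data in
    reg    [4:0]  a_r;                       // registered word address
    reg           we_r;                      // registered write enable
    always @(posedge memclk) begin
        d_r  <= datain;                      // input registers capture
        a_r  <= addr[6:2];                   //   data, address, and we
        we_r <= we;
        if (we_r) ram[a_r] <= d_r;           // write uses registered inputs
        dataout <= ram[a_r];                 // read result goes to output reg
    end
endmodule
```

In this model a read needs two rising edges of memclk after the address becomes valid: one to capture the address and one to load the output register.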