VR 4300
VR4300™, VR4305™, VR4310™
64-Bit Microprocessor
µPD30200
µPD30210
© 1996, 1998
© MIPS Technologies, Inc. 1994
Printed in Japan
VR Series, VR4300 Series, VR3000, VR4000, VR4100, VR4200, VR4300, VR4305, VR4310, and VR4400 are
trademarks of NEC Corporation.
UNIX is a registered trademark licensed by X/Open Company Limited in the US and other countries.
MC68000 is a trademark of Motorola Inc.
IBM370 is a trademark of International Business Machines Corporation.
iAPX is a trademark of Intel Corporation.
DEC VAX is a trademark of Digital Equipment Corporation.
MIPS is a registered trademark of MIPS Technologies, Inc. in the U.S.A.
• The information in this document is current as of October, 1999. The information is subject to
  change without notice. For actual design-in, refer to the latest publications of NEC's data sheets or
  data books, etc., for the most up-to-date specifications of NEC semiconductor products. Not all
  products and/or types are available in every country. Please check with an NEC sales representative
  for availability and additional information.
• No part of this document may be copied or reproduced in any form or by any means without prior
  written consent of NEC. NEC assumes no responsibility for any errors that may appear in this document.
• NEC does not assume any liability for infringement of patents, copyrights or other intellectual property rights of
  third parties by or arising from the use of NEC semiconductor products listed in this document or any other
  liability arising from the use of such products. No license, express, implied or otherwise, is granted under any
  patents, copyrights or other intellectual property rights of NEC or others.
• Descriptions of circuits, software and other related information in this document are provided for illustrative
  purposes in semiconductor product operation and application examples. The incorporation of these
  circuits, software and information in the design of customer's equipment shall be done under the full
  responsibility of customer. NEC assumes no responsibility for any losses incurred by customers or third
  parties arising from the use of these circuits, software and information.
• While NEC endeavours to enhance the quality, reliability and safety of NEC semiconductor products, customers
  agree and acknowledge that the possibility of defects thereof cannot be eliminated entirely. To minimize
  risks of damage to property or injury (including death) to persons arising from defects in NEC
  semiconductor products, customers must incorporate sufficient safety measures in their design, such as
  redundancy, fire-containment, and anti-failure features.
• NEC semiconductor products are classified into the following three quality grades:
  "Standard", "Special" and "Specific". The "Specific" quality grade applies only to semiconductor products
  developed based on a customer-designated "quality assurance program" for a specific application. The
  recommended applications of a semiconductor product depend on its quality grade, as indicated below.
  Customers must check the quality grade of each semiconductor product before using it in a particular
  application.
   "Standard": Computers, office equipment, communications equipment, test and measurement equipment, audio
                  and visual equipment, home electronic appliances, machine tools, personal electronic equipment
                  and industrial robots
    "Special": Transportation equipment (automobiles, trains, ships, etc.), traffic control systems, anti-disaster
                  systems, anti-crime systems, safety equipment and medical equipment (not specifically designed
                  for life support)
    "Specific": Aircraft, aerospace equipment, submersible repeaters, nuclear reactor control systems, life
                  support systems and medical equipment for life support, etc.
  The quality grade of NEC semiconductor products is "Standard" unless otherwise expressly specified in NEC's
  data sheets or data books, etc. If customers wish to use NEC semiconductor products in applications not
  intended by NEC, they must contact an NEC sales representative in advance to determine NEC's willingness
  to support a given application.
  (Note)
  (1) "NEC" as used in this statement means NEC Corporation and also includes its majority-owned subsidiaries.
  (2) "NEC semiconductor products" means any semiconductor product developed or manufactured by or for
       NEC (as defined above).
• Device availability
• Ordering information
•   Development environment specifications (for example, specifications for third-party tools and
    components, host computers, power plugs, AC supply voltages, and so forth)
• Network requirements
In addition, trademarks, registered trademarks, export restrictions, and other legal issues may also vary
from country to country.
NEC Electronics Inc. (U.S.)              NEC Electronics (Germany) GmbH      NEC Electronics Hong Kong Ltd.
Santa Clara, California                  Benelux Office                      Hong Kong
Tel: 408-588-6000                        Eindhoven, The Netherlands          Tel: 2886-9318
     800-366-9782                        Tel: 040-2445845                    Fax: 2886-9022/9044
Fax: 408-588-6130                        Fax: 040-2444580
      800-729-9288                                                           NEC Electronics Hong Kong Ltd.
                                         NEC Electronics (France) S.A.       Seoul Branch
NEC Electronics (Germany) GmbH           Velizy-Villacoublay, France         Seoul, Korea
Duesseldorf, Germany                     Tel: 01-30-67 58 00                 Tel: 02-528-0303
Tel: 0211-65 03 02                       Fax: 01-30-67 58 99                 Fax: 02-528-4411
Fax: 0211-65 03 490
                                         NEC Electronics (France) S.A.       NEC Electronics Singapore Pte. Ltd.
NEC Electronics (UK) Ltd.                Madrid Office                       United Square, Singapore
Milton Keynes, UK                        Madrid, Spain                       Tel: 65-253-8311
Tel: 01908-691-133                       Tel: 91-504-2787                    Fax: 65-250-3583
Fax: 01908-670-290                       Fax: 91-504-2860
                                                                             NEC Electronics Taiwan Ltd.
NEC Electronics Italiana s.r.l.          NEC Electronics (Germany) GmbH      Taipei, Taiwan
Milano, Italy                            Scandinavia Office                  Tel: 02-2719-2377
Tel: 02-66 75 41                         Taeby, Sweden                       Fax: 02-2719-5951
Fax: 02-66 75 42 99                      Tel: 08-63 80 820
                                         Fax: 08-63 80 388                   NEC do Brasil S.A.
                                                                             Electron Devices Division
                                                                             Guarulhos-SP Brasil
                                                                             Tel: 55-11-6462-6810
                                                                             Fax: 55-11-6462-6829
    Page                                                Description
p.33          1.1 Characteristics Correction of description
p.35          1.4.1 Internal Block Configuration Correction of description
p.166         6.3.5 Status Register (12) Correction of description
p.198         6.4.17 Watch Exception Correction and addition of description
p.244         8.2.7 Unimplemented Operation Exception (E) Addition of description
p.254         9.3.1 Power Modes Correction of description
pp.259, 260   10.2 Basic System Clocks Correction of description
p.264         10.4 Low Power Mode Operation Correction of description
p.360         15.1 Features Correction of description
p.360         15.1.2 Low Power Mode Correction of description
              17.5 FPU Instructions Addition of description to the following instructions
p.568            CEIL.L.fmt
p.570            CEIL.W.fmt
p.574            CVT.D.fmt
p.576            CVT.L.fmt
p.578            CVT.S.fmt
p.580            CVT.W.fmt
p.587            FLOOR.L.fmt
p.589            FLOOR.W.fmt
p.600            ROUND.L.fmt
p.602            ROUND.W.fmt
p.610            TRUNC.L.fmt
p.612            TRUNC.W.fmt
p.628         Table A-1 Differences Between the VR4300, VR4305, and VR4310 Correction of description
p.630         B.1.3 Status Register Correction of description
p.632         Table B-1 Differences in Software Correction of description
p.634         B.2.2 System Interface Correction of description
p.635         Table B-2 Differences in System Design Correction of description
p.639         Table B-3 Other Differences Correction of description
p.644         C.2.2 Clock Correction of description
pp.647, 648   Appendix D Restrictions of VR4300 Addition
Readers                   This manual is intended for users who wish to understand the functions of
                          the VR4300, VR4305 (µPD30200), and VR4310 (µPD30210) and to design
                          application systems using these microprocessors.
How to read this manual   It is assumed that readers of this manual have a general knowledge
                          of electrical engineering, logic circuits, and microcomputers.
                          VR4300 → VR4305
                          VR4300 → VR4310
Conventions                Data significance:        Higher digits on the left and lower digits on
                                                     the right
                           Active low                ××× (overscore over pin or signal name)
                           representation:
                           *:                       Footnote for item marked with * in the text
                           Caution:                 Information requiring particular attention
                           Remark:                  Supplementary information
                           Numerical                binary or decimal ... ××××
                           representation:          hexadecimal ........... 0x××××
                           Prefixes indicating power of 2 (address space, memory capacity):
                                                    K (kilo) 2^10 = 1024
                                                    M (mega) 2^20 = 1024^2
                                                    G (giga) 2^30 = 1024^3
                                                    T (tera) 2^40 = 1024^4
                                                    P (peta) 2^50 = 1024^5
                                                    E (exa)  2^60 = 1024^6
Appendix E Index...................................................................................649
         11-1    Stall Cycle Count for Data Cache Miss ................................... 281
         11-2    Stall Cycle Count for Instruction Cache Miss ....................... 282
A-1 Differences Between the VR4300, VR4305, and VR4310 ..... 628
1.1 Characteristics
            The VR4300, VR4305, and VR4310 are members of the NEC VR Series™ of RISC
            (Reduced Instruction Set Computer) microprocessors. Each is a high-performance
            64-bit microprocessor employing the RISC architecture developed by MIPS™.
            Their instructions are upward-compatible with those of the VR3000™ Series
            and completely compatible with those of the VR4400 and VR4200.
            Existing applications can therefore be used as is with the VR4300, VR4305, and
            VR4310.
            The VR4300, VR4305, and VR4310 have the following features:
                • Internal operating frequency:
                      80 MHz max. (µPD30200-80),
                      100 MHz max. (µPD30200-100),
                      133 MHz max. (µPD30200-133, 30210-133),
                      167 MHz max. (µPD30210-167)
                • 64-bit architecture supporting 64-bit data processing
                • Optimized, 5-stage pipeline processing
                • High-speed translation lookaside buffer (TLB) supporting virtual
                   addresses (32 double entries)
                • Address space Physical:          32 bits
                                      Virtual:     40 bits (64-bit mode)
                                                   31 bits (32-bit mode)
                • Supports single-precision and double-precision floating-point
                   operations
                • On-chip cache memories
                                      Instruction: 16 KB
                                      Data:        8 KB
                • Employs a write-back cache system → fewer store operations via the
                   system bus
                • 32-bit external bus interface facilitating system development
                • Multiplies external operating frequency (input clock and bus
                   interface) to create internal operating frequency.
                   The multiplier is selected when power is applied
                                      (µPD30200-80: ×1, ×2, or ×3)
                                      (µPD30200-100: ×1.5, ×2, or ×3)
                                      (µPD30200-133: ×2, ×3, or ×4)
                                      (µPD30210-133: ×2, ×2.5, ×3, or ×4)
                                      (µPD30210-167: ×2, ×2.5, ×3, ×4, ×5, or ×6)
                    •   Write buffer
                    •   Low power mode (µPD30200-80, 30200-100 only)
                        Reduces internal and system bus clocks to 1/4 of normal level. Also
                        reduces power consumption
                    •   Software-compatible with VR4400 and VR4200 and upward-compatible
                        with VR3000 Series
                    •   Supply voltage: 3.3 V ± 0.3 V (µPD30200-80, 30200-100), 3.0 to 3.5 V
                        (µPD30200-133, 30210-×××)
(Internal block diagram; surviving labels: CP0, TLB, Pipeline Control)
*1. Selectable with the 100 MHz model only (With the 133 MHz model, this setting is reserved.)
 2. Selectable with the 133 MHz model only (With the 100 MHz model, this setting is reserved.)
 3. Selectable with the 167 MHz model only (With the 133 MHz model, this setting is reserved.)
                 If the RP bit of the Status register is set to 1 during operation, the frequencies of
                 the PClock and SClock can be reduced to 1/4 of the normal frequency*. Because
                 the PLL (Phase-Locked Loop) technique is employed, the skew (phase difference)
                 between the external clock and internal operation clock can be minimized.
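
The 1/4 reduction described above can be sketched in C as follows. This is only an illustration: the bit position of RP (bit 27) is an assumption taken from the usual MIPS III Status register layout and is not stated in this section, and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: RP is bit 27 of the CP0 Status register. */
#define STATUS_RP_BIT  27u
#define STATUS_RP_MASK (UINT32_C(1) << STATUS_RP_BIT)

/* Returns the clock frequency (in kHz) that applies for a given
 * Status register value: 1/4 of normal when RP = 1, else normal. */
static uint32_t effective_clock_khz(uint32_t status, uint32_t normal_khz)
{
    return (status & STATUS_RP_MASK) ? normal_khz / 4u : normal_khz;
}
```

For example, with a normal PClock of 100 MHz (100000 kHz), setting RP would give 25000 kHz under this model.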
            Coprocessor 0 (CP0) has the memory management unit (MMU) and handles
            exception processing. The MMU handles address translation and checks memory
            accesses that occur between different memory segments (user, supervisor, or
            kernel). The translation lookaside buffer (TLB) is used to translate virtual to
            physical addresses.
            Data Cache is a direct-mapped, virtually-indexed and physically-tagged write-
            back cache. The capacity is 8 KB.
            Instruction Address calculates the effective address of the next instruction to be
            fetched. It contains the incrementer for the Program Counter (PC), the target
            address adder, and the conditional branch address selector.
            Pipeline Control ensures that the instruction pipeline operates properly
            when a pipeline stall or an exception occurs.
            Floating-Point Registers (bits 63 to 0):
                        r0, r1, r2, ..., r29, r30, r31
            Load/Link Register (bit 0):
                        LLbit
            Floating-Point Control Registers (bits 31 to 0):
                        r0  = Implementation/Revision
                        r31 = Control/Status
                  The VR4300 processor has no Program Status Word (PSW) register as such; this
                  is covered by the Status and Cause registers incorporated within the System
                  Control Coprocessor (CP0). For CP0 registers, refer to 1.4.5 System Control
                  Coprocessor (CP0).
                                  31        26 25        21 20        16 15                                   0
             I-Type (Immediate)        op           rs           rt                immediate
                                  31        26 25                                                             0
             J-Type (Jump)             op                              target
                                  31        26 25        21 20        16 15        11 10        6 5           0
             R-Type (Register)         op       rs           rt               rd           sa         funct
             The instruction set can be further divided into the following groupings:
                 •    Load and Store instructions move data between memory and general
                      purpose registers. They are all immediate (I-type) instructions, since
                      the only addressing mode supported is base register plus 16-bit,
                      signed immediate offset.
                 •    Computational instructions perform arithmetic, logical, shift,
                      multiply, and divide operations on values in registers. They include
                      register (R-type, in which both the operands and the result are stored
                      in registers) and immediate (I-type, in which one operand is a 16-bit
                      signed immediate value) formats.
                 •    Jump and Branch instructions change the control flow of a program.
                      Jumps go either to an address formed by combining a 26-bit target
                      address with the high-order bits of the Program Counter (J-type
                      format) or to an address held in a register (R-type format).
                      Branches go to an address given by a 16-bit offset relative to
                      the program counter (I-type). Jump And Link instructions save
                      their return address in register 31.
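
The three formats above share one 32-bit layout, so a single routine can extract every field. The following C sketch shows this using the bit positions from the format diagram; the type and function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* All fields that can appear in an I-, J-, or R-type instruction word. */
typedef struct {
    uint32_t op;        /* bits 31..26 (all formats)  */
    uint32_t rs;        /* bits 25..21 (I- and R-type) */
    uint32_t rt;        /* bits 20..16 (I- and R-type) */
    uint32_t rd;        /* bits 15..11 (R-type)        */
    uint32_t sa;        /* bits 10..6  (R-type)        */
    uint32_t funct;     /* bits 5..0   (R-type)        */
    uint32_t immediate; /* bits 15..0  (I-type)        */
    uint32_t target;    /* bits 25..0  (J-type)        */
} insn_fields;

/* Splits a 32-bit instruction word into its fields. Which fields are
 * meaningful depends on the format selected by op. */
static insn_fields decode_fields(uint32_t insn)
{
    insn_fields f;
    f.op        = (insn >> 26) & 0x3Fu;
    f.rs        = (insn >> 21) & 0x1Fu;
    f.rt        = (insn >> 16) & 0x1Fu;
    f.rd        = (insn >> 11) & 0x1Fu;
    f.sa        = (insn >> 6)  & 0x1Fu;
    f.funct     = insn & 0x3Fu;
    f.immediate = insn & 0xFFFFu;
    f.target    = insn & 0x03FFFFFFu;
    return f;
}
```

The routine does not interpret the opcode; a caller would look at op (and funct for R-type words) to decide which of the extracted fields apply.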
                 Higher         Word
                 Address       Address        31        24 23        16 15        8 7         0
                                 12                12           13           14        15
                                  8                 8            9           10        11
                                  4                 4            5            6         7
                 Lower            0                 0            1            2         3
                 Address
                 Higher         Word
                 Address       Address        31        24 23        16 15        8 7         0
                                 12                15           14           13        12
                                  8                11           10            9         8
                                  4                 7            6            5         4
                 Lower            0                 3            2            1         0
                 Address
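
The two byte-ordering tables above can be summarized by one small calculation: where a byte sits inside its 32-bit word. The first table, in which byte 0 occupies bits 31 to 24, corresponds to big-endian ordering, and the second to little-endian ordering. The C sketch below is illustrative only, and the function name is hypothetical.

```c
#include <stdint.h>

/* Returns the bit position of the least significant bit of the byte at
 * byte address addr within its 32-bit word.
 * big_endian != 0: byte 0 of the word is in bits 31..24.
 * big_endian == 0: byte 0 of the word is in bits 7..0. */
static unsigned byte_lane_shift(uint32_t addr, int big_endian)
{
    unsigned offset = addr & 3u;   /* byte position within the word */
    return big_endian ? 8u * (3u - offset) : 8u * offset;
}
```

For byte address 13, this gives bit 16 in big-endian ordering (bits 23 to 16) and bit 8 in little-endian ordering (bits 15 to 8), matching the rows for word address 12 in the tables.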
The CPU uses byte addressing for halfword, word, and doubleword accesses with
the following alignment constraints:
    •    Halfword accesses must be aligned on an even byte boundary (0, 2,
         4...).
    •    Word accesses must be aligned on a byte boundary divisible by four
         (0, 4, 8...).
    •    Doubleword accesses must be aligned on a byte boundary divisible
         by eight (0, 8, 16...).
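
The alignment constraints above amount to checking that the low-order address bits are zero. A minimal C sketch, assuming an access size of 2, 4, or 8 bytes (the function name is hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if addr is aligned for an access of size_bytes.
 * size_bytes must be a power of two: 2 (halfword), 4 (word),
 * or 8 (doubleword). */
static bool is_aligned(uint64_t addr, unsigned size_bytes)
{
    return (addr & (uint64_t)(size_bytes - 1u)) == 0;
}
```

Byte address 6, for instance, is a valid halfword address but not a valid word or doubleword address.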
The following special instructions load and store words that are not aligned on
4-byte (word) or 8-byte (doubleword) boundaries:
         LWL               LWR               SWL                SWR
         LDL               LDR                  SDL             SDR
These instructions are always used in pairs to access data not aligned on a
boundary. Accessing such data requires one additional PCycle compared with
accessing data aligned on a boundary.
Figure 1-8 illustrates how a misaligned word at byte address 3 is accessed in
big-endian and little-endian byte order.
    Big-Endian
                      31        24 23       16 15         8 7       0
    Higher Address         4            5            6
    Lower Address                                                3

    Little-Endian
                      31        24 23       16 15         8 7       0
    Higher Address                      6            5          4
    Lower Address          3
                   Figure 1-8    Misaligned Word Addressing
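
The big-endian case of Figure 1-8 can be modeled in C as a pair of partial loads. This is a simplified sketch, not a definition of the instructions: it models 32-bit results only (no sign extension to 64 bits), it assumes big-endian byte order, and the function names and the byte-array memory model are hypothetical.

```c
#include <stdint.h>

/* Reads the aligned 32-bit word containing addr, big-endian byte order. */
static uint32_t read_word_be(const uint8_t *mem, uint32_t addr)
{
    const uint8_t *p = mem + (addr & ~UINT32_C(3));
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Model of LWL: fills the left (high-order) part of reg with the bytes
 * from addr up to the end of its word; the low-order bytes of reg are kept. */
static uint32_t lwl_be(const uint8_t *mem, uint32_t addr, uint32_t reg)
{
    unsigned shift = 8u * (addr & 3u);
    uint32_t keep = (UINT32_C(1) << shift) - 1u;  /* low bytes left unchanged */
    return (read_word_be(mem, addr) << shift) | (reg & keep);
}

/* Model of LWR: fills the right (low-order) part of reg with the bytes
 * from the start of the word up to addr; the high-order bytes are kept. */
static uint32_t lwr_be(const uint8_t *mem, uint32_t addr, uint32_t reg)
{
    unsigned shift = 8u * (3u - (addr & 3u));
    uint32_t loaded = UINT32_MAX >> shift;        /* bits written by LWR */
    return (reg & ~loaded) | (read_word_be(mem, addr) >> shift);
}

/* Loads a word from any byte address using the LWL/LWR pair. */
static uint32_t load_unaligned_word_be(const uint8_t *mem, uint32_t addr)
{
    uint32_t reg = 0;
    reg = lwl_be(mem, addr, reg);       /* bytes up to the word boundary   */
    reg = lwr_be(mem, addr + 3u, reg);  /* remaining bytes of the next word */
    return reg;
}
```

With byte address 3, the first step picks up byte 3 from the lower word and the second step picks up bytes 4 to 6 from the higher word, as in the big-endian half of the figure.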
Register    No.     Register    No.
Index        0      Config      16
Random       1      LLAddr      17
EntryLo0     2      WatchLo     18
EntryLo1     3      WatchHi     19
Context      4      XContext    20
PageMask     5      -           21
Wired        6      -           22
-            7      -           23
BadVAddr     8      -           24
Count        9      -           25
Status      12      TagLo       28
Cause       13      TagHi       29
EPC         14      ErrorEPC    30
PRId        15      -           31
* These registers are defined to maintain compatibility with the VR4200, and not used with the
  hardware of the VR4300.
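
For software that refers to CP0 registers by number, the table above could be captured as a C enumeration along the following lines. The identifier names are hypothetical; only the register numbers that appear in the table are included, so numbers that have no name there are left out.

```c
/* CP0 register numbers as listed in the table above. */
enum cp0_register {
    CP0_INDEX    = 0,
    CP0_RANDOM   = 1,
    CP0_ENTRYLO0 = 2,
    CP0_ENTRYLO1 = 3,
    CP0_CONTEXT  = 4,
    CP0_PAGEMASK = 5,
    CP0_WIRED    = 6,
    CP0_BADVADDR = 8,
    CP0_COUNT    = 9,
    CP0_STATUS   = 12,
    CP0_CAUSE    = 13,
    CP0_EPC      = 14,
    CP0_PRID     = 15,
    CP0_CONFIG   = 16,
    CP0_LLADDR   = 17,
    CP0_WATCHLO  = 18,
    CP0_WATCHHI  = 19,
    CP0_XCONTEXT = 20,
    CP0_TAGLO    = 28,
    CP0_TAGHI    = 29,
    CP0_ERROREPC = 30
};
```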
            The VR4300 has a 5-stage instruction pipeline. This pipeline is used for floating-
            point operations as well as for integer operations. In a normal environment, the
            pipeline executes one instruction in 1 cycle.
            The pipeline of the VR4300 operates at a frequency determined by the
            setting of the DivMode(1:0)* pins. For details, refer to Chapter 4 Pipeline.
Top side, pins 120 to 91 (pin-by-pin order not recoverable from the figure):
    ColdReset, DivMode0, DivMode1, SysCmd0 to SysCmd4, SysAD23 to SysAD26,
    PMaster, EValid, Reset, EReq, NMI, Int3, VDD (6 pins), GND (6 pins)
           VDD     1                                             90   VDD
          GND      2                                             89   GND
      SysAD22      3                                             88   Int2
      SysAD21      4                                             87   SysAD27
           VDD     5                                             86   SysAD28
          GND      6                                             85   VDD
      SysAD20      7                                             84   GND
           VDD     8                                             83   SysAD29
          VDDP     9                                             82   EOK
         GNDP      10                                            81   SysAD30
      PLLCap0      11                                            80   VDD
      PLLCap1      12                                            79   GND
          VDDP     13                                            78   PValid
         GNDP      14                                            77   SysAD31
 VDD (DivMode2)    15                                            76   VDD
   MasterClock     16                                            75   GND
          GND      17                                            74   PReq
         TClock    18                                            73   SysAD0
           VDD     19                                            72   VDD
          GND      20                                            71   GND
       SyncOut     21                                            70   SysAD1
      SysAD19      22                                            69   SysAD2
           VDD     23                                            68   VDD
         SyncIn    24                                            67   GND
          GND      25                                            66   SysAD3
      SysAD18      26                                            65   JTDO
      SysAD17      27                                            64   SysAD4
            Int4   28                                            63   JTDI
           VDD     29                                            62   VDD
          GND      30                                            61   GND
Bottom side, pins 31 to 60 (names in the order they appear in the figure):
    GND, VDD, SysAD16, SysAD15, GND, VDD, SysAD14, SysAD13, GND, VDD,
    SysAD12, SysAD11, GND, VDD, SysAD10, Int0, SysAD9, GND, VDD, SysAD8,
    SysAD7, JTMS, GND, VDD, SysAD6, SysAD5, JTCK, Int1, GND, VDD
PIN NAME
 ColdReset          : Cold Reset
 DivMode (1:0)*     : Divide Mode
 EOK                : External OK
 EReq               : External Request
 EValid             : External Valid
 Int (4:0)          : Interrupt Request
 JTCK               : JTAG Clock Input
 JTDI               : JTAG Data In
 JTDO               : JTAG Data Out
 JTMS               : JTAG Command Signal
 MasterClock        : Master Clock
 NMI                : Non-maskable Interrupt Request
 PLLCap (1:0)       : Phase Locked Loop Capacitance
 PMaster            : Processor Master
 PReq               : Processor Request
 PValid             : Processor Valid
 Reset              : Reset
 SyncIn             : Synchronization Clock Input
 SyncOut            : Synchronization Clock Output
 SysAD (31:0)       : System Address/Data Bus
 SysCmd (4:0)       : System Command Data ID Bus
 TClock             : Transmit Clock
 VDD                : Power Supply
 GND                : Ground
 VDDP               : VDD for PLL
 GNDP               : GND for PLL
• VR4300
                                             µPD30200-100
                                              DivMode       MasterClock : PClock : TClock
                                               (1 : 0)   Frequency ratio Example [MHz]
                                                 00           RFU                    –
                                                 01          2:3:2           66.7 : 100 : 66.7
                                                 10          1:2:1             50 : 100 : 50
                                                 11          1:3:1           33.3 : 100 : 33.3
• VR4305
                                             mPD30200-80
                                               DivMode           MasterClock : PClock : TClock
                                                (1 : 0)     Frequency ratio      Example [MHz]
                                                  00              1:1:1           66.7 : 66.7 : 66.7
                                                  01               RFU                    –
                                                  10              1:2:1             40 : 80 : 40
                                                  11              1:3:1             20 : 60 : 20
• VR4310
                                             mPD30210-133
                                               DivMode           MasterClock : PClock : TClock
                                                (2 : 0)     Frequency ratio      Example [MHz]
                                                  000             1:5:1           26.7 : 133 : 26.7
                                                  001             1:6:1           22.2 : 133 : 22.2
                                                  010              RFU                    –
                                                  011             1:3:1           33.3 : 100 : 33.3
                                                  100             1:4:1           33.3 : 133 : 33.3
                                                  101              RFU                    –
                                                  110             1:2:1             50 : 100 : 50
                                                  111             1:3:1           33.3 : 100 : 33.3
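
As a worked illustration of how a DivMode setting turns MasterClock into PClock, the C sketch below encodes the MasterClock : PClock ratios of the µPD30200-100 table above and applies them to an input frequency. The type, table, and function names are hypothetical, and frequencies are given in kHz to keep the arithmetic in integers.

```c
#include <stdint.h>

/* MasterClock : PClock ratio for one DivMode setting.
 * master == 0 marks a setting listed as RFU. */
typedef struct {
    uint32_t master;
    uint32_t pclock;
} clock_ratio;

/* Ratios copied from the uPD30200-100 table, indexed by DivMode(1:0). */
static const clock_ratio upd30200_100_ratio[4] = {
    {0, 0},   /* 00: RFU   */
    {2, 3},   /* 01: 2 : 3 */
    {1, 2},   /* 10: 1 : 2 */
    {1, 3},   /* 11: 1 : 3 */
};

/* Returns PClock in kHz for the given DivMode(1:0) value and MasterClock
 * frequency, or 0 if the setting is RFU. */
static uint32_t pclock_khz(unsigned divmode, uint32_t masterclock_khz)
{
    clock_ratio r = upd30200_100_ratio[divmode & 3u];
    if (r.master == 0) {
        return 0;
    }
    return (uint32_t)((uint64_t)masterclock_khz * r.pclock / r.master);
}
```

With DivMode(1:0) = 10 and a 50 MHz MasterClock, the result is 100000 kHz, which agrees with the 50 : 100 : 50 example row.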
        This chapter is an overview of the central processing unit (CPU) instruction set;
        refer to Chapter 16 CPU Instruction Set Details for detailed descriptions of
        individual CPU instructions.
        Because FPU instructions depend on the structure of the coprocessor, they
        are described in Chapter 7 Floating-Point Operations and Chapter 17 FPU
        Instruction Set Details.
               I-Type (Immediate)
                   31         26 25        21 20        16 15                               0
                         op           rs           rt            immediate
               J-Type (Jump)
                   31         26 25                                                         0
                         op                                 target
               R-Type (Register)
                   31         26 25        21 20        16 15        11 10        6 5       0
                         op           rs           rt           rd           sa     funct
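The three formats share the same bit boundaries, so a single routine can extract every field. The following is a minimal sketch based only on the bit positions shown above; `decode_fields` is a hypothetical helper name, and which fields are meaningful depends on the format of the instruction being decoded.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the fields of all three formats."""
    return {
        "op": (word >> 26) & 0x3F,        # bits 31..26, all formats
        "rs": (word >> 21) & 0x1F,        # bits 25..21, I-type and R-type
        "rt": (word >> 16) & 0x1F,        # bits 20..16, I-type and R-type
        "rd": (word >> 11) & 0x1F,        # bits 15..11, R-type
        "sa": (word >> 6) & 0x1F,         # bits 10..6,  R-type
        "funct": word & 0x3F,             # bits 5..0,   R-type
        "immediate": word & 0xFFFF,       # bits 15..0,  I-type
        "target": word & 0x03FFFFFF,      # bits 25..0,  J-type
    }


# ADD r3, r1, r2 is an R-type word: op = 0, rs = 1, rt = 2, rd = 3, funct = 0x20.
fields = decode_fields(0x00221820)
print(fields["rs"], fields["rt"], fields["rd"], hex(fields["funct"]))
```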
                       Table 3-1    Number of Cycles for Load and Store Instruction Delay Slot
                                              Instruction        PCycles Required
                                      Load                                1
                                      Store                               1
                   When an integer multiply or divide instruction is executed, the VR4300 stalls the
                   entire pipeline. The number of processor cycles (PCycles) stalled at this time is
                   shown below.
The following common limits are applied to Tables 3-15 and 3-16.
            Branch Address
                   The branch addresses of all the branch instructions are calculated by adding a
                   16-bit offset (shifted 2 bits to the left and sign-extended to 64 bits) to the
                   address of the instruction in the delay slot. All the branch instructions
                   generate one delay slot.
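The branch address calculation described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the delay slot is taken to be the instruction immediately after the branch (branch address + 4).

```python
MASK64 = (1 << 64) - 1


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(branch_pc, offset16):
    """Add the sign-extended offset, shifted left 2 bits, to the delay slot address."""
    delay_slot_pc = branch_pc + 4
    return (delay_slot_pc + (sign_extend16(offset16) << 2)) & MASK64


print(hex(branch_target(0x1000, 3)))       # forward branch  -> 0x1010
print(hex(branch_target(0x1000, 0xFFFF)))  # offset -1       -> 0x1000
```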
                  The following symbols in the instruction format in Table 3-15 through Table 3-21
                  are special.
                          REGIMM :       op code
                          Sub    :       sub operation code
                          CO     :       sub operation identifier
                          BC     :       BC sub operation code
                          br     :       branch condition identifier
                          cofun  :       coprocessor function area
                          op     :       operation code
4.1 General
                  The VR4300 uses a 5-stage pipeline. The pipeline is usually controlled by the
                  pipeline clock that is determined by the value of the DivMode(1:0)* pins. This
                  pipeline clock is called PClock and one cycle of it is called PCycle. Each stage of
                  the pipeline is executed in 1 PCycle, so at least 5 PCycles are required to execute
                  an instruction. Each PCycle has two phases, F1 and F2, as shown in Figure 4-1.
                  If the necessary data is not in the cache and must be fetched from the
                  main memory, more cycles are necessary. When the pipeline flows smoothly, five
                  instructions are executed simultaneously.
                  * In VR4300 and VR4305. In VR4310, DivMode(2:0).
                  [Figure 4-1: PClock timing. Each of the five stages (IC, RF, EX, DC, WB)
                  occupies one PCycle, and each PCycle consists of phases F1 and F2.]
         Figure 4-2 outlines the pipeline. The horizontal rows in this figure indicate the
         execution processes of instructions, and the vertical columns indicate the five
         processes executed at the same time.
         [Figure 4-2: 5-deep pipeline. Each row is one instruction and each column is one PCycle.]

         IC RF EX DC WB
            IC RF EX DC WB
               IC RF EX DC WB
                  IC RF EX DC WB
                     IC RF EX DC WB
                     ^
                     Current CPU Cycle
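The overlap outlined in Figure 4-2 can be reproduced with a small model. The following is a sketch under the assumption that the pipeline never stalls and that one instruction enters the IC stage per PCycle; the function names are hypothetical.

```python
STAGES = ["IC", "RF", "EX", "DC", "WB"]


def pipeline_chart(num_instructions):
    """Return one text row per instruction; columns are consecutive PCycles."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = []
        for cycle in range(total_cycles):
            stage = cycle - i  # instruction i enters IC in PCycle i
            row.append(STAGES[stage] if 0 <= stage < len(STAGES) else "  ")
        rows.append(" ".join(row).rstrip())
    return rows


def active_stages(cycle, num_instructions):
    """Return the stages occupied in a given PCycle, oldest instruction first."""
    return [STAGES[cycle - i] for i in range(num_instructions)
            if 0 <= cycle - i < len(STAGES)]


for line in pipeline_chart(5):
    print(line)
print(active_stages(4, 5))  # the PCycle in which all five stages are occupied
```

In PCycle 4 of this model every stage holds a different instruction, which corresponds to the column marked as the current CPU cycle in the figure.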
[Figure: pipeline activities by stage and phase for instruction fetch, computational, and branch operations. The mnemonics are described in the table that follows.]
Cycle   Begins During   Mnemonic   Descriptions
        this Phase
 IC     F1              -          -
        F2              ICF        Instruction Cache Fetch
                        ITLB       Instruction micro-TLB read
 RF     F1              ITC        Instruction cache Tag Check
        F2              RFR        Register File Read
                        IDEC       Instruction DECode
                        IVA        Instruction Virtual Address calculation
 EX     F1              BCMP       Branch Compare
                        ALU        Arithmetic Logic operation
                        DVA        Data Virtual Address calculation
 DC     F1              DCR        Data Cache Read
                        DTLB       Data joint-TLB read
        F2              LA         Load data Alignment
                        DTC        Data cache Tag Check
 WB     F1              DCW        Data Cache Write
                        RFW        Register File Write
        F2              -          -
[Figure: branch delay. The branch instruction passes through IC, RF, EX, DC, and WB, followed by one branch delay slot.]
            Add Instruction
            ADD rd,rs,rt
                   RF stage   In phase 1 of the RF stage, the cache index is compared with the
                              page frame number from the ITLB and the cache data is read out.
                              The cache hit/miss signal is valid late in phase 1 of the RF stage,
                              and the virtual PC is incremented by 4 so that the next
                              instruction can be fetched.
                              During phase 2, the 2-port register file is addressed with the rs
                              and rt fields, and the register data is valid at the register file
                              output. At the same time, bypass multiplexers select inputs from
                              either the EX- or DC-stage output in addition to the register file
                              output, depending on the need for an operand bypass.
                   EX stage   The ALU controls are set to do an A+B operation. The operands
                              flow into the ALU inputs, and the ALU operation is started. The
                              result of the ALU operation is latched into the ALU output latch
                              during phase 2.
                   DC stage   This stage is a NOP for this instruction. The data from the
                              output of the EX stage (the ALU) is moved into the output latch
                              of the DC.
                   WB stage   During phase 1, the WB latch feeds the data to the inputs of the
                              register file, which is addressed by the rd field. The file write
                              strobe is enabled. By the end of phase 1, the data is written into
                              the register file.
                    DC stage      The PC+8 value is moved from the Link output latch to the
                                  output latch of the DC pipeline stage.
                   RF stage     During phase 2, the register file is addressed with the rs and rt
                                fields and the contents of these registers are placed in the register
                                file output latch.
                      EX stage   During phase 1, the bypass multiplexers select inputs from
                                 the RF-, EX- or DC-stage output latch, depending on the need
                                 for an operand bypass. The ALU controls are set to do an A - B
                                 operation. The operands flow into the ALU inputs, and the ALU
                                 operation is started.
                                 The result of the ALU operation is latched into the ALU output
                                 latch during phase 2.
                      DC stage   The sign bits of the operands and of the ALU output latch are
                                 checked to determine whether a less-than condition is true. If
                                 this condition is true, a Trap exception occurs. This, as with all
                                 pipeline exceptions, implies a 2-cycle stall. The PC register is
                                 loaded with the value of the exception vector, and the following
                                 instructions in earlier pipeline stages are killed.
                      WB stage   If the less-than condition was met in the DC stage, the
                                 exception code is set in the ExCode field of the Cause register.
                                 The PC value of this instruction is stored in the EPC register
                                 and the BD bit is updated, both according to the contents of the
                                 EXL bit of the Status register. If the less-than condition was not
                                 met in the DC stage, no activity occurs in the WB stage.
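The DC-stage check above says only that the sign bits of the operands and of the A - B result are examined. One common way to derive a signed less-than condition from those three bits is sketched below; this is an assumption about the logic, not a description of the actual circuit, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_bit(value):
    """Return bit 63 of a 64-bit value."""
    return (value >> 63) & 1


def less_than(a, b):
    """Signed A < B decided from the sign bits of A, B, and (A - B) mod 2**64."""
    a &= MASK64
    b &= MASK64
    diff = (a - b) & MASK64  # what the ALU output latch would hold
    if sign_bit(a) != sign_bit(b):
        # Signs differ: the subtraction may overflow, so the operand signs decide.
        return sign_bit(a) == 1
    # Signs agree: the subtraction cannot overflow, so the result sign decides.
    return sign_bit(diff) == 1


print(less_than(-1 & MASK64, 1))  # True: -1 < 1
print(less_than(5, 3))            # False
```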
                   RF stage   Same as the RF stage for the ADD instruction. Note that the base
                              field is in the same position as the rs field.
                   EX stage   Refer to the EX stage for the ADD instruction. For LW, the
                              inputs to the ALU come from GPR[base] through the bypass
                              multiplexer and from the sign-extended offset field. The result of
                              the ALU operation that is latched into the ALU output latch in
                              phase 2 represents the effective virtual address of the operand
                              (DVA).
                   DC stage   The data cache is accessed in parallel with the TLB, and the
                              cache tag field is compared with the Page Frame Number (PFN)
                              field of the TLB entry. After passing through the load aligner,
                              aligned data is placed in the DC output latch during phase 2.
                   WB stage   During phase 1, the cache read data is written into the register
                              file location addressed by the rt field.
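The EX-stage address calculation for LW (GPR[base] plus the sign-extended offset field) can be sketched as follows. This is an illustration under the assumption of 64-bit wraparound arithmetic; `gpr` is a hypothetical list standing in for the general-purpose registers.

```python
MASK64 = (1 << 64) - 1


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(gpr, base, offset16):
    """Return the data virtual address (DVA) for a load: GPR[base] + offset."""
    return (gpr[base] + sign_extend16(offset16)) & MASK64


gpr = [0] * 32
gpr[4] = 0x1000
print(hex(effective_address(gpr, 4, 0xFFFC)))  # offset -4 -> 0xffc
```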
                      WB stage    If there was a cache hit, the content of the store data output latch
                                  is written into the data cache at the appropriate word location.
                                  Note that all store instructions use the data cache for two
                                  consecutive PCycles. If the following instruction requires use of
                                  the data cache, the pipeline is stalled for one PCycle to complete
                                  the write of the aligned store data.
           Faults
             Software:  Exceptions  (Abort)
             Hardware:  Interlocks  (Stalls)
           At each cycle, exception and interlock conditions are checked for all active
           instructions.
           Because each exception or interlock condition corresponds to a particular pipeline
           stage, a condition can be traced back to the particular instruction in the exception/
           interlock stage, as shown in Figure 4-12. For instance, an LDI Interlock is raised
           in the execution (EX) stage.
           Tables 4-2 and 4-3 describe the pipeline interlocks and exceptions listed in Figure
           4-12.
                      Each pipeline stage occupies one PCycle (phases F1 and F2).

                                        Pipeline Stage
                      State        IC    RF      EX      DC      WB
                      Interlock          ITM     LDI     DCM     CP0I
                                         ICB     MCI     DCB
                                                         COp
                      Exceptions         IADE    SYSC    RST
                                         ITLB    BRPT    NMI
                                         IBE     CPU     OVFL
                                                 RSVD    TRAP
                                                         FPE
                                                         DADE
                                                         DTLB
                                                         WAT
                                                         INTR
                                                         DBE
                Remark      The conditions of the exceptions are shown starting from the
                            exception with the highest priority.
      Figure 4-12   Correspondence of Pipeline Stage to Interlock and Exception Condition
 Exception                                   Description
   IADE              Instruction Address Error Exception
   ITLB              Instruction TLB Exception
    IBE              Instruction Bus Error Exception
   SYSC              SYSCALL Instruction Exception
   BRPT              Breakpoint Instruction Exception
    CPU              Coprocessor Unusable Exception
   RSVD              Reserved Instruction Exception
    RST              External Reset Exception
    NMI              External NMI Exception
   OVFL              Integer Overflow Exception
   TRAP              TRAP Instruction Exception
    FPE              Floating-point Exception
   DADE              Data Address Error Exception
   DTLB              Data TLB Exception
   WAT               Reference to Watch Address Exception
   INTR              Interrupt Exception
    DBE              Data Bus Error Exception
Interlock                               Description
  ITM           Instruction TLB Miss
  ICB           Instruction Cache Busy
  LDI           Load Interlock
  MCI           Multi-cycle Interlock
 DCM            Data Cache Miss
  DCB           Data Cache Busy
  COp           Cache Op
 CP0I           CP0 Bypass Interlock
[Figure: Instruction TLB Miss (ITM) interlock. The instruction that misses is held in the RF stage during the stall cycles (ITLB miss, JTLB access, ITLB update), and the following instruction is held in the IC stage until the pipeline runs again.]
[Figure: Instruction Cache Busy (ICB) interlock. The instruction is held in the RF stage during the stall cycles, and the following instruction is held in the IC stage.]
[Figure: Multi-cycle Interlock (MCI). Mult A,B is held in the EX stage during the multiple-cycle instruction stall; the following Read MultHi is held in the RF stage and Read MultLo in the IC stage.]
[Figure: Load Interlock (LDI). Load A and Load B are followed by Add A,B. The LDI is detected when Add A,B is in the EX stage; the pipeline stalls for one cycle with Add A,B held in EX, and the load data is then supplied through the bypass.]
[Figure: Load Interlock (LDI) combined with a D-cache miss. The diagram marks where the LDI is detected, the D-cache miss, the D-cache update, and the bypass of the load data.]
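[Figure: exception priority at the current CPU cycle. The instruction furthest along the pipeline has the higher priority, and the most recently fetched instruction (in the IC stage) has the lower priority.]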
                   In the case of multiple exception requests from the same pipeline stage, the
                   highest-priority exception is processed first. The priorities of the
                   instruction-dependent exceptions and interlocks are shown in the following
                   sections.
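The selection rule can be sketched as a lookup over the exception lists of Figure 4-12, which are given in descending priority within each stage. Ordering the stages themselves (DC before EX before RF, so that the instruction furthest along the pipeline wins) is an assumption made for this illustration, and the names below are hypothetical.

```python
# Exceptions per stage, copied from Figure 4-12 (highest priority first).
EXCEPTIONS_BY_STAGE = {
    "RF": ["IADE", "ITLB", "IBE"],
    "EX": ["SYSC", "BRPT", "CPU", "RSVD"],
    "DC": ["RST", "NMI", "OVFL", "TRAP", "FPE", "DADE", "DTLB", "WAT", "INTR", "DBE"],
}

# Assumption: an instruction in a later stage is older and is served first.
STAGE_ORDER = ["DC", "EX", "RF"]


def select_exception(pending):
    """Return (stage, exception) for the request to process first, or None."""
    for stage in STAGE_ORDER:
        for name in EXCEPTIONS_BY_STAGE[stage]:
            if name in pending:
                return stage, name
    return None


print(select_exception({"DBE", "OVFL"}))   # same stage: OVFL is listed before DBE
print(select_exception({"IADE", "TRAP"}))  # different stages: DC is served first
```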
4.7.9 Bypassing
             In some cases, data and conditions produced in the EX, DC and WB stages of the
             pipeline are made available to the EX stage (only) through the bypass datapath.
             Operand bypass allows an instruction in the EX stage to continue without having
             to wait for data or conditions to be written to the register file at the end of the WB
             stage. Instead, the Bypass Control Unit ensures data and conditions from later
             pipeline stages are available at the appropriate time for instructions earlier in the
             pipeline.
             The Bypass Control Unit also controls the source and destination register
             addresses supplied from the register file.
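Operand bypass can be modeled as a multiplexer that prefers the most recent producer of a register over the register file. The following is a minimal sketch, not the actual Bypass Control Unit logic: the latch representation as `(destination register, value)` pairs and the preference order EX, DC, WB are assumptions made for illustration.

```python
def read_operand(reg, regfile, ex_result, dc_result, wb_result):
    """Return the value an instruction entering EX should see for register reg.

    Each *_result is either None or a (dest_reg, value) pair held in the
    output latch of that stage (hypothetical representation).
    """
    if reg == 0:
        return 0  # r0 always reads as zero in the MIPS architecture
    # Assumed order: the EX latch holds the youngest result, WB the oldest.
    for latch in (ex_result, dc_result, wb_result):
        if latch is not None and latch[0] == reg:
            return latch[1]
    return regfile[reg]  # no producer in flight: use the register file


regfile = [0] * 32
regfile[5] = 111
print(read_operand(5, regfile, (5, 222), (5, 333), None))  # youngest result wins
```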
      * There are virtual-to-physical address translations that occur outside of the TLB. For example,
        addresses in the kseg0 and kseg1 spaces are unmapped translations. In these spaces the physical
        address is derived by subtracting the base address of the space from the virtual address.
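The unmapped translation in the note above amounts to a single subtraction. The sketch below assumes 32-bit mode and the kseg0 and kseg1 ranges listed in the Kernel mode segment table (0x8000 0000 and 0xA000 0000, 512 MB each); the function name is hypothetical.

```python
KSEG0_BASE = 0x80000000
KSEG1_BASE = 0xA0000000
SEGMENT_SIZE = 0x20000000  # 512 MB


def unmapped_to_physical(va32):
    """Translate a kseg0/kseg1 virtual address; return None for other segments."""
    if KSEG0_BASE <= va32 < KSEG0_BASE + SEGMENT_SIZE:
        return va32 - KSEG0_BASE
    if KSEG1_BASE <= va32 < KSEG1_BASE + SEGMENT_SIZE:
        return va32 - KSEG1_BASE
    return None  # the address is TLB mapped or outside these segments


print(hex(unmapped_to_physical(0x80001000)))  # kseg0 -> 0x1000
print(hex(unmapped_to_physical(0xBFC00000)))  # kseg1 -> 0x1fc00000
```

Both segments map onto the same physical range, so a kseg0 address and the kseg1 address 0x2000 0000 above it refer to the same physical location.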
1. The virtual page number (VPN, the high-order bits of the virtual address (VA)) is
   compared with the corresponding area in the TLB.
[Figure: the virtual address consists of the ASID, VPN, and Offset fields; the TLB supplies the physical address.]
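32-bit mode virtual address, bit boundaries:   39-32 | 31-29 | 28-12 | 11-0

32-bit mode virtual address, bit boundaries:   39-32 | 31-29 | 28-24 | 23-0
      Field widths: 8, 8, 24
                         8 bits = 256 pages
                            Virtual Address with 256 (2^8) 16 MB pages

64-bit mode virtual address, field widths:     8, 2, 22, 28, 12

      Bits     71-64    63-62    61-40      39-24    23-0
      Field    ASID              0 or -1    VPN      Offset
      Width      8        2        22        16       24
                                             16 bits = 64K pages
                                  Virtual Address with 64K (2^16) 16 MB pages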
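Splitting a 64-bit mode virtual address into its fields follows directly from the widths above. This is a sketch under the assumption that the VPN and offset together occupy bits 39 to 0, as in the 16 MB layout (16 + 24 bits) and the 28 + 12 bit layout; `region` is a hypothetical label for bits 63 and 62, which the figure leaves unnamed.

```python
def split_virtual_address(va, offset_bits):
    """Return (region, vpn, offset) for a 64-bit mode virtual address.

    offset_bits is 12 for the 28 + 12 bit layout and 24 for 16 MB pages.
    """
    region = (va >> 62) & 0x3                        # bits 63..62
    vpn_bits = 40 - offset_bits                      # VPN ends at bit 39
    vpn = (va >> offset_bits) & ((1 << vpn_bits) - 1)
    offset = va & ((1 << offset_bits) - 1)
    return region, vpn, offset


va = 0x400000123456789A
print([hex(x) for x in split_virtual_address(va, 24)])  # 16 MB pages
print([hex(x) for x in split_virtual_address(va, 12)])  # 12-bit offset
```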
        32-bit*
          0x 8000 0000 to 0x FFFF FFFF                         Address Error
          0x 0000 0000 to 0x 7FFF FFFF                         useg, 2 GB, TLB Mapped

        64-bit
          0x 0000 0100 0000 0000 to 0x FFFF FFFF FFFF FFFF     Address Error
          0x 0000 0000 0000 0000 to 0x 0000 00FF FFFF FFFF     xuseg, 1 TB, TLB Mapped

                       * The VR4300 internally uses 64-bit addresses. In the Kernel mode, the
                         processor saves and restores each register to initialize the register
                         before switching the context. In the 32-bit mode, a 32-bit value is used
                         as an address, with bit 31 sign-extended to bits 32 through 63.
                         Usually, a program in the 32-bit mode does not generate invalid
                         addresses. If the context is switched and the processor enters the
                         Kernel mode, a value other than the previously sign-extended 32-bit
                         address may be stored to a 64-bit register. In this case, the program
                         in the User mode may generate invalid addresses.
                               Figure 5-4    User Mode Virtual Address Space
                     Status Register
  Address Bit                                   Segment
                       Bit Values                            Virtual Address Range        Segment Size
    Values                                       Name
                KSU EXL ERL UX
   32-bit*
     0x E000 0000 to 0x FFFF FFFF                         Address Error
     0x C000 0000 to 0x DFFF FFFF                         sseg, 0.5 GB, TLB Mapped
     0x 8000 0000 to 0x BFFF FFFF                         Address Error
     0x 0000 0000 to 0x 7FFF FFFF                         suseg, 2 GB, TLB Mapped

   64-bit
     0x FFFF FFFF E000 0000 to 0x FFFF FFFF FFFF FFFF     Address Error
     0x FFFF FFFF C000 0000 to 0x FFFF FFFF DFFF FFFF     csseg, 0.5 GB, TLB Mapped
     0x 4000 0100 0000 0000 to 0x FFFF FFFF BFFF FFFF     Address Error
     0x 4000 0000 0000 0000 to 0x 4000 00FF FFFF FFFF     xsseg, 1 TB, TLB Mapped
     0x 0000 0100 0000 0000 to 0x 3FFF FFFF FFFF FFFF     Address Error
     0x 0000 0000 0000 0000 to 0x 0000 00FF FFFF FFFF     xsuseg, 1 TB, TLB Mapped

                     * The VR4300 internally uses 64-bit addresses. In the 32-bit mode, a 32-bit
                       value with bits 32 through 63 sign-extended is used as an address.
                       Normally, the program in the 32-bit mode does not generate an invalid
                       address. However, there is a possibility that an integer overflow may occur
                       as a result of an operation of base register + offset to calculate an address.
                       The address calculated at this time is invalid, and the result is undefined.
                       Two causes of the overflow are cited below.
                  Status Register
 Address Bit        Bit Values       Segment                                     Segment
   Values         KSU EXL ERL SX      Name      Virtual Address Range              Size

 32-bit           01   0   0   0     suseg     0x0000 0000                         2 GB
 A(31) = 0                                       through 0x7FFF FFFF               (2^31 bytes)

 32-bit           01   0   0   0     sseg      0xC000 0000                         512 MB
 A(31:29) = 110                                  through 0xDFFF FFFF               (2^29 bytes)

 64-bit           01   0   0   1     xsuseg    0x0000 0000 0000 0000               1 TB
 A(63:62) = 00                                   through 0x0000 00FF FFFF FFFF     (2^40 bytes)

 64-bit           01   0   0   1     xsseg     0x4000 0000 0000 0000               1 TB
 A(63:62) = 01                                   through 0x4000 00FF FFFF FFFF     (2^40 bytes)

 64-bit           01   0   0   1     csseg     0xFFFF FFFF C000 0000               512 MB
 A(63:62) = 11                                   through 0xFFFF FFFF DFFF FFFF     (2^29 bytes)
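The Supervisor mode table above can be read as a range lookup. The following is a minimal sketch that returns the segment name for an address, or `None` where the address map shows an address error; it checks only the address ranges (not the KSU, EXL, and ERL values), and the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def supervisor_segment(va, sx):
    """Return the Supervisor mode segment name for va, or None if none matches.

    sx is the SX bit of the Status register: False for 32-bit, True for 64-bit.
    """
    if not sx:
        va &= 0xFFFFFFFF
        if va <= 0x7FFFFFFF:
            return "suseg"
        if 0xC0000000 <= va <= 0xDFFFFFFF:
            return "sseg"
        return None
    va &= MASK64
    if va <= 0x000000FFFFFFFFFF:
        return "xsuseg"
    if 0x4000000000000000 <= va <= 0x400000FFFFFFFFFF:
        return "xsseg"
    if 0xFFFFFFFFC0000000 <= va <= 0xFFFFFFFFDFFFFFFF:
        return "csseg"
    return None


print(supervisor_segment(0xC0001000, sx=False))         # sseg
print(supervisor_segment(0x4000000000001000, sx=True))  # xsseg
```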
   32-bit*
     0x E000 0000 to 0x FFFF FFFF     kseg3, 0.5 GB, TLB Mapped
     0x C000 0000 to 0x DFFF FFFF     ksseg, 0.5 GB, TLB Mapped
     0x A000 0000 to 0x BFFF FFFF     kseg1, 0.5 GB, TLB Unmapped, Uncached
     0x 8000 0000 to 0x 9FFF FFFF     kseg0, 0.5 GB, TLB Unmapped, Cacheable
     0x 0000 0000 to 0x 7FFF FFFF     kuseg, 2 GB, TLB Mapped

   64-bit
     0x FFFF FFFF E000 0000 to 0x FFFF FFFF FFFF FFFF     ckseg3, 0.5 GB, TLB Mapped
     0x FFFF FFFF C000 0000 to 0x FFFF FFFF DFFF FFFF     cksseg, 0.5 GB, TLB Mapped
     0x FFFF FFFF A000 0000 to 0x FFFF FFFF BFFF FFFF     ckseg1, 0.5 GB, TLB Unmapped, Uncached
     0x FFFF FFFF 8000 0000 to 0x FFFF FFFF 9FFF FFFF     ckseg0, 0.5 GB, TLB Unmapped, Cacheable
     0x C000 00FF 8000 0000 to 0x FFFF FFFF 7FFF FFFF     Address Error
     0x C000 0000 0000 0000 to 0x C000 00FF 7FFF FFFF     xkseg, TLB Mapped
     0x 8000 0000 0000 0000 to 0x BFFF FFFF FFFF FFFF     xkphys, TLB Unmapped
                                                          (For details, refer to Figure 5-7.)
     0x 4000 0100 0000 0000 to 0x 7FFF FFFF FFFF FFFF     Address Error
     0x 4000 0000 0000 0000 to 0x 4000 00FF FFFF FFFF     xksseg, 1 TB, TLB Mapped
     0x 0000 0100 0000 0000 to 0x 3FFF FFFF FFFF FFFF     Address Error
     0x 0000 0000 0000 0000 to 0x 0000 00FF FFFF FFFF     xkuseg, 1 TB, TLB Mapped

                    * The VR4300 internally uses 64-bit addresses. In the 32-bit mode, a 32-bit
                      value with bits 32 through 63 sign-extended is used as an address.
                      Normally, the program in the 32-bit mode uses 64-bit instructions.
                      However, there is a possibility that an integer overflow may occur as a
                      result of an operation of base register + offset to calculate an address.
                      The address calculated at this time is invalid, and the result is undefined.
                      Two causes of the overflow are cited below.
                    •   When bit 15 of offset = 0, bit 31 of base register = 0, and bit 31 of
                        (base register + offset) = 1
                    •   When bit 15 of offset = 1, bit 31 of base register = 1, and bit 31 of
                        (base register + offset) = 0
                          Figure 5-6     Kernel Mode Address Space
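The two overflow cases listed in the note to Figure 5-6 can be checked directly from the three sign bits involved. The following is a minimal sketch of that check for 32-bit mode; the helper names are hypothetical.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def address_overflows(base, offset16):
    """Return True if base register + offset overflows as described in the note."""
    offset_sign = (offset16 >> 15) & 1                 # bit 15 of offset
    base_sign = (base >> 31) & 1                       # bit 31 of base register
    total = (base + sign_extend16(offset16)) & 0xFFFFFFFF
    sum_sign = (total >> 31) & 1                       # bit 31 of the sum
    # Overflow: offset and base have the same sign, but the sum does not.
    return offset_sign == base_sign and sum_sign != base_sign


print(address_overflows(0x7FFFFFFC, 0x0008))  # first case in the note  -> True
print(address_overflows(0x80000000, 0xFFFC))  # second case in the note -> True
print(address_overflows(0x00001000, 0x0004))  # ordinary access         -> False
```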
 Status register condition for every row below: KSU = 00, or EXL = 1, or ERL = 1; KX = 0.

 Address Bit       Segment   Virtual Address            Physical Address           Segment Size
 Values            Name
 A(31) = 0         kuseg     0x0000 0000 through        TLB map                    2 GB (2^31 bytes)
                             0x7FFF FFFF
 A(31:29) = 100    kseg0     0x8000 0000 through        0x0000 0000 through        512 MB (2^29 bytes)
                             0x9FFF FFFF                0x1FFF FFFF
 A(31:29) = 101    kseg1     0xA000 0000 through        0x0000 0000 through        512 MB (2^29 bytes)
                             0xBFFF FFFF                0x1FFF FFFF
 A(31:29) = 110    ksseg     0xC000 0000 through        TLB map                    512 MB (2^29 bytes)
                             0xDFFF FFFF
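
As an illustration of the rows above, the following sketch classifies a 32-bit Kernel mode virtual address by its high-order bits and converts kseg0 and kseg1 addresses to physical addresses by keeping the low-order 29 bits. The type and function names are hypothetical, and addresses with A(31:29) = 111 are reported as SEG_OTHER because they are not covered by the rows shown here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { SEG_KUSEG, SEG_KSEG0, SEG_KSEG1, SEG_KSSEG, SEG_OTHER } seg32_t;

/* Classify a 32-bit Kernel mode virtual address by A(31) and A(31:29). */
seg32_t classify_kernel32(uint32_t va)
{
    if ((va >> 31) == 0u)
        return SEG_KUSEG;            /* A(31) = 0 */
    switch (va >> 29) {
    case 4:  return SEG_KSEG0;       /* A(31:29) = 100 */
    case 5:  return SEG_KSEG1;       /* A(31:29) = 101 */
    case 6:  return SEG_KSSEG;       /* A(31:29) = 110 */
    default: return SEG_OTHER;       /* not covered by the rows above */
    }
}

/* kseg0 and kseg1 map onto physical 0x0000 0000 through 0x1FFF FFFF.
 * Returns false for TLB-mapped segments, which need a TLB lookup. */
bool unmapped_to_physical(uint32_t va, uint32_t *pa)
{
    seg32_t s = classify_kernel32(va);
    if (s != SEG_KSEG0 && s != SEG_KSEG1)
        return false;
    *pa = va & 0x1FFFFFFFu;
    return true;
}
```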
 Status register condition for every row below: KSU = 00, or EXL = 1, or ERL = 1; KX = 1.

 Address Bit       Segment   Virtual Address                   Physical Address        Segment Size
 Values            Name
 A(63:62) = 00     xkuseg    0x0000 0000 0000 0000 through     TLB map                 1 TB (2^40 bytes)
                             0x0000 00FF FFFF FFFF
 A(63:62) = 10     xkphys    0x8000 0000 0000 0000 through     0x0000 0000 through     2^32 bytes
                             0xBFFF FFFF FFFF FFFF             0xFFFF FFFF
 A(63:62) = 11     xkseg     0xC000 0000 0000 0000 through     TLB map                 2^40 - 2^31 bytes
                             0xC000 00FF 7FFF FFFF

 For xkphys, refer to 64-bit Kernel Mode, Physical Spaces (xkphys) on the following page.
* Register number
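                                               32-bit Mode
   Bits 127:96   0 (127:121, 7 bits) | MASK (120:109, 12 bits) | 0 (108:96, 13 bits)
   Bits 95:64    VPN2 (95:77, 19 bits) | G (76, 1 bit) | 0 (75:72, 4 bits) | ASID (71:64, 8 bits)
   Bits 63:32    0 (63:58, 6 bits) | PFN (57:38, 20 bits) | C (37:35, 3 bits) | D (34) | V (33) | 0 (32)
   Bits 31:0     0 (31:26, 6 bits) | PFN (25:6, 20 bits) | C (5:3, 3 bits) | D (2) | V (1) | 0 (0)
                                               64-bit Mode
   Bits 255:192  0 (255:217, 39 bits) | MASK (216:205, 12 bits) | 0 (204:192, 13 bits)
   Bits 191:128  R (191:190, 2 bits) | 0 (189:168, 22 bits) | VPN2 (167:141, 27 bits) | G (140, 1 bit) |
                 0 (139:136, 4 bits) | ASID (135:128, 8 bits)
   Bits 127:64   0 (127:90, 38 bits) | PFN (89:70, 20 bits) | C (69:67, 3 bits) | D (66) | V (65) | 0 (64)
   Bits 63:0     0 (63:26, 38 bits) | PFN (25:6, 20 bits) | C (5:3, 3 bits) | D (2) | V (1) | 0 (0)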
                                  Figure 5-9        TLB Entry Format
                    The formats of the EntryHi, EntryLo0, EntryLo1, and PageMask registers are
                    almost the same as the TLB entry. However, the G bit of the TLB is undefined
                    in the EntryHi register.
                                                 PageMask Register
          31                   25 24                              13 12                             0
                     0                         MASK                                  0
                     7                             12                                13
Mask : Page comparison mask. Determines the virtual page size of the corresponding entry.
0    : Reserved for future use (RFU). Must be written as zeroes, and returns zeroes when
       read.
                                                  EntryHi Register
          31                                                    13 12       8 7                     0
32-bit
Mode                               VPN2                                 0                 ASID
                                        19                              5                  8
          63 62 61                     40 39                   13 12        8 7                     0
64-bit
Mode
               R          Fill                    VPN2                  0                 ASID
               2          22                        27                  5                  8
                 Whether the cache is used when a page is referenced is specified by the page
                 coherency attribute (C) bit of the TLB. The C bit selects, as a page attribute,
                 the algorithm "cache is used" or "cache is not used". Table 5-6 shows the page
                 attributes selected by the C bit.
                                             Index Register
               31 30                                                          6 5           0
                P                                0                                  Index
                1                                25                                     6
                                        Random Register
               31                                                           6 5          0
                                          0                                    Random
                                          26                                      6
5.4.3 EntryHi (10), EntryLo0 (2), EntryLo1 (3), and PageMask (5) Registers
                   These registers are used to rewrite the TLB or to check coincidence of a TLB entry
                   when addresses are converted. If the TLB exception occurs, information on the
                   address that has caused the exception is loaded to these registers. Figure 5-10
                   shows the formats of the EntryHi, EntryLo0, EntryLo1, and PageMask registers.
                   The values of these registers on reset are undefined. Therefore, initialize the
                   registers by software.
            EntryHi Register
                   The EntryHi register is a read/write register and is used to access the high-order
                   bits of the internal TLB.
                   The EntryHi register retains the contents of the high-order bits of a TLB entry
                   when a TLB read or write operation is executed. If a TLB miss, TLB invalid, or
                   TLB modification exception occurs, the virtual page number (VPN2) of the
                   virtual address that has caused the exception and ASID are set to the EntryHi
                   register. For the details of the TLB exception, refer to Chapter 6 Exception
                   Processing.
                   ASID is used to write or read the ASID area of the TLB entry. When an address
                   is converted, the ASID in this register is treated as the ASID of the virtual
                   address and is compared with the ASID of the TLB entry.
                   To access this register, use the TLBP, TLBWR, TLBWI, or TLBR instruction.
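
The 32-bit mode EntryHi layout (VPN2 in bits 31:13, ASID in bits 7:0) can be sketched as follows. This is an illustration only: the helper names are hypothetical, and actually writing the value to the register requires CP0 access instructions that are not shown here.

```c
#include <stdint.h>

/* Build a 32-bit mode EntryHi value from a virtual address and an ASID.
 * Bits 12:8 are written as zero (hypothetical helper). */
uint32_t entryhi32_make(uint32_t vaddr, uint8_t asid)
{
    return (vaddr & 0xFFFFE000u) | asid;
}

/* Extract the 19-bit VPN2 area (bits 31:13). */
uint32_t entryhi32_vpn2(uint32_t entryhi)
{
    return entryhi >> 13;
}

/* Extract the 8-bit ASID area (bits 7:0). */
uint8_t entryhi32_asid(uint32_t entryhi)
{
    return (uint8_t)(entryhi & 0xFFu);
}
```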
       PageMask Register
             The PageMask register is a read/write register used for reading from or writing to
             the TLB; it holds a comparison mask that sets the page size for each TLB entry,
             as shown in Table 5-7. There are seven page sizes selectable. TLB read and write
             operations use this register as either a destination or a source; when virtual
             addresses are presented for translation into physical address, the bits 24:13 which
             are used in the comparison are masked. When the Mask field is not one of the
             values shown in Table 5-7, the operation of the TLB is undefined.
                                                      Bit
 Page Size
              24    23     22      21     20     19         18   17   16    15     14     13
4 KB          0      0      0       0      0      0         0    0    0      0      0      0
16 KB         0      0      0       0      0      0         0    0    0      0      1      1
64 KB         0      0      0       0      0      0         0    0    1      1      1      1
256 KB        0      0      0       0      0      0         1    1    1      1      1      1
1 MB          0      0      0       0      1      1         1    1    1      1      1      1
4 MB          0      0      1       1      1      1         1    1    1      1      1      1
16 MB         1      1      1       1      1      1         1    1    1      1      1      1
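
Each Mask pattern in the table above corresponds to a page size of (Mask + 1) x 4 KB. The following sketch, with a hypothetical function name, converts a PageMask register value to a page size in bytes and returns 0 for any Mask value that the table does not list, since TLB operation is undefined in that case.

```c
#include <stdint.h>

/* Convert a PageMask register value to the page size in bytes.
 * Returns 0 when the Mask area (bits 24:13) is not a listed pattern. */
uint32_t pagemask_to_page_size(uint32_t pagemask)
{
    uint32_t mask = (pagemask >> 13) & 0xFFFu;

    switch (mask) {
    case 0x000u: /* 4 KB   */
    case 0x003u: /* 16 KB  */
    case 0x00Fu: /* 64 KB  */
    case 0x03Fu: /* 256 KB */
    case 0x0FFu: /* 1 MB   */
    case 0x3FFu: /* 4 MB   */
    case 0xFFFu: /* 16 MB  */
        return (mask + 1u) << 12;
    default:
        return 0u;
    }
}
```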
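             [Figure: TLB entries 0 through 31. The value of the Wired register marks the
             boundary: the range of Wired entries lies below it, starting at entry 0, and the
             range of Random entries lies above it, up to entry 31.]
                         Figure 5-13    Wired Register Boundary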
             Although the Wired field is six bits wide, only the five low-order bits are used in
             TLB operations, since the VR4300 TLB has 32 entries. Bit 5 is readable and
             writable by software, but is ignored during TLB operations.
             The Wired register is set to 0 upon Cold Reset. Writing this register also sets the
             Random register to the value of its upper bound of 31 (Refer to 5.4.2 Random
             Register (1)). Figure 5-14 shows the format of the Wired register.
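
A short sketch of the boundary rule follows. It assumes, as the figure suggests, that entries with an index below the Wired value are the Wired entries, and it uses only the five low-order bits of the register as described above. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if TLB entry 'index' (0 through 31) lies in the Wired range.
 * Bit 5 of the Wired register is ignored, as in TLB operations. */
bool tlb_index_is_wired(uint32_t wired_reg, unsigned index)
{
    unsigned boundary = wired_reg & 0x1Fu;
    return index < boundary;
}
```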
                                              Wired Register
                  31                                                              6 5           0
                                                 0                                      Wired
                                                 26                                       6
                                             PRId Register
                 31                                     16 15               8 7                0
                                    0                            Imp               Rev
                                    16                            8                 8
              The processor revision number is a value in the format of yx. y is the major
              revision number contained in bits 7:4, and x is the minor revision number
              contained in bits 3:0.
              The processor revision number identifies the revision of the chip. However, a
              revision of the chip is not always reflected in the PRId register. Conversely, a
              change in the revision number does not always reflect an actual change of the chip.
              Therefore, develop your program so that it does not depend on the processor
              revision number area.
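
For diagnostic display only, the Imp and Rev areas can be split as in the following sketch. The structure and function names are hypothetical, and the value used in any example call is illustrative rather than a value guaranteed for a particular chip.

```c
#include <stdint.h>

typedef struct {
    uint8_t imp;    /* bits 15:8 */
    uint8_t major;  /* y: bits 7:4 of Rev */
    uint8_t minor;  /* x: bits 3:0 of Rev */
} prid_fields_t;

/* Split a PRId register value into its Imp and Rev (y.x) parts. */
prid_fields_t prid_decode(uint32_t prid)
{
    prid_fields_t f;
    f.imp   = (uint8_t)((prid >> 8) & 0xFFu);
    f.major = (uint8_t)((prid >> 4) & 0x0Fu);
    f.minor = (uint8_t)(prid & 0x0Fu);
    return f;
}
```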
                    The values of the EP and BE areas can be changed only when initialization is
                    executed in the non-cache area immediately after cold reset and before a store
                    instruction is executed. The operation is not guaranteed if the values of these areas
                    are changed at any other time. Figure 5-16 shows the format of the Config
                    register.
   31   30  28   27  24   23        16   15   14            4    3    2   0
    0     EC       EP      00000110      BE    11001000110      CU     K0
    1      3        4          8          1        11            1      3
  EC : Operating frequency ratio (read-only). The value displayed corresponds to the frequency
       ratio set by the DivMode pins on power application.
       (For details of DivMode pin setting, refer to Table 2-2 Clock/Control Interface Signals.)
        µPD30200-80 (VR4305)
        110 → 1:1 (MasterClock: PClock)
        111 → RFU
        000 → 1:2
        001 → 1:3
        Others → RFU
        µPD30200-100 (VR4300)
        110 → RFU
        111 → 1:1.5 (MasterClock: PClock)
        000 → 1:2
        001 → 1:3
        Others → RFU
        µPD30200-133 (VR4300)
        110 → 1:4 (MasterClock: PClock)
        111 → RFU
        000 → 1:2
        001 → 1:3
        Others → RFU
        µPD30210-133 (VR4310)
        010 → 1:5 (MasterClock: PClock)
        011 → 1:6
        100 → RFU
        101 → 1:3
        110 → 1:4
        111 → RFU
        000 → 1:2
        001 → 1:3
        µPD30210-167 (VR4310)
        010 → 1:5 (MasterClock: PClock)
        011 → 1:6
        100 → 1:2.5
        101 → 1:3
        110 → 1:4
        111 → RFU
        000 → 1:2
        001 → 1:3
EP :    Sets transfer data pattern (single/block write request).
        0 → D (default on cold reset)
        6 → DxxDxx: 2 doublewords/6 cycles
        Others → RFU
BE :    Sets BigEndianMem (endianness).
        0 → Little endian
        1 → Big endian (default on cold reset)
CU :    RFU. However, can be read or written by software.
K0 :    Sets coherency algorithm of kseg0 (refer to Table 5-6 Cache Algorithm).
        010 → Cache is not used
        Others → Cache is used
1   :   Returns 1 when read.
0   :   Returns 0 when read.
Caution     If the BE bit of this register is changed by using the MTC0 instruction, insert two
            or more NOP instructions or an instruction other than the load/store instruction in
            between the MTC0 and load/store instructions.
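
The Config register layout described above can be decoded as in the following sketch. The structure and function names are hypothetical; the bit positions and the fixed patterns (00000110 in bits 23:16 and 11001000110 in bits 14:4) are taken from the register format, and the register value itself is assumed to have been read elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    unsigned ec;  /* bits 30:28, operating frequency ratio (read-only) */
    unsigned ep;  /* bits 27:24, transfer data pattern */
    unsigned be;  /* bit 15, 1 = big endian */
    unsigned cu;  /* bit 3, RFU */
    unsigned k0;  /* bits 2:0, kseg0 coherency algorithm */
} config_fields_t;

/* Split a Config register value into its named areas. */
config_fields_t config_decode(uint32_t cfg)
{
    config_fields_t f;
    f.ec = (cfg >> 28) & 0x7u;
    f.ep = (cfg >> 24) & 0xFu;
    f.be = (cfg >> 15) & 0x1u;
    f.cu = (cfg >> 3) & 0x1u;
    f.k0 = cfg & 0x7u;
    return f;
}

/* Check the areas that always read as fixed patterns. */
bool config_fixed_fields_ok(uint32_t cfg)
{
    return (cfg >> 31) == 0u &&
           ((cfg >> 16) & 0xFFu) == 0x06u &&   /* 00000110 */
           ((cfg >> 4) & 0x7FFu) == 0x646u;    /* 11001000110 */
}

/* K0 = 010 means kseg0 does not use the cache; any other value uses it. */
bool kseg0_uses_cache(uint32_t cfg)
{
    return (cfg & 0x7u) != 0x2u;
}
```

With a value of 0x7006E463, for instance, this sketch yields EC = 111, EP = 0, BE = 1, and K0 = 011, so kseg0 uses the cache.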
                                          LLAddr Register
               31                                                                            0
                                                  PAddr
                                                    32
               PAddr : Stores the bits 31 through 4 of the physical address read by the last
                       LL instruction to bits 27 through 0, and 0 to bits 31 through 28.
                                      Figure 5-17    LLAddr Register
                  31 28 27                                         8 7       6 5              0
         TagLo         0                   PTagLo                    PState           0
                       4                      20                         2            6
                  31                                                                          0
         TagHi                                           0
                                                         32
                 PTagLo : Physical address bits 31:12
                 PState : Specifies the primary cache state
                          Data cache
                             11 = Valid
                             00 = Invalid
                          Instruction cache
                             10 = Valid
                             00 = Invalid
                          Others = Undefined
                 0      : RFU. Must be written as zeroes; returns zeroes when read
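
The TagLo areas can be extracted as shown in the sketch below. The function names are hypothetical; the PState encodings follow the list above (11 = valid for the data cache, 10 = valid for the instruction cache).

```c
#include <stdbool.h>
#include <stdint.h>

/* Physical address bits 31:12 held in PTagLo (TagLo bits 27:8),
 * returned as an address with bits 11:0 cleared. */
uint32_t taglo_paddr(uint32_t taglo)
{
    return ((taglo >> 8) & 0xFFFFFu) << 12;
}

/* Two-bit PState area (TagLo bits 7:6). */
unsigned taglo_pstate(uint32_t taglo)
{
    return (taglo >> 6) & 0x3u;
}

bool taglo_dcache_valid(uint32_t taglo) { return taglo_pstate(taglo) == 0x3u; }
bool taglo_icache_valid(uint32_t taglo) { return taglo_pstate(taglo) == 0x2u; }
```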
            If a TLB entry matches, the physical address and access control bits (C, D, and V)
            are retrieved from the matching TLB entry. While the V bit of the entry must be
            set for a valid translation to take place, it is not involved in the determination of a
            matching TLB entry.
            Figure 5-19 illustrates the TLB address translation process.
                * The number of bits differs depending on the page size.
                  Here are examples where the page size is 16 MB and 4 KB:
                       Page Size
                                                16 MB                           4 KB
            Mode
             32-bit mode            A (31:25)                       A (31:13)
             64-bit mode            A63, A62, and A (39:25)         A63, A62, and A (39:13)
            [Figure 5-19 (TLB address translation process): flowchart. Decision steps shown:
            Legal address? (if not, Address Error exception); 32-bit address?; Global (G = 1?);
            ASID match?; Valid (V = 1?); Write?; Dirty (D = 1?). Outcomes shown: Exception,
            Access Main Memory, Access Cache.]
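
The matching rule described above (the compared address bits depend on the page size, and the V bit takes no part in matching) can be sketched for 32-bit mode as follows. The structure and function names are hypothetical, and the G bit is modeled as a separate flag. This is a software illustration, not a description of the hardware implementation.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t page_mask;  /* PageMask format: Mask in bits 24:13 */
    uint32_t entry_hi;   /* EntryHi format: VPN2 in 31:13, ASID in 7:0 */
    bool     global;     /* G bit of the entry */
} tlb_tag32_t;

/* Returns true when the entry matches the virtual address and ASID.
 * V is deliberately not consulted: it does not affect matching. */
bool tlb_tag_matches(const tlb_tag32_t *e, uint32_t vaddr, uint8_t asid)
{
    /* Compare A(31:13) for 4 KB pages, narrowing to A(31:25) for 16 MB. */
    uint32_t cmp_mask = ~(e->page_mask | 0x1FFFu);
    bool vpn_match  = ((vaddr ^ e->entry_hi) & cmp_mask) == 0u;
    bool asid_match = e->global || ((e->entry_hi & 0xFFu) == asid);
    return vpn_match && asid_match;
}
```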
         This chapter describes the exception processing and the hardware used for the
         exception processing. For the FPU exception, refer to Chapter 8 Floating-Point
         Exceptions.
            Interrupt Enable
                   Interrupts are enabled if the following conditions are satisfied.
                       •    IE (interrupt enable bit) = 1
                       •    EXL bit = 0, ERL bit = 0
                       •    Bit of corresponding IM area in status register = 1
     Exception/Error Level
            The Kernel mode is set when either of the EXL or ERL bit is set to 1.
            When execution returns from exception processing, the exception level is reset to
            normal (0) (for details, refer to ERET Instruction of Chapter 16 CPU
            Instruction Set Details).
            In addition to the above, registers that hold information on addresses, causes, and
            statuses during exception processing are provided. For details, refer to 6.3
            Exception Processing Registers. For details of the exception processing, refer to
            6.4 Exception Details.
            Hazard of CP0
                  With the General Purpose registers of the CPU, when the result of an operation is
                  to be used by the next instruction, the hardware generates a stall and waits until
                  the result can be used. However, the CP0 register and TLB do not generate a stall.
                  If a value is stored to the CP0 register, that value may not be used by the
                  immediately following instruction because the value is stored in the register
                  several cycles later. When designing a program, therefore, you must take this into
                  consideration when setting values to the CP0 register and TLB (for details, refer
                  to Chapter 19 Coprocessor 0 Hazards).
                                               Context Register
                    31             23 22                                             4 3        0
           32-bit        PTEBase                        BadVPN2                            0
           Mode
                            9                                   19                         4
                    63                                     23 22                    4 3        0
           64-bit                  PTEBase                            BadVPN2              0
           Mode
                                      41                                 19                4
           PTEBase : Base address of page table entry
           BadVPN2 : Page number of virtual address whose translation is invalid divided by 2
           0       : RFU. Must be written as zeroes; returns zeroes when read
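
To show how the 32-bit mode areas fit together, the following sketch composes a Context value from a PTEBase value and a faulting virtual address, taking BadVPN2 as address bits 31:13. In practice the hardware writes BadVPN2; the function name is hypothetical and the sketch is intended only for checking expected values.

```c
#include <stdint.h>

/* Compose a 32-bit mode Context value: PTEBase in bits 31:23,
 * BadVPN2 (virtual address bits 31:13) in bits 22:4, zeroes in 3:0. */
uint32_t context32_compose(uint32_t ptebase, uint32_t badvaddr)
{
    uint32_t badvpn2 = badvaddr >> 13;            /* 19 bits */
    return (ptebase & 0xFF800000u) | (badvpn2 << 4);
}
```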
                                           BadVAddr Register
                     31                                                                            0
            32-bit                           Bad Virtual Address
            Mode
                                                        32
                     63                                                                            0
            64-bit                           Bad Virtual Address
            Mode
                                                        64
            BadVAddr : virtual address at which an address error occurred last or which failed
                       in address translation
                                                Count Register
                     31                                                                        0
                                                        Count
                                                          32
                     Count : latest count value (incremented at half the PClock frequency)
                                             Compare Register
                 31                                                                          0
                                                 Compare
                                                     32
                                           Status Register
   31           28 27 26 25 24                16 15                   8 7   6   5 4 3 2    1   0
         CU
                  RP FR RE       DS                     IM(7:0)        KX SX UX KSU ERL EXL IE
      (CU3:CU0)
            4      1 1   1             9                  8             1   1 1    2   1   1   1
   * The low power mode is supported only in the 100 MHz model of the VR4300 and the VR4305.
     Fix the RP bit of the 133 MHz model of the VR4300 and the VR4310 to 0.
                 Figure 6-6 shows the format of the self-diagnostic status (DS) area. All the bits in
                 the DS area, except the TS bit, can be read or written.
                     ITS    0    BEV   TS    SR    0    CH    CE    DE
                      1     1     1     1     1    1     1     1     1
                   Fields of the Status register set the modes and access states described in the
                   sections that follow.
            Interrupt Enable
                   Interrupts are enabled when all of the following conditions are satisfied:
                       •    IE = 1
                       •    EXL = 0
                       •    ERL = 0
                       •    Corresponding bit of IM = 1
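
The conditions above can be combined into a single check, as in the following sketch. It assumes the Status and Cause register values have already been read; the function name is hypothetical. The bit positions follow the register formats (IE = bit 0, EXL = bit 1, ERL = bit 2, IM and IP = bits 15:8).

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when at least one pending interrupt (Cause IP) is
 * enabled by IM and interrupts are globally enabled in Status. */
bool interrupt_enabled_and_pending(uint32_t status, uint32_t cause)
{
    bool ie  = (status & 0x1u) != 0u;
    bool exl = (status & 0x2u) != 0u;
    bool erl = (status & 0x4u) != 0u;
    uint32_t im = (status >> 8) & 0xFFu;
    uint32_t ip = (cause >> 8) & 0xFFu;

    return ie && !exl && !erl && (im & ip) != 0u;
}
```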
Operating Modes
       The following Status register bit settings are required for User, Kernel, and
       Supervisor modes.
           •    The processor is in User mode when KSU = 10, EXL = 0, and ERL =
                0.
           •    The processor is in Supervisor mode when KSU = 01, EXL = 0, and
                ERL = 0.
           •    The processor is in Kernel mode when KSU = 00, or EXL = 1, or ERL
                = 1.
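
The mode rules above can be expressed as the following sketch. The enumeration and function names are hypothetical. KSU = 11 is not listed above, so it is reported as MODE_UNDEFINED unless EXL or ERL forces Kernel mode.

```c
#include <stdint.h>

typedef enum { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_UNDEFINED } cpu_mode_t;

/* Derive the operating mode from KSU (bits 4:3), ERL (bit 2), EXL (bit 1). */
cpu_mode_t status_mode(uint32_t status)
{
    unsigned ksu = (status >> 3) & 0x3u;
    unsigned erl = (status >> 2) & 0x1u;
    unsigned exl = (status >> 1) & 0x1u;

    if (exl || erl || ksu == 0u)
        return MODE_KERNEL;
    if (ksu == 1u)
        return MODE_SUPERVISOR;
    if (ksu == 2u)
        return MODE_USER;
    return MODE_UNDEFINED;   /* KSU = 11 is not described here */
}
```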
            Status on Reset
                   The contents of the Status register on reset are undefined except for the following
                   bits:
                       •    TS and RP = 0
                       •    ERL and BEV = 1
                       •    SR = 0 on cold reset; SR = 1 on soft reset or NMI interrupt
            Inverting Endian
                   The VR4300 is set to big endian at reset. After that, the endian setting can be
                   changed by using the BE bit of the Config register.
                       •    When RE bit = 1
                            The endian setting in the Kernel and Supervisor modes is specified by
                            the BE bit of the Config register. The endian setting in the User mode
                            is opposite to the specified endian setting.
                       •    When RE bit = 0
                            The endian setting in the Kernel, Supervisor, and User modes is
                            specified by the BE bit of the Config register.
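
The effect of the RE bit can be summarized by the sketch below, which returns the endianness in effect for the current mode. It assumes Status and Config values read elsewhere; the function names are hypothetical. Bit positions follow the register formats (RE = Status bit 25, BE = Config bit 15).

```c
#include <stdbool.h>
#include <stdint.h>

/* User mode: KSU = 10, EXL = 0, ERL = 0. */
bool status_is_user_mode(uint32_t status)
{
    unsigned ksu = (status >> 3) & 0x3u;
    bool exl = ((status >> 1) & 0x1u) != 0u;
    bool erl = ((status >> 2) & 0x1u) != 0u;
    return ksu == 2u && !exl && !erl;
}

/* Returns true when the endianness in effect is big endian. */
bool effective_big_endian(uint32_t status, uint32_t config)
{
    bool be = ((config >> 15) & 0x1u) != 0u;
    bool re = ((status >> 25) & 0x1u) != 0u;

    /* RE = 1 inverts the BE setting in User mode only. */
    return (re && status_is_user_mode(status)) ? !be : be;
}
```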
                                       Cause Register
   31   30   29 28   27          16   15           8    7    6        2   1   0
   BD    0    CE          0             IP(7:0)         0     ExcCode       0
    1    1     2         12                8            1        5          2
   BD      : Indicates whether the last exception occurred in a branch delay slot.
             1 → delay slot
             0 → normal
   CE      : Coprocessor unit number referenced when a Coprocessor Unusable exception has
             occurred. If this exception does not occur, undefined.
   IP(7:0) : Indicates an interrupt is pending.
             1 → interrupt pending
             0 → no interrupt
             IP(7) : Timer interrupt
             IP(6:2) : External normal interrupts. Controlled by Int[4:0], or external write
                       requests
             IP(1:0) : Software interrupts. Only these bits can cause interrupt exception when
                       they are set to 1 by software.
   ExcCode : Exception code field (refer to Table 6-2 for details.)
   0       : RFU. Must be written as zeroes, and returns zeroes when read.
  Exception
               Mnemonic                              Description
 Code Value
       0      Int         Interrupt
       1      Mod         TLB Modification exception
       2      TLBL        TLB Miss exception (load or instruction fetch)
       3      TLBS        TLB Miss exception (store)
       4      AdEL        Address Error exception (load or instruction fetch)
       5      AdES        Address Error exception (store)
       6      IBE         Bus Error exception (instruction fetch)
       7      DBE         Bus Error exception (data reference: load or store)
       8      Sys         Syscall exception
       9      Bp          Breakpoint exception
       10     RI          Reserved Instruction exception
       11     CpU         Coprocessor Unusable exception
       12     Ov          Arithmetic Overflow exception
       13     Tr          Trap exception
       14     –           RFU
       15     FPE         Floating-Point exception
      16–22   –           RFU
       23     WATCH       Watch exception
      24–31   –           RFU
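
An exception handler typically starts by extracting ExcCode from the Cause register. The following sketch, with hypothetical function names, extracts the code (bits 6:2) and the BD bit (bit 31) and maps the code to the mnemonics in the table above, returning "RFU" for reserved values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ExcCode occupies Cause bits 6:2. */
unsigned cause_exc_code(uint32_t cause)
{
    return (cause >> 2) & 0x1Fu;
}

/* BD (bit 31) is 1 when the exception occurred in a branch delay slot. */
bool cause_in_delay_slot(uint32_t cause)
{
    return (cause >> 31) != 0u;
}

/* Map ExcCode to its mnemonic; reserved codes yield "RFU". */
const char *exc_code_mnemonic(uint32_t cause)
{
    static const char *const names[32] = {
        [0] = "Int",   [1] = "Mod",   [2] = "TLBL",  [3] = "TLBS",
        [4] = "AdEL",  [5] = "AdES",  [6] = "IBE",   [7] = "DBE",
        [8] = "Sys",   [9] = "Bp",    [10] = "RI",   [11] = "CpU",
        [12] = "Ov",   [13] = "Tr",   [15] = "FPE",  [23] = "WATCH",
    };
    const char *name = names[cause_exc_code(cause)];
    return name != NULL ? name : "RFU";
}
```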
         The VR4300 has eight interrupt requests: IP7 through IP0. These interrupt
         requests are used for the following purposes.
IP7
         Indicates whether a timer interrupt request has been issued. This interrupt request
         is set when the contents of the Count register have become equal to those of the
         Compare register.
                                                  EPC Register
                     31                                                                           0
            32-bit                                     EPC
            Mode
                                                          32
                     63                                                                           0
            64-bit                                     EPC
            Mode
                                                          64
            EPC : Address from which program execution is resumed after exception
                  processing
                                            WatchLo Register
            31                                                       3    2      1            0
                                       PAddr0                             0      R        W
                                             29                            1     1        1
                                            WatchHi Register
            31                                                             4 3                0
                                             0                                       PAddr1
                                             28                                       4
                                        XContext Register
             63                               33 32 31 30                     4 3        0
                            PTEBase                 R         BadVPN2               0
                               31                   2             27                4
BadVPN2 Area
         The BadVPN2 area is written by the hardware in case of a TLB miss.
R Area
         The R area is written by the hardware in case of a TLB miss.
PTEBase Area
         The PTEBase area is a read/write area and is used by the operating system.
The 27-bit BadVPN2 area holds the values of bits 39:13 of the virtual address that has
caused a TLB miss. Because a TLB entry consists of a pair of an even page and an odd
page, it does not include bit 12. When the page size is 4 KB, this register can be used
as is as a pointer that references an 8-byte PTE pair table. With a page size of 16 KB or
more, an appropriate PTE reference address can be generated by shifting or masking the
value of this register.
                                                  PErr Register
                   31                                                    8 7                 0
                                                 0                              Diagnostic
                                                 24                                 8
                                                CacheErr Register
      31                                                                                         0
                                                     0
                                                     32
                                              ErrorEPC Register
                   31                                                                        0
          32-bit                                    ErrorEPC
          Mode
                                                      32
                   63                                                                        0
          64-bit                                    ErrorEPC
          Mode
                                                      64
           ErrorEPC : Indicates the program counter on cold reset or soft reset, or in case of
                      the NMI exception.
Other 0x0180
                E.g. TLB Miss vector (EXL = 0): When BEV = 0, the vector base address for
                this exception vector is in kseg0 (cached, TLB unmapped space) (0x8000 0000
                in 32-bit mode, 0xFFFF FFFF 8000 0000 in 64-bit mode).
                When BEV = 1, the vector base address for this exception vector is in kseg1
                (uncached, TLB unmapped space) (0xBFC0 0200 in 32-bit mode and 0xFFFF
                FFFF BFC0 0200 in 64-bit mode). This is a TLB unmapped space, allowing the
                exception to bypass the TLB.
                E.g. General Exception vector: When BEV = 0, the vector address for this
                exception is in kseg0 (cached, TLB unmapped space) (0x8000 0180 in 32-bit
                mode, 0xFFFF FFFF 8000 0180 in 64-bit mode).
                When BEV = 1, the vector address for this exception is in kseg1
                (uncached, TLB unmapped space) (0xBFC0 0380 in 32-bit mode and 0xFFFF
                FFFF BFC0 0380 in 64-bit mode).
                This space is an uncached and TLB unmapped space, allowing the exception
                handler to bypass the cache and TLB.
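The vector selection described above can be summarized in a small C helper for 32-bit mode. This is a sketch, not processor documentation: the enum and function names are hypothetical, and the 0x080 offset for the XTLB Miss vector is an assumption taken from the usual MIPS III convention, since this section lists only the TLB Miss base and the 0x0180 offset.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum { EXC_TLB_MISS, EXC_XTLB_MISS, EXC_OTHER } ExcKind;  /* hypothetical names */

/* Returns the 32-bit mode exception vector address for the given
 * exception class and the BEV and EXL bits of the Status register. */
uint32_t exception_vector32(ExcKind kind, bool bev, bool exl)
{
    uint32_t base = bev ? 0xBFC00200u : 0x80000000u;  /* kseg1 (bootstrap) or kseg0 */
    uint32_t offset = 0x180u;                         /* common exception vector */

    if (!exl) {                                       /* special vectors only when EXL = 0 */
        if (kind == EXC_TLB_MISS)  offset = 0x000u;
        if (kind == EXC_XTLB_MISS) offset = 0x080u;   /* assumed offset, see lead-in */
    }
    return base + offset;
}
```

For example, a general exception with BEV = 1 resolves to 0xBFC0 0380, and a TLB Miss taken while EXL = 1 falls back to the common vector.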
      Cause
              The Cold Reset exception occurs when the ColdReset signal is asserted and then
              deasserted. This exception is not maskable.
      Processing
              The CPU provides a special interrupt vector for this reset exception:
                   •   location 0xBFC0 0000 in 32-bit mode
                   •   location 0xFFFF FFFF BFC0 0000 in 64-bit mode
              The Cold Reset vector resides in unmapped and uncached CPU address space, so
              the hardware need not initialize the TLB or the cache to process this exception. It
              also means the processor can fetch and execute instructions while the caches and
              virtual memory are in an undefined state.
              The contents of all registers in the CPU are undefined when this exception occurs,
              except for the following register fields:
                   •   The TS, SR, and RP bits of the Status register and the EP(3:0) bits of
                       the Config register are cleared to 0.
                   •   The ERL and BEV bits of the Status register and the BE bit of the
                       Config register are set to 1.
                   •   The Random register is set to the upper-limit value (31).
                   •   The EC(2:0) bits of the Config register are set to the contents of the
                       DivMode(1:0)* pins.
                   * In VR4300 and VR4305. In VR4310, DivMode(2:0).
      Servicing
              The Cold Reset exception is serviced by:
                   •   initializing all processor registers, coprocessor registers, TLB, caches,
                       and the memory system
                   •   performing diagnostic tests
                   •   bootstrapping the operating system
            Cause
                    A Soft Reset (sometimes called Warm Reset) occurs when the Reset pin is
                    asserted for more than 16 MasterClock cycles and then deasserted while the
                    ColdReset signal remains deasserted.
                    A Soft Reset immediately resets all state machines, and sets the SR bit of the Status
                    register. Execution begins at the reset vector when a Soft Reset occurs.
                    This exception is not maskable.
            Processing
                    The CPU provides a special interrupt vector for this exception (same location as
                    Cold Reset):
                         •   location 0xBFC0 0000 in 32-bit mode
                         •   location 0xFFFF FFFF BFC0 0000 in 64-bit mode
                    This vector is located within unmapped and uncached address space, so that the
                    cache and TLB need not be initialized to process this exception. When a Soft
                    Reset occurs, the SR bit of the Status register is set to distinguish this exception
                    from a Cold Reset exception.
                    When this exception occurs, the contents of all registers are preserved except for:
                         •   The ErrorEPC register is set to the program counter value at the time
                             of this exception, when the ERL bit of the Status register is 0.
                         •   TS and RP bits of the Status register are cleared to 0.
                         •   ERL, SR, and BEV bits of the Status register are set to 1.
                    Because the Soft Reset can abort cache and system interface accesses in
                    progress, the cache and memory states are undefined when this exception occurs.
            Servicing
                    The Soft Reset exception is serviced by saving the current processor state for self-
                    diagnostic purposes, and reinitializing the system in the same manner as the Cold
                    Reset exception.
      Cause
              The Non-maskable Interrupt (NMI) exception occurs in response to the falling
              edge of the NMI pin. An NMI can also be set by externally writing 1 to bit 6
              of the internal interrupt register through the SysAD bus (SysAD6).
              Unlike all other interrupts, this interrupt is not maskable; it occurs regardless of
              the settings of the EXL, ERL, and the IE bits in the Status register.
      Processing
              The CPU provides a special interrupt vector for this exception (same location as
              Cold Reset):
                   •   location 0xBFC0 0000 in 32-bit mode
                   •   location 0xFFFF FFFF BFC0 0000 in 64-bit mode
              This vector is located within unmapped and uncached address space so that the
              cache and TLB need not be initialized to process this exception. When an NMI
              exception occurs, the SR bit of the Status register is set to differentiate this
              exception from a Cold Reset exception.
              Unlike Cold Reset and Soft Reset, but like other exceptions, NMI is taken only at
              instruction boundaries. The state of the caches and memory system is preserved
              by this exception.
              When this exception occurs, the contents of all registers are preserved except for:
                   •   The ErrorEPC register is set to the program counter value at the time
                       of this exception.
                   •   The TS bit of the Status register is cleared to 0.
                   •   ERL, SR, and BEV bits of the Status register are set to 1.
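Because Cold Reset, Soft Reset, and NMI share one vector, code at that vector typically starts by testing the SR bit. The following C sketch shows that test on a Status value that has already been read; the bit positions (SR = bit 20, BEV = bit 22, ERL = bit 2) are assumed from the usual Status register layout and are not restated in this section, and the names are hypothetical. SR alone cannot separate a Soft Reset from an NMI.

```c
#include <stdint.h>

/* Assumed Status register bit positions (not restated in this section). */
#define STATUS_ERL (1u << 2)
#define STATUS_SR  (1u << 20)
#define STATUS_BEV (1u << 22)

typedef enum { RESET_COLD, RESET_SOFT_OR_NMI } ResetKind;  /* hypothetical names */

/* Classifies the event that led to the reset vector from the Status value:
 * Cold Reset clears SR, whereas Soft Reset and NMI both set it. */
ResetKind classify_reset(uint32_t status)
{
    return (status & STATUS_SR) ? RESET_SOFT_OR_NMI : RESET_COLD;
}
```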
      Servicing
              The NMI exception is serviced by saving the current processor state for self-
              diagnostic purposes, and reinitializing the system in the same manner as the Cold
              Reset exception.
            Cause
                    The Address Error exception occurs when an attempt is made to do one of
                    the following:
                         •   Execute an LW or SW instruction on word data that is not located
                             at a word boundary.
                         •   Execute an LH or SH instruction on halfword data that is not
                             located at a halfword boundary.
                         •   Execute an LD or SD instruction on doubleword data that is not
                             located at a doubleword boundary.
                         •   Reference the Kernel address space from User or Supervisor mode.
                         •   Reference the Supervisor address space from User mode.
                         •   Reference an address not in Kernel, Supervisor, or User space in
                             64-bit Kernel, Supervisor, or User mode.
                    This exception is not maskable.
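The alignment conditions in the list above reduce to a mask test on the low address bits. The C sketch below is an illustrative software check, for example for an emulator or a test bench, with hypothetical function names; it does not cover the address-space privilege conditions.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if an access of `size` bytes (2 = halfword, 4 = word, 8 = doubleword)
 * at `vaddr` is not naturally aligned. */
bool is_misaligned(uint64_t vaddr, unsigned size)
{
    return (vaddr & (uint64_t)(size - 1u)) != 0;
}

/* ExcCode that would be reported for a misaligned access:
 * 4 (AdEL) for loads and instruction fetches, 5 (AdES) for stores. */
unsigned address_error_code(bool is_store)
{
    return is_store ? 5u : 4u;
}
```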
            Processing
                    The common exception vector is used for this exception. The AdEL or AdES code
                    in the Cause register is set, indicating whether the instruction caused the exception
                    with an instruction reference (AdEL), load operation (AdEL), or store operation
                    (AdES).
                    When this exception occurs, the BadVAddr register retains the virtual address that
                    was not properly aligned or was referenced in protected address space. The
                    contents of the VPN field of the Context and EntryHi registers are undefined, as
                    are the contents of the EntryLo register.
                    The EPC register contains the address of the instruction that caused the exception,
                    unless this instruction is in a branch delay slot. If it is in a branch delay slot, the
                    EPC register contains the address of the preceding branch instruction and the BD
                    bit of the Cause register is set.
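The EPC and BD rule above recurs for most exceptions in this chapter, so a handler usually wraps it in one helper. A minimal C sketch, assuming the BD bit is bit 31 of the Cause register and using a hypothetical function name:

```c
#include <stdint.h>

#define CAUSE_BD (1u << 31)   /* assumed position of the BD bit */

/* Address of the instruction that actually caused the exception:
 * EPC itself, or EPC + 4 when the instruction sat in a branch delay slot
 * (EPC then points at the preceding branch). */
uint64_t faulting_instruction_address(uint64_t epc, uint32_t cause)
{
    return (cause & CAUSE_BD) ? epc + 4u : epc;
}
```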
            Servicing
                    The process executing at the time is handed a UNIX SIGSEGV (segmentation
                    violation) signal by the Kernel. This error is usually fatal to the process
                    incurring the exception.
             Cause
             The TLB (XTLB) Miss exception occurs when there is no TLB entry to match an
             address to be referenced. This exception is not maskable.
             Processing
             There are two special vectors for this exception. One is for the 32-bit mode, and
             the other is for the 64-bit mode. The UX, SX, and KX bits of the Status register
             determine whether the user, supervisor or Kernel address spaces referenced are
             32-bit or 64-bit spaces. All TLB Miss exceptions use these two special vectors
             when the EXL bit is set to 0 in the Status register, and they use the common
             exception vector when the EXL bit is set to 1 in the Status register.
             This exception sets the TLBL or TLBS code in the ExcCode field of the Cause
             register. If the cause of the exception is an instruction reference or load
             operation, the TLBL code is set; if the cause is a store operation, the TLBS
             code is set.
             When this exception occurs, the BadVAddr, Context, XContext and EntryHi
             registers hold the virtual address that failed address translation. The EntryHi
             register also contains the ASID from which the translation fault occurred. The
             Random register normally contains a valid location in which to place the
             replacement TLB entry. The contents of the EntryLo register are undefined.
             The EPC register contains the address of the instruction that caused the exception,
             unless this instruction is in a branch delay slot, in which case the EPC register
             contains the address of the preceding branch instruction and the BD bit of the
             Cause register is set.
                  Servicing
                  To service this exception, the contents of the Context or XContext register are
                  used as a virtual address to load the memory words containing the physical page
                  frame and access control bits for a pair of TLB entries. The memory words are
                  written into the TLB through the EntryLo0, EntryLo1, and EntryHi registers.
                  The virtual address of the page that holds the page frame and access control
                  bits may itself not be resident in the TLB. This condition is handled by
                  allowing a TLB Miss exception within the TLB Miss exception handler. This
                  second exception goes to the common exception vector because the EXL bit of
                  the Status register is set.
                  Cause
                  The TLB Invalid exception occurs when a virtual address reference matches a
                  TLB entry that is marked invalid (TLB valid bit cleared). This exception is not
                  maskable.
                  Processing
                  The common exception vector is used for this exception. The TLBL or TLBS code
                  is set to the ExcCode field of the Cause register. If the cause of the exception is
                  an instruction reference or load operation, the TLBL code is set; if the cause is a
                  store operation, the TLBS code is set.
                  When this exception occurs, the BadVAddr, Context, XContext and EntryHi
                  registers contain the virtual address that failed address translation. The EntryHi
                  register also contains the ASID from which the translation fault occurred. The
                  contents of the EntryLo register are undefined.
                  The EPC register contains the address of the instruction that caused the exception
                  unless this instruction is in a branch delay slot, in which case the EPC register
                  contains the address of the preceding branch instruction and the BD bit of the
                  Cause register is set.
      Servicing
      A TLB entry is typically marked invalid when one of the following is true:
          •    a virtual address does not exist
          •    the virtual address exists, but is not in main memory (a page fault)
          •    a trap is desired on any reference to the page (for example, to
               maintain a reference bit)
      After removing the cause of a TLB Invalid exception, use the TLB Probe (TLBP)
      instruction to locate the TLB entry where the exception occurred, and write a
      replacement entry with the V bit set to 1 to that location.
      Cause
      The TLB Modification exception occurs if the TLB entry that matches the virtual
      address referenced by a store instruction is write-protected (the D bit is 0)
      even though the TLB entry is valid (the V bit is 1). This exception occurs only
      when an attempt is made to write the data cache. Note, however, that the priority
      of this exception is low.
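The three TLB exceptions described in this chapter differ only in what the lookup finds, which a short C model makes explicit. This is an illustrative software model, not the hardware algorithm: the names are hypothetical, and the EntryLo bit positions (V = bit 1, D = bit 2) are assumed because they are not restated in this section.

```c
#include <stdint.h>
#include <stdbool.h>

#define ENTRYLO_V (1u << 1)   /* assumed position of the valid bit */
#define ENTRYLO_D (1u << 2)   /* assumed position of the dirty (write-enable) bit */

typedef enum { TLB_OK, TLB_MISS, TLB_INVALID, TLB_MOD } TlbResult;  /* hypothetical */

/* Outcome of a translation attempt, given whether any TLB entry matched
 * and the EntryLo word of the matching entry. */
TlbResult classify_tlb_access(bool matched, uint32_t entrylo, bool is_store)
{
    if (!matched)                           return TLB_MISS;     /* no matching entry */
    if (!(entrylo & ENTRYLO_V))             return TLB_INVALID;  /* match, V = 0 */
    if (is_store && !(entrylo & ENTRYLO_D)) return TLB_MOD;      /* store, V = 1, D = 0 */
    return TLB_OK;
}
```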
      Processing
      The common exception vector is used for this exception, and the Mod code is set
      to the ExcCode field in the Cause register.
      When this exception occurs, the BadVAddr, Context, XContext and EntryHi
      registers contain the virtual address that failed address translation. The EntryHi
      register also contains the ASID from which the translation fault occurred. The
      contents of the EntryLo register are undefined.
      The EPC register contains the address of the instruction that caused the exception
      unless that instruction is in a branch delay slot, in which case the EPC register
      contains the address of the preceding branch instruction and the BD bit of the
      Cause register is set.
      Servicing
      The Kernel uses the failed virtual address or virtual page number to identify the
      corresponding access control bits. The page identified may or may not permit
      write accesses; if writes are not permitted, a write protection violation occurs.
      If write accesses are permitted, the page frame is marked dirty/writable by the
      Kernel in its own data structures.
                    The TLBP instruction places the index of the TLB entry that must be altered into
                    the Index register. The EntryLo register is loaded with a word containing the
                    physical page frame and access control bits (with the D bit set), and the contents
                    of the EntryHi and EntryLo registers are written into the TLB.
            Cause
                    A Bus Error exception is raised by board-level circuitry for events such as bus
                    time-out, local bus parity errors, and invalid physical memory addresses or access
                    types. This exception is not maskable.
                    A Bus Error exception occurs only when a cache miss refill, uncached
                    reference, or unbuffered write occurs synchronously; in concrete terms, a Bus
                    Error exception occurs if SysCmd(0) indicates that the data contains an error when
                    it is transferred on the system bus, regardless of the direction of the transfer
                    between the system and the processor. A local bus error in the system that
                    results from a buffered write transaction must be reported using the Interrupt
                    exception.
            Processing
                    The common exception vector is used for a Bus Error exception. The IBE or DBE
                    code in the ExcCode field of the Cause register is set. If the cause of the exception
                    is an instruction reference (instruction fetch), the IBE code is set. If the cause is a
                    data reference (load/store), the DBE code is set.
                    The EPC register contains the address of the instruction that caused the exception,
                    unless it is in a branch delay slot, in which case the EPC register contains the
                    address of the preceding branch instruction and the BD bit of the Cause register is
                    set.
      Servicing
              The physical address at which the fault occurred can be computed from
              information available in the system control coprocessor registers.
                   •   If the IBE code in the Cause register is set (indicating an instruction
                       fetch), the virtual address is contained in the EPC register (or 4 + the
                       contents of the EPC register if the BD bit of the Cause register is set).
                   •   If the DBE code is set (indicating a load or store), the address of
                       the instruction that caused the exception is contained in the EPC
                       register (or 4 + the contents of the EPC register if the BD bit of
                       the Cause register is set).
              The virtual address of the load and store reference can then be obtained by
              interpreting the instruction. The physical address can be obtained by using the
              TLBP instruction and reading the EntryLo register to compute the physical page
              number.
              The process executing at the time of this exception is handed a UNIX SIGBUS
              (bus error) signal, which is usually fatal.
      Cause
              A System Call exception occurs during an attempt to execute the SYSCALL
              instruction. This exception is not maskable.
      Processing
              The common exception vector is used for this exception, and the Sys code is set to
              the ExcCode field in the Cause register.
              The EPC register contains the address of the SYSCALL instruction unless it is in
              a branch delay slot. If the SYSCALL instruction is in a branch delay slot, the EPC
              register contains the address of the preceding branch instruction and the BD bit of
              the Cause register is set; otherwise this bit is cleared.
            Servicing
                    When this exception occurs, control is transferred to the applicable system
                    routine.
                    To resume execution, the EPC register must be altered so that the SYSCALL
                    instruction does not re-execute; this is accomplished by adding a value of 4 to the
                    EPC register (EPC register + 4) before returning.
                    If a SYSCALL instruction is in a branch delay slot, the preceding branch
                    instruction must be decoded to determine the branch destination at which to
                    resume execution.
            Cause
                    A Breakpoint exception occurs when an attempt is made to execute the BREAK
                    instruction. This exception is not maskable.
            Processing
                    The common exception vector is used for this exception, and the BP code is set
                    in the ExcCode field of the Cause register.
                    The EPC register contains the address of the BREAK instruction unless it is in a
                    branch delay slot. If the BREAK instruction is in a branch delay slot, the EPC
                    register contains the address of the preceding branch instruction and the BD bit of
                    the Cause register is set, otherwise the bit is cleared.
            Servicing
                    When the Breakpoint exception occurs, servicing is transferred to the applicable
                    system routine. Additional information can be passed using the unused bits of the
                    BREAK instruction (bits 25:6). This information can be obtained by reading the
                    contents indicated by the EPC register as data. (A value of 4 must be added to the
                    contents of the EPC register (EPC register + 4) to locate the instruction if it resides
                    in a branch delay slot.)
                    To resume execution, the EPC register must be altered so that the BREAK
                    instruction does not re-execute; this is accomplished by adding a value of 4 to the
                    EPC register (EPC register + 4) before returning. If a BREAK instruction is in a
                    branch delay slot, decode the branch instruction to get the branch destination and
                    resume execution.
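Reading the information passed in bits 25:6 of a BREAK instruction is a simple field extraction once the instruction word has been loaded from the address derived from EPC. The C sketch below assumes the standard MIPS encoding of BREAK (SPECIAL opcode 0 in bits 31:26, function code 0x0D in bits 5:0), which this section does not restate; the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if the instruction word is a BREAK (assumed encoding: opcode 0, funct 0x0D). */
bool is_break(uint32_t insn)
{
    return (insn >> 26) == 0u && (insn & 0x3Fu) == 0x0Du;
}

/* The 20-bit value carried in bits 25:6 of the BREAK instruction. */
uint32_t break_code(uint32_t insn)
{
    return (insn >> 6) & 0xFFFFFu;
}

/* Resume address when the BREAK is NOT in a branch delay slot: EPC + 4.
 * The delay-slot case needs the branch to be decoded and is not handled here. */
uint64_t resume_after_break(uint64_t epc)
{
    return epc + 4u;
}
```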
      Cause
              The Coprocessor Unusable exception occurs when an attempt is made to execute
              a coprocessor instruction in either of the following cases:
                   •   The corresponding coprocessor unit is not marked usable
                       (CU bits (3:1) of the Status register = 0).
                   •   A CP0 instruction is executed in User or Supervisor mode
                       when CP0 is not marked usable (CU0 bit of the Status register = 0).
              This exception is not maskable.
      Processing
              The common exception vector is used for this exception, and the CpU code is set
              in the ExcCode field of the Cause register.
              The CE bits of the Cause register indicate which of the four coprocessors was
              referenced.
              The EPC register indicates the coprocessor instruction that caused an exception.
              If the coprocessor instruction that caused the exception is in a branch delay slot,
              the EPC register indicates the preceding branch instruction and the BD bit of the
              Cause register is set.
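A handler entered through the common exception vector generally begins by pulling the ExcCode, CE, and BD fields out of the Cause register. The following C sketch shows that decoding on a value already read from the register; the field positions (ExcCode in bits 6:2, CE in bits 29:28, BD in bit 31) are assumed from the usual Cause layout and are not restated in this section, and the type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    unsigned exc_code;  /* exception code, e.g. 11 for CpU (assumed numbering) */
    unsigned ce;        /* coprocessor number referenced, valid for CpU only */
    bool     bd;        /* exception occurred in a branch delay slot */
} CauseFields;

/* Splits a Cause register value into the fields used for dispatching. */
CauseFields decode_cause(uint32_t cause)
{
    CauseFields f;
    f.exc_code = (cause >> 2) & 0x1Fu;
    f.ce       = (cause >> 28) & 0x3u;
    f.bd       = (cause >> 31) != 0u;
    return f;
}
```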
      Servicing
              The coprocessor unit to which an attempted reference was made is identified by
              the CE bits of the Cause register. The handler then proceeds as follows:
                   •   If the process is entitled to access the coprocessor, the coprocessor
                       is marked usable and execution resumes.
                   •   If the process is entitled to access the coprocessor, but the
                       coprocessor does not exist or has failed, the coprocessor instruction
                       can be decoded and emulated.
                   •   If the BD bit is set in the Cause register, the branch instruction must
                       be decoded; then the coprocessor instruction can be emulated and
                       execution resumed with the contents of the EPC register advanced
                       past the coprocessor instruction.
            Cause
                    The Reserved Instruction exception occurs when one of the following conditions
                    occurs:
                         •   an attempt is made to execute an instruction with an undefined
                             opcode (bits 31:26)
                         •   an attempt is made to execute a SPECIAL instruction with an
                             undefined sub-opcode (bits 5:0)
                         •   an attempt is made to execute a REGIMM instruction with an
                             undefined sub-opcode (bits 20:16)
                         •   an attempt is made to execute 64-bit operations in 32-bit mode when
                             in User or Supervisor modes
                    64-bit operations are always valid in Kernel mode regardless of the value of the
                    KX bit in the Status register.
                    This exception is not maskable.
            Processing
                    The common exception vector is used for this exception, and the RI code is set in
                    the ExcCode field in the Cause register.
                    The EPC register indicates the instruction that caused the exception unless the
                    reserved instruction is in a branch delay slot, in which case the EPC register
                    indicates the preceding branch instruction and the BD bit of the Cause register
                    is set.
            Servicing
                    All instructions in the MIPS ISA that are currently defined can be executed.
                    The process executing at the time of this exception is handed a UNIX SIGILL/
                    ILL_RESOP_FAULT (illegal instruction/reserved operand fault) signal. This
                    exception is usually fatal.
      Cause
              The Trap exception occurs when a TGE, TGEU, TLT, TLTU, TEQ, TNE, TGEI,
              TGEUI, TLTI, TLTUI, TEQI, or TNEI instruction results in a TRUE condition.
              This exception is not maskable.
      Processing
              The common exception vector is used for this exception, and the Tr code is set in
              the ExcCode field in the Cause register.
              The EPC register indicates the Trap instruction that caused the exception. If the
              instruction is in a branch delay slot, the EPC register indicates the preceding
              branch instruction and the BD bit of the Cause register is set.
      Servicing
              The process executing at the time of a Trap exception is handed a UNIX SIGFPE/
              FPE_INTOVF_TRAP (floating-point exception/integer overflow) signal by the
              Kernel. This exception is usually a fatal error.
            Cause
                    An Integer Overflow exception occurs when an ADD, ADDI, SUB, DADD,
                    DADDI or DSUB instruction results in a 2’s complement overflow. This
                    exception is not maskable.
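The 2's complement overflow condition for the 32-bit ADD, ADDI, and SUB cases can be checked in software by computing the result in a wider type. This C sketch is only an illustration of the condition (for instance, for an instruction emulator), with hypothetical function names; the 64-bit DADD, DADDI, and DSUB cases would need a different technique.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if a + b does not fit in a signed 32-bit result (ADD/ADDI overflow). */
bool add32_overflows(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;   /* exact result, cannot overflow */
    return wide > INT32_MAX || wide < INT32_MIN;
}

/* True if a - b does not fit in a signed 32-bit result (SUB overflow). */
bool sub32_overflows(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a - (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}
```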
            Processing
                    The common exception vector is used for this exception, and the Ov code is set in
                    the ExcCode field in the Cause register.
                    The EPC register indicates the instruction that caused the exception. If the
                    instruction is in a branch delay slot, the EPC register indicates the preceding
                    branch instruction and the BD bit of the Cause register is set.
            Servicing
                    The process executing at the time of the exception is handed a UNIX SIGFPE/
                    FPE_INTOVF_TRAP (floating-point exception/integer overflow) signal by the
                    Kernel. This exception is usually a fatal error to the current process.
      Cause
              The Floating-Point exception is generated by the floating-point coprocessor. This
              exception is not maskable.
      Processing
              The common exception vector is used for this exception, and the FPE code is set
              in the ExcCode field in the Cause register.
              The contents of the Floating-Point Control/Status register indicate the cause of
              this exception.
              The EPC register indicates the instruction that caused the exception if the
              instruction is not in a branch delay slot. If the instruction is in a branch
              delay slot, the EPC register indicates the preceding branch instruction and the
              BD bit of the Cause register is set.
      Servicing
              This exception is cleared by clearing the appropriate bit in the Floating-Point
              Control/Status register.
              For an unimplemented instruction exception, the Kernel must emulate the
              instruction; for other exceptions, the Kernel should pass the exception to the user
              program that caused the exception.
            Cause
                    A Watch exception occurs when a load or store instruction references the physical
                    address specified in the WatchLo/WatchHi registers. The exception is caused by
                    the following instructions: a load instruction when the R bit is set in the WatchLo
                    register; a store instruction when the W bit is set in the WatchLo register; a load or
                    store instruction when both the R and W bits are set in the WatchLo register.
                    The CACHE instruction never causes a Watch exception.
                    The Watch exception is postponed if the EXL bit is set in the Status register. The
                    Watch exception is maskable by setting the EXL bit in the Status register to 1 or
                    by clearing the R and W bits in the WatchLo register to 0.
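The Watch conditions above can be modelled as one predicate over the WatchLo value, the physical address of the access, and the EXL bit. This C sketch is an illustration under assumptions not stated in this section: WatchLo is taken to hold the watched physical address in bits 31:3 (doubleword granularity) with R in bit 1 and W in bit 0, WatchHi is ignored, and the function name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed WatchLo layout (not restated in this section). */
#define WATCHLO_W      0x00000001u
#define WATCHLO_R      0x00000002u
#define WATCHLO_PADDR0 0xFFFFFFF8u

/* True if a load or store to `paddr` would raise a Watch exception right now.
 * With EXL set the exception is postponed, so this returns false. */
bool watch_taken_now(uint32_t watchlo, uint32_t paddr, bool is_store, bool exl)
{
    bool armed = is_store ? (watchlo & WATCHLO_W) != 0u
                          : (watchlo & WATCHLO_R) != 0u;
    bool same_doubleword = (paddr & WATCHLO_PADDR0) == (watchlo & WATCHLO_PADDR0);
    return !exl && armed && same_doubleword;
}
```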
            Processing
                    The common exception vector is used for this exception, and the Watch code is set
                    in the ExcCode field in the Cause register.
                    The EPC register indicates the load or store instruction that caused the
                    exception if it is not in a branch delay slot. If the instruction is in a
                    branch delay slot, the EPC register indicates the preceding branch instruction
                    and the BD bit of the Cause register is set.
            Servicing
                    The Watch exception is a debugging aid; typically the exception handler transfers
                    control to a debugger, allowing the user to examine the situation. To continue, the
                    Watch exception must be masked to execute the faulting instruction. The Watch
                    exception must then be reenabled.
                    Because the contents of the WatchLo/WatchHi registers become undefined after
                    reset, initialize the registers by software (especially clear the R and W bits to 0).
                    If not initialized, the Watch exception may occur.
      Cause
              The Interrupt exception occurs when one of the eight interrupt conditions (one for
              timer interrupt; five for hardware interrupt; two for software interrupt) is asserted.
              The significance of these interrupts is dependent upon the specific system
              implementation. An interrupt request signal from a pin is detected by the level.
              Each of the eight interrupts can be masked by clearing the corresponding bit in the
              Int-Mask field of the Status register, and all of the eight interrupts can be masked
              at once by clearing the IE bit, setting the EXL bit, or setting the ERL bit of the
              Status register.
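The masking rules above combine into a single condition: an interrupt is taken only when IE = 1, EXL = 0, ERL = 0, and at least one pending request has its mask bit set. A C sketch of that condition follows; the bit positions (IE = bit 0, EXL = bit 1, ERL = bit 2, and both the Int-Mask field of Status and the IP field of Cause in bits 15:8) are assumed from the usual register layouts rather than taken from this section, and the function name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed bit positions (not restated in this section). */
#define STATUS_IE   (1u << 0)
#define STATUS_EXL  (1u << 1)
#define STATUS_ERL  (1u << 2)
#define INT_FIELD   0x0000FF00u   /* Status Int-Mask and Cause IP, bits 15:8 */

/* True if the current Status and Cause values would cause an Interrupt exception. */
bool interrupt_taken(uint32_t status, uint32_t cause)
{
    if (!(status & STATUS_IE))               return false;  /* globally disabled */
    if (status & (STATUS_EXL | STATUS_ERL))  return false;  /* exception or error level */
    return (status & cause & INT_FIELD) != 0u;              /* pending and unmasked */
}
```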
      Processing
              The common exception vector is used for this exception, and the Int code is set in
              the ExcCode field in the Cause register.
              The IP field of the Cause register indicates the current interrupt requests. It is
              possible for more than one of these bits to be set simultaneously, or for bits to
              be cleared again, if interrupt request signals are asserted or deasserted before
              this register is read.
              If the instruction that causes an exception is not in a branch delay slot the EPC
              register indicates that instruction. If the instruction is in the branch delay slot, the
              EPC register indicates the preceding branch instruction and the BD bit of the
              Cause register is set.
      Servicing
              If the interrupt is caused by one of the two software-generated exceptions (SW1 or
              SW0), the interrupt condition is cleared by setting the corresponding Cause
              register bit to 0.
              If an interrupt is generated by the hardware, the interrupt is cleared by
              deasserting the interrupt request signal that has caused the interrupt.
              If the timer interrupt request is generated, either clear the IP7 bit of the Cause
              register or change the contents of the Compare register, to clear this interrupt.
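As a minimal sketch of the software-interrupt case, the following helper computes the Cause value to write back. The name is hypothetical, and SW0 and SW1 are assumed to be Cause bits 8 and 9 (IP0 and IP1 in the usual MIPS III layout). On real hardware the value would be read and written with CP0 move instructions, which are not shown here.

```c
#include <stdint.h>

#define CAUSE_IP0 (1u << 8)   /* SW0 request (assumed bit position) */
#define CAUSE_IP1 (1u << 9)   /* SW1 request (assumed bit position) */

/* Hypothetical helper: clear the selected software interrupt requests.
 * Bits other than IP0/IP1 in `which` are ignored, because hardware and
 * timer requests are not cleared this way. */
static uint32_t clear_sw_interrupts(uint32_t cause, uint32_t which)
{
    return cause & ~(which & (CAUSE_IP0 | CAUSE_IP1));
}
```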
[Figure: exception handler flowchart, panel (a) Hardware. Recoverable steps:
 Start; set the FP Control/Status register (only if the respective exception
 occurs); EnHi <- VPN2, ASID and X/Context <- VPN2 (set only for TLB-Invalid,
 Modification & Miss exceptions); set ExcCode and CE in the Cause register; set
 the BadVAddr register (not set by bus error exceptions); EXL = 1? (SR bit 1,
 check for multiple exception); instruction in branch delay slot?; BEV = 0
 (normal): PC <- 0xFFFF FFFF 8000 0000 + 180 (unmapped, cached); BEV = 1
 (bootstrap): PC <- 0xFFFF FFFF BFC0 0200 + 180 (unmapped, uncached); EXL = 1.]
[Figure: exception handler flowchart (hardware). Recoverable steps: Start;
 instruction in branch delay slot?; EXL = 0? (SR bit 1, check for multiple
 exception); if in a delay slot: BD bit of Cause register <- 1, EPC <- (PC-4);
 otherwise: BD bit of Cause register <- 0, EPC <- PC; EXL <- 1 (processor moves
 to Kernel mode and interrupts are disabled); BEV (SR bit 22) = 0 (normal):
 PC <- 0xFFFF FFFF 8000 0000 + Vec. Off. (unmapped, cached); BEV = 1
 (bootstrap): PC <- 0xFFFF FFFF BFC0 0200 + Vec. Off. (unmapped, uncached).]
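The vector selection shown in the flowcharts can be sketched as follows. The function name is hypothetical; the base addresses and the BEV position (SR bit 22) come from the figures, and the "+ 180" in the figure is read here as the hexadecimal offset 0x180, which is an assumption.

```c
#include <stdint.h>

#define SR_BEV             (1u << 22)
#define VEC_BASE_NORMAL    0xFFFFFFFF80000000ull  /* unmapped, cached */
#define VEC_BASE_BOOTSTRAP 0xFFFFFFFFBFC00200ull  /* unmapped, uncached */
#define VEC_OFF_COMMON     0x180u                 /* assumed hex offset */

/* Hypothetical helper: address the PC is loaded with on exception entry. */
static uint64_t exception_vector(uint32_t status, uint32_t vec_off)
{
    uint64_t base = (status & SR_BEV) ? VEC_BASE_BOOTSTRAP
                                      : VEC_BASE_NORMAL;
    return base + vec_off;
}
```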
[Figure: Cold Reset, Soft Reset & NMI exception handler flowchart.
 Processing Guidelines (HW): Random <- 31; Wired <- 0; update bits 31-4 of the
 Config register; Status: RP <- 0 (soft reset); ErrorEPC <- PC.
 Servicing Guidelines (SW): NMI? (comment: "There is no indication from the
 processor to differentiate between ..."); test the SR bit of the Status
 register: = 1, servicing of soft reset exception routine, then (optional)
 ERET; = 0, servicing of cold reset exception routine.]
Figure 6-16 Cold Reset, Soft Reset & NMI Exception Handler
7.1 Overview
             All floating-point instructions, as defined in the MIPS ISA for the floating-point
             coprocessor, CP1, can be processed by the VR4300. Logically, the Floating-Point
             Arithmetic Unit (FPU) exists as an individual coprocessor; however, unlike those
             of the VR4400, the VR4300 FPU is physically integrated into the Integer
             Arithmetic Unit (CPU). The CPU and the FPU use a common datapath and FPU
             instructions are fully-implemented in the CPU hardware. Unlike the VR4400
             implementation, VR4300 integer instructions cannot be executed until a
             multicycle floating-point instruction has been completed.
             The execution of floating-point instructions can be disabled by the coprocessor
             usability CU bit defined in the System Control Coprocessor (CP0) Status register.
              [Figure: Floating-Point Control Registers (FCR). Control/Status Register
               (FCR31), bits 31 to 0; Implementation/Revision Register (FCR0), bits 31 to 0.]
              The FPU in the VR4000 Series (excluding VR4100) has 32 control registers. With
              the VR4300, the following two FCRs are valid.
                  •       The Control/Status register (FCR31) controls and monitors
                          exceptions, holds the result of compare operations, and establishes
                          rounding modes.
                  •       The Implementation/Revision register (FCR0) holds revision
                          information about the FPU.
              Table 7-1 lists the assignments of the FCRs.
                        Field          E     V     Z     O     U     I
                        Cause bits    17    16    15    14    13    12
                        Enable bits    -    11    10     9     8     7
                        Flag bits      -     6     5     4     3     2

                        E: Unimplemented Operation     V: Invalid Operation
                        Z: Division by Zero            O: Overflow
                        U: Underflow                   I: Inexact Operation
Figure 7-3 Control/Status Register (FCR31) Cause, Enable, and Flag Bit Fields
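The field positions in Figure 7-3 can be captured with a few accessor functions. This is a minimal sketch: the function names are hypothetical, and the value passed in is assumed to have been read from FCR31 beforehand (for example with CFC1). The RM field (bits 1:0) and the C bit (bit 23) are taken from the descriptions elsewhere in this chapter.

```c
#include <stdint.h>

/* Masks within a 5- or 6-bit field, ordered as in Figure 7-3. */
enum { FP_I = 0x01, FP_U = 0x02, FP_O = 0x04,
       FP_Z = 0x08, FP_V = 0x10, FP_E = 0x20 };

static inline unsigned fcr31_rm(uint32_t v)      { return v & 0x3u; }
static inline unsigned fcr31_flags(uint32_t v)   { return (v >> 2)  & 0x1Fu; }
static inline unsigned fcr31_enables(uint32_t v) { return (v >> 7)  & 0x1Fu; }
static inline unsigned fcr31_cause(uint32_t v)   { return (v >> 12) & 0x3Fu; }
static inline unsigned fcr31_c(uint32_t v)       { return (v >> 23) & 0x1u; }
```

With these helpers, `fcr31_cause(v) & FP_Z` tests the Division by Zero cause bit, and `fcr31_enables(v) & FP_Z` tests the matching enable bit.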
            The contents of FCR31 and FCR0 can be read by using the CFC1 instruction.
            The bits of FCR31 can be set or cleared by using the CTC1 instruction. FCR0 is
            a read-only register. The contents of a register being written are undefined
            for the instruction that immediately follows the writing instruction, because
            the pipeline does not interlock.
            IEEE754 specifies detecting exceptions during floating-point operations,
            setting flags, and calling an exception handler when an exception occurs.
            In the MIPS architecture, these requirements are implemented by the cause,
            enable, and flag bits of the Control/Status register. The flag bits correspond
            to the exception status flags of IEEE754, and the cause and enable bits
            correspond to the exception handler mechanism of IEEE754.
            Each bit of FCR31 is described next.
FS bit
         The FS bit enables a value that cannot be normalized (denormalized number) to
         be flushed. When the FS bit is set and the enable bits are not set for the
         underflow exception and inexact exception, a denormalized result does not
         cause the unimplemented operation exception, but is flushed. Whether the flushed
         result is 0 or the minimum normalized value depends on the rounding mode
         (refer to Table 7-2). If the result is flushed, the Flag and Cause bits are set
         for the underflow and inexact exceptions.
C Bit
         When a floating-point Compare operation takes place, the result is stored at bit 23,
         the Condition bit. The C bit is set to 1 if the condition is true; the bit is cleared to
         0 if the condition is false. Bit 23 is affected only by compare and CTC1
         instructions.
         Cause Bits
         Bits 17:12 in the FCR31 contain Cause bits, which reflect the results of the most
         recently executed floating-point instruction. The Cause bits are a logical
         extension of the CP0 Cause register: they identify the exceptions raised by the
         last floating-point operation, and generate exceptions if the corresponding
         Enable bit is set. If more than one exception occurs on a single instruction,
         each appropriate bit is set.
         The Cause bits are updated by the floating-point operations (except load, store,
         and transfer instructions). The unimplemented operation (E) bit is set to 1 if
         software emulation is required; otherwise it remains 0. The other bits are set
         to 1 or 0 to indicate the occurrence or non-occurrence (respectively) of an
         IEEE754 exception.
                  If the floating-point operation exception occurs, the operation result is not stored,
                  and only the Cause bit is influenced. The type of the exception that has been
                  caused by the most-recently-executed floating-point operation can be identified
                  by reading the Cause bit.
                  Enable Bits
                  A floating-point exception is generated any time a Cause bit and the
                  corresponding Enable bit are both set. An exception occurs as soon as a
                  floating-point operation sets a Cause bit whose Enable bit is set. An
                  exception also occurs when both the Cause and Enable bits are set by the
                  CTC1 instruction.
                  There is no enable bit for unimplemented operation instruction (E). An
                  Unimplemented exception always generates a floating-point exception.
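The rule for when a floating-point exception is generated can be sketched as a predicate over an FCR31 value. The function name is hypothetical; the bit positions follow Figure 7-3 (Cause in bits 17:12, Enable in bits 11:7).

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: true if this FCR31 value generates a
 * floating-point exception. */
static bool fp_exception_raised(uint32_t fcr31)
{
    uint32_t cause  = (fcr31 >> 12) & 0x3Fu;  /* E V Z O U I */
    uint32_t enable = (fcr31 >> 7)  & 0x1Fu;  /*   V Z O U I */
    bool unimplemented = (cause & 0x20u) != 0; /* E has no enable bit */
    return unimplemented || ((cause & 0x1Fu) & enable) != 0;
}
```

Because the check depends only on the register contents, writing a matching Cause and Enable pair with CTC1 satisfies it just as a floating-point operation does.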
                  Before returning from a floating-point exception, software must first clear
                  the Cause bits that are enabled to generate exceptions, to prevent the
                  exception from being repeated. Thus, User mode programs cannot observe the
                  set Cause bits. To make this information available to a handler in User
                  mode, save the value of the Status register before calling that handler.
                  If the Cause bit is set but the corresponding Enable is not set, no floating-point
                  exception occurs and the default result defined by IEEE754 is stored. In this case,
                  whether the exceptions were caused by the immediately previous floating-point
                  operation can be determined by reading the Cause bit.
                  Flag Bits
                  The Flag bits are cumulative and indicate the exceptions that have been
                  raised since reset. A Flag bit is set to 1 if an IEEE754 exception is raised
                  while the corresponding exception is disabled; otherwise, it remains
                  unchanged. The Flag bits are never cleared as a side effect of
                  floating-point operations; however, they can be set or cleared by writing a
                  new value into the FCR31, using a CTC1 instruction.
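The different behavior of the Cause bits (overwritten by every operation) and the Flag bits (sticky) can be modeled as below. This is a sketch, not a description of the hardware: the function name is hypothetical, `raised` is a 5-bit V Z O U I mask for the operation just executed, and it is an assumption of this model that the Flag bits are left untouched whenever any raised exception is enabled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the FCR31 update after one FP operation.
 * raised: bit 4 = V, bit 3 = Z, bit 2 = O, bit 1 = U, bit 0 = I. */
static uint32_t fcr31_after_op(uint32_t fcr31, uint32_t raised, bool *trap)
{
    uint32_t enable = (fcr31 >> 7) & 0x1Fu;
    raised &= 0x1Fu;
    /* Cause bits reflect only the most recent operation. */
    fcr31 = (fcr31 & ~(0x3Fu << 12)) | (raised << 12);
    *trap = (raised & enable) != 0;
    if (!*trap)
        fcr31 |= raised << 2;   /* Flag bits accumulate */
    return fcr31;
}
```

Running a second, exception-free operation through the model clears the Cause bits but leaves the earlier Flag bit in place.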
  RM bits
 Bit 1   Bit 0   Mnemonic   Description
  0       0        RN       Round result to nearest representable value;
                            round to value with least-significant bit 0 when
                            the two nearest representable values are equally
                            near.
  0       1        RZ       Round toward 0: round to value closest to and
                            not greater in magnitude than the infinitely
                            precise result.
  1       0        RP       Round toward +∞: round to value closest to
                            and not less than the infinitely precise result.
  1       1        RM       Round toward -∞: round to value closest to
                            and not greater than the infinitely precise result.
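For readers who want to reproduce these rounding modes on a host machine, the RM encoding can be mapped onto the rounding-mode macros of the C standard header <fenv.h>. The function name is hypothetical, and the mapping assumes a host C library that defines all four macros.

```c
#include <fenv.h>

/* Hypothetical helper: map the 2-bit RM field to a <fenv.h> mode. */
static int rm_to_fenv(unsigned rm)
{
    switch (rm & 0x3u) {
    case 0:  return FE_TONEAREST;   /* RN */
    case 1:  return FE_TOWARDZERO;  /* RZ */
    case 2:  return FE_UPWARD;      /* RP */
    default: return FE_DOWNWARD;    /* RM */
    }
}
```

The returned value can be passed to the standard `fesetround` function to select the same rounding direction for host floating-point arithmetic.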
             The implementation revision number is a value in the format y.x, where y is the
             major revision number stored in bits 7:4, and x is the minor revision number
             stored in bits 3:0. The revision of the chip can be identified by the
             implementation revision number. However, a change to the chip is not always
             reflected in the revision number. Conversely, a change in the revision number
             does not always reflect an actual change of the chip. Therefore, design the
             program so that it does not depend on the revision number of this register.
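Extracting the y.x revision number is a matter of two shifts and masks. The structure and function names below are hypothetical, and the FCR0 value is assumed to have been read beforehand (for example with CFC1).

```c
#include <stdint.h>

struct fpu_rev {
    unsigned major_rev;   /* y: bits 7:4 */
    unsigned minor_rev;   /* x: bits 3:0 */
};

/* Hypothetical helper: split the revision field of an FCR0 value. */
static struct fpu_rev decode_fcr0_rev(uint32_t fcr0)
{
    struct fpu_rev r;
    r.major_rev = (fcr0 >> 4) & 0xFu;
    r.minor_rev = fcr0 & 0xFu;
    return r;
}
```

In line with the caution above, the decoded value is suitable for logging or diagnostics, not for selecting program behavior.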
                  Bits      31       30-23        22-0
                  Field     s        e            f
                            Sign     Exponent     Fraction
                  Width     1        8            23
                  The double-precision format has a 53-bit signed fraction field (s+f) and an 11-bit
                  exponent, as shown in Figure 7-6.
                  Bits      63       62-52        51-0
                  Field     s        e            f
                            Sign     Exponent     Fraction
                  Width     1        11           52
                    Type                    Equation
                    NaN (Not a Number)      if E = Emax+1 and f ≠ 0, then v is NaN, regardless of s
                    ±∞ (Infinite number)    if E = Emax+1 and f = 0, then v = (-1)^s ∞
                    Normalized number       if Emin ≤ E ≤ Emax, then v = (-1)^s 2^E (1.f)
                    Denormalized number     if E = Emin-1 and f ≠ 0, then v = (-1)^s 2^Emin (0.f)
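The equations above can be applied to a single-precision value by examining its bit pattern. This is a sketch with hypothetical names. It works on the biased exponent field e, using the IEEE754 single-precision bias of 127, so E = Emax+1 corresponds to e = 255 and E = Emin-1 corresponds to e = 0. The zero case (e = 0, f = 0) is handled separately, although it is not listed in the table.

```c
#include <stdint.h>
#include <string.h>

enum val_class { VAL_NAN, VAL_INF, VAL_NORMAL, VAL_DENORMAL, VAL_ZERO };

/* Hypothetical helper: build a float from a raw 32-bit pattern. */
static float single_from_bits(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical helper: classify a single-precision value by its fields. */
static enum val_class classify_single(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint32_t e = (bits >> 23) & 0xFFu;   /* biased exponent, bits 30:23 */
    uint32_t f = bits & 0x7FFFFFu;       /* fraction, bits 22:0 */
    if (e == 0xFFu) return f ? VAL_NAN : VAL_INF;
    if (e == 0u)    return f ? VAL_DENORMAL : VAL_ZERO;
    return VAL_NORMAL;
}
```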
The minimum and maximum values that can be expressed in this floating-point
format are shown in Table 7-6.
 Type                                                 Value
 Single-precision floating-point Minimum              1.40129846e-45
 Single-precision floating-point Minimum (Normal)     1.17549435e-38
 Single-precision floating-point Maximum              3.40282347e+38
 Double-precision floating-point Minimum              4.9406564584124654e-324
 Double-precision floating-point Minimum (Normal)     2.2250738585072014e-308
 Double-precision floating-point Maximum              1.7976931348623157e+308
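On a host whose C implementation uses IEEE754 single and double formats, the same limits are available from <float.h>. The sketch below prints them for comparison with the table; the function name is hypothetical, and the denormalized minimums are printed only if the C11 macros FLT_TRUE_MIN and DBL_TRUE_MIN are defined.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: print host limits for comparison with the table. */
static void print_fp_limits(void)
{
#ifdef FLT_TRUE_MIN
    printf("single minimum:          %.8e\n", (double)FLT_TRUE_MIN);
#endif
    printf("single minimum (normal): %.8e\n", (double)FLT_MIN);
    printf("single maximum:          %.8e\n", (double)FLT_MAX);
#ifdef DBL_TRUE_MIN
    printf("double minimum:          %.16e\n", DBL_TRUE_MIN);
#endif
    printf("double minimum (normal): %.16e\n", DBL_MIN);
    printf("double maximum:          %.16e\n", DBL_MAX);
}
```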
        32-bit fixed-point format
        Bits      31       30-0
        Field     s        i
                  Sign     Integer
        Width     1        31
        s : sign bit
        i : integer value (2's complement)

        64-bit fixed-point format
        Bits      63       62-0
        Field     s        i
                  Sign     Integer
        Width     1        63
        s : sign bit
        i : integer value (2's complement)
            Data Alignment
                  All coprocessor loads and stores reference the following aligned data items:
                         •   For word loads and stores, the access type is always WORD, and the
                             low-order 2 bits of the address must always be 0.
                         •   For doubleword loads and stores, the access type is always
                             DOUBLEWORD, and the low-order 3 bits of the address must
                             always be 0.
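The two alignment rules reduce to a mask test on the low-order address bits. The helper below is a minimal sketch with a hypothetical name; `size_bytes` is expected to be 4 for WORD accesses and 8 for DOUBLEWORD accesses.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: true if addr is aligned for a coprocessor
 * load/store of size_bytes (4 = WORD, 8 = DOUBLEWORD). */
static bool cop_access_aligned(uint64_t addr, unsigned size_bytes)
{
    return (addr & (uint64_t)(size_bytes - 1u)) == 0;
}
```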
            Endianness
                  Regardless of byte-numbering order (endianness) of the data, the address specifies
                  the byte that has the smallest byte address in the addressed field. For a big-endian
                  system, it is the leftmost byte; for a little-endian system, it is the rightmost byte.
                 The fmt suffix appended to the op code of arithmetic and compare
                 instructions indicates the data format. S indicates the single-precision
                 floating-point format, D indicates the double-precision floating-point
                 format, L indicates the 64-bit fixed-point format, and W indicates the
                 32-bit fixed-point format. For example, "ADD.D" means that the operands of
                 the addition instruction are double-precision floating-point values.
                 If the FR bit is 0, an odd-numbered register cannot be specified.
             To obtain optimum performance, the VR4300 pipeline does not perform a bypass
             from EX to EX stage of the next instruction for the floating-point result of a
             compare, computational, LWC1, or LDC1 instruction. If the subsequent EX-
             stage floating-point instruction depends on the result of the current EX-stage
             floating-point instruction, the current floating-point instruction completes and its
             EX-stage result is registered in the DC stage and the bypass is enabled.
             Meanwhile, the RF-stage floating-point instruction advances to the EX-stage,
             where it is stalled for one pipeline clock to wait for the result to be bypassed from
             DC to EX, before it begins execution.
                    Caution      This limitation on bypass from EX to EX stage of the next
                                 instruction does not apply to integer operations nor to
                                 floating-point load/store/transfer instructions (except
                                 LWC1 and LDC1).
[Figure: pipeline timing. FP #1: IC (I-cache), RF, EX (repeated over multiple
 cycles), DC, WB. FP #2: IC (I-cache), RF (held for multiple cycles), then EX.
 Pipeline status: Run in each cycle, then Stall.]
                The execution unit of the VR4300 can shorten the latency of almost all
                floating-point instructions, depending on the circumstances. This improves
                performance and simplifies design. The variations in latency are kept as
                simple as possible. If an exception is detected by checking the source
                operands when a multicycle instruction is executed (a source exception), the
                instruction is executed for only 2 cycles and exception processing starts.
                Similarly, if checking the operands shows that the result is a value that
                does not cause an exception (zero or infinity; that is, a case other than
                ∞ × 0), the result is written back 2 cycles later and the operation ends.
                Floating-point exceptions other than source exceptions do not abort an
                instruction before its execution is completed. In other words, an exception
                is reported not when it is found, but when instruction execution has been
                completed.
                Next, the execution time of each instruction is described.
This chapter explains how the FPU handles the floating-point exception.
                    Field          E     V     Z     O     U     I
                    Cause bits    17    16    15    14    13    12
                    Enable bits    -    11    10     9     8     7
                    Flag bits      -     6     5     4     3     2

                    E: Unimplemented Operation     V: Invalid Operation
                    Z: Division by Zero            O: Overflow
                    U: Underflow                   I: Inexact Operation
           Each of the five IEEE754 exceptions (V, Z, O, U, and I) is enabled when its
           Enable bit is set. When an exception occurs, the corresponding Cause bit is set.
           If the corresponding Enable bit is also set, the FPU generates an interrupt to
           the CPU and exception processing starts. If the exception is disabled, both the
           Cause and Flag bits corresponding to the exception are set.
8.2.1 Flags
                  Flag bits corresponding to the respective IEEE754 exceptions are provided. The
                  Flag bit is set when occurrence of the corresponding exception is disabled and
                  when the condition of the exception is detected. The flag bit can be reset by
                  writing a new value to the Status register by using the CTC1 instruction.
                  If an exception is disabled by the corresponding Enable bit, the FPU performs
                  predetermined processing. This processing gives the default value as the result,
                  instead of the result of the floating-point operation. This default value is
                  determined by the type of the exception. In the case of the overflow and
                  underflow exceptions, the default value differs depending on the rounding mode
                  used at that time. Table 8-1 shows the default values to be given by the respective
                  IEEE754 exceptions of the FPU.
   Field   Description          Rounding Mode   Default Values
   V       Invalid operation    -               Supply a Quiet Not a Number (Q-NaN)
   Z       Division by zero     -               Supply a properly signed ∞
   O       Overflow             RN              ∞ signed with intermediate result
                                RZ              Maximum normal number signed with
                                                intermediate result
                                RP              Negative overflow: maximum negative normal
                                                number
                                                Positive overflow: +∞
                                RM              Positive overflow: maximum positive normal
                                                number
                                                Negative overflow: -∞
   U       Underflow            RN              0 signed with intermediate result
                                RZ              0 signed with intermediate result
                                RP              Positive underflow: minimum positive normal
                                                number
                                                Negative underflow: 0
                                RM              Negative underflow: minimum negative
                                                normal number
                                                Positive underflow: 0
   I       Inexact exception    -               Supply a rounded result
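The overflow and underflow rows of the table can be expressed as two small functions, shown here for single precision. This is a sketch with hypothetical names, using FLT_MAX and FLT_MIN from <float.h> as the maximum and minimum normal numbers. Where the table gives only "0" for the RP and RM underflow cases, the sign attached to that zero is an assumption of this sketch.

```c
#include <float.h>
#include <math.h>    /* INFINITY macro only */

enum round_mode { MODE_RN = 0, MODE_RZ = 1, MODE_RP = 2, MODE_RM = 3 };

/* Hypothetical helper: default result of a disabled overflow exception. */
static float overflow_default(enum round_mode mode, int negative)
{
    float inf = negative ? -INFINITY : INFINITY;
    float max = negative ? -FLT_MAX : FLT_MAX;
    switch (mode) {
    case MODE_RN: return inf;
    case MODE_RZ: return max;
    case MODE_RP: return negative ? max : inf;
    default:      return negative ? inf : max;   /* MODE_RM */
    }
}

/* Hypothetical helper: default result of a disabled underflow exception. */
static float underflow_default(enum round_mode mode, int negative)
{
    float zero = negative ? -0.0f : 0.0f;   /* sign of zero: assumption */
    float min  = negative ? -FLT_MIN : FLT_MIN;
    switch (mode) {
    case MODE_RN:
    case MODE_RZ: return zero;
    case MODE_RP: return negative ? zero : min;
    default:      return negative ? min : zero;  /* MODE_RM */
    }
}
```

The directed modes never round away from their direction: RP cannot produce a result below the exact value, so a negative overflow saturates at the maximum negative normal number instead of -∞.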
                   The FPU detects the nine exception causes internally. When the FPU detects one
                   of these unusual situations, it causes either an IEEE754 exception or an
                   unimplemented operation exception (E). Table 8-2 lists the exception-causing
                   situations and compares the contents of the Cause bits of the FPU with the
                   IEEE754 standard when each exception occurs.
                        *1. Under IEEE754, an overflow raises the inexact operation exception
                            only when the overflow exception is disabled. However, the VR4300
                            always generates both the overflow exception and the inexact
                            operation exception when an overflow occurs.
                        *2. If both the underflow exception and inexact operation exception are
                            disabled when the exponent underflow occurs, and if the FS bit of
                            FCR31 is set, the Cause bit and Flag bit of the underflow exception and
                            inexact operation exception are set. Otherwise, the Cause bit of the
                            unimplemented operation exception is set.
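Note *2 amounts to a three-way condition on FCR31. The sketch below uses hypothetical names; the U and I enable bits are bits 8 and 7 as in the Cause, Enable, and Flag figure, while the FS bit is assumed to be bit 24, its usual position in the MIPS FCR31 layout.

```c
#include <stdint.h>
#include <stdbool.h>

enum uf_action {
    UF_FLUSH_SET_U_I,     /* result flushed; U and I Cause/Flag bits set */
    UF_UNIMPLEMENTED      /* unimplemented operation (E) Cause bit set */
};

/* Hypothetical helper: what happens on an exponent underflow. */
static enum uf_action underflow_action(uint32_t fcr31)
{
    bool fs        = ((fcr31 >> 24) & 1u) != 0;  /* assumed FS position */
    bool u_enabled = ((fcr31 >> 8)  & 1u) != 0;
    bool i_enabled = ((fcr31 >> 7)  & 1u) != 0;
    return (fs && !u_enabled && !i_enabled) ? UF_FLUSH_SET_U_I
                                            : UF_UNIMPLEMENTED;
}
```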
                   Next, each FPU exception is described.
            If Exception Is Enabled:
                   The Destination register is not modified, the Source registers are preserved and an
                   Inexact Operation exception occurs.
             Software can simulate the Invalid Operation exception for other operations that
             are invalid for the given source operands. Examples of these operations include
             IEEE754-specified functions implemented in software, such as Remainder
             (x REM y, where y is 0 or x is infinite); conversion of a floating-point number
             to a decimal format whose value causes an overflow, is infinity, or is NaN; and
             transcendental functions, such as ln(-5) or cos⁻¹(3). Refer to Chapter 17 FPU
             Instruction Set Details. Refer to Appendix B for examples or for routines to
             handle these cases.
      If Exception Is Enabled:
             The Destination register is not modified, the Source registers are preserved, and
             the Invalid Operation Exception occurs.
      If Exception Is Enabled:
             The contents of the Destination register are not changed, the contents of the
             Source register are preserved, and the zero division exception occurs.
            If Exception Is Enabled:
                   The contents of the Destination register are not modified, the Source
                   registers are preserved, and the overflow exception occurs.
      If Exception Is Enabled:
             If the underflow exception or inexact operation exception is enabled, or if the FS
             bit of the FCR31 register is not set, the unimplemented operation exception (E)
             occurs. At this time, the contents of the destination register are not changed.
            If Exception Is Enabled:
                     The contents of the Destination register are not changed, the contents of the source
                     register are preserved, and the unimplemented operation exception occurs.
            Restrictions:
                     An unimplemented operation exception will occur in response to the execution of
                     a type conversion instruction in the following cases.
                          •    If an overflow occurs during conversion to integer format
                          •    If the source operand is an infinite number
                          •    If the source operand is NaN
                     The type conversion instructions affected by this restriction are as follows.
                          CEIL.L.fmt   fd, fs               FLOOR.L.fmt   fd, fs
                          CEIL.W.fmt   fd, fs               FLOOR.W.fmt   fd, fs
                          CVT.D.fmt    fd, fs               ROUND.L.fmt   fd, fs
                          CVT.L.fmt    fd, fs               ROUND.W.fmt   fd, fs
                          CVT.S.fmt    fd, fs               TRUNC.L.fmt   fd, fs
                          CVT.W.fmt    fd, fs               TRUNC.W.fmt   fd, fs
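Software that wants to avoid the unimplemented operation exception can screen the source operand before converting. The following is a rough sketch for a conversion to 32-bit fixed-point format. The function name is hypothetical, and the range used is simply the 32-bit 2's complement range, which is an assumption: it ignores rounding-mode effects at the boundary, and the range actually accepted by the hardware may be narrower.

```c
#include <stdbool.h>

/* Hypothetical pre-check: true if converting src to a 32-bit integer
 * would fall under one of the three restricted cases. */
static bool cvt_w_unimplemented(double src)
{
    /* A NaN fails both comparisons; an infinity fails one of them;
     * a finite value outside the range is an overflow. */
    bool in_range = (src >= -2147483648.0) && (src < 2147483648.0);
    return !in_range;
}
```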
          This chapter describes the VR4300 Initialization interface, and the processor
          modes. This includes the reset signal description and types, and initialization
          sequence, with signals and timing dependencies, and the user-selectable VR4300
          processor modes.
      ColdReset signal
             The ColdReset signal must be asserted to initialize the processor by
             Power-ON Reset or Cold Reset. At this time, the Reset signal may be either
             active or inactive. Set DivMode (1:0)* before the Power-ON Reset.
             Keep the ColdReset signal asserted for at least 64000 MasterClock cycles.
             The ColdReset signal need not be synchronized with MasterClock. When the
             ColdReset signal is deasserted, the SClock, TClock, and SyncOut clock signals
             start operating in synchronization with MasterClock.
             * In VR4300 and VR4305. In VR4310, DivMode(2:0).
      Reset signal
             Assert this pin active or inactive in synchronization with MasterClock, or keep it
             inactive at Power-ON Reset or Cold Reset.
             Assert this pin active or inactive in synchronization with MasterClock at soft
             reset.
             Fix the value of the DivMode signal before the ColdReset signal is asserted.
             The DivMode signal cannot be changed after that. If the DivMode signal is
             changed after the ColdReset signal has been asserted, the operation of the
             processor is not guaranteed.
             When asserting the ColdReset signal active, the Reset signal may be active or
             inactive. However, do not change the value of the Reset signal during the reset
             sequence.
             Keep the Reset signal active for at least 16 MasterClock cycles
             immediately after the ColdReset signal has been deasserted.
             The output signals of the system interface are as follows during the reset period.
                   •   PValid signal  : 1
                   •   PReq signal    : 1
                   •   PMaster signal : 0
                   •   SysAD (31:0)   : Undefined
                   •   SysCmd (4:0)   : Undefined
             When the reset has been completed, the processor becomes the bus master and
             drives SysAD (31:0). The processor branches to the reset exception vector and
             starts executing the reset exception code.
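             The cycle counts quoted for the reset signals translate into minimum times
             once the MasterClock frequency is known. The following Python sketch does
             that conversion. The function name and the 50 MHz example frequency are
             assumptions made for illustration; they are not values from this manual.

```python
# Minimum reset durations, expressed in MasterClock cycles (from the text above).
COLDRESET_MIN_CYCLES = 64000  # ColdReset must stay asserted at least this long
RESET_MIN_CYCLES = 16         # Reset stays active this long after ColdReset is released


def min_hold_time_us(cycles, masterclock_hz):
    """Return the time, in microseconds, taken by `cycles` MasterClock cycles."""
    if masterclock_hz <= 0:
        raise ValueError("MasterClock frequency must be positive")
    return cycles * 1e6 / masterclock_hz


if __name__ == "__main__":
    f = 50e6  # hypothetical 50 MHz MasterClock
    print("ColdReset minimum (us):", min_hold_time_us(COLDRESET_MIN_CYCLES, f))
    print("Reset minimum (us):", min_hold_time_us(RESET_MIN_CYCLES, f))
```

             A board-level reset generator would normally add margin on top of these
             minimums.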
   [Timing diagram: MasterClock (input), Reset (input), ColdReset (input),
   DivMode(1:0)* (input), SyncOut (output), TClock (output). ColdReset is asserted for
   ≥ 64000 MasterClock cycles; Reset is held for ≥ 16 MasterClock cycles; the setup and
   hold times of Reset are tDS and tDH; SyncOut and TClock are initially undefined.]
           * Determine the DivMode signal before the ColdReset signal is asserted active.
             In VR4300 and VR4305. In VR4310, DivMode(2:0).
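   [Timing diagram: MasterClock (input), Reset (input), ColdReset (input),
   SyncOut (output), TClock (output). ColdReset is asserted for ≥ 64000 MasterClock
   cycles; Reset is held for ≥ 16 MasterClock cycles; the setup and hold times of Reset
   are tDS and tDH; SyncOut and TClock are initially undefined.]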
   [Timing diagram: MasterClock (input), Reset (input), ColdReset (input, held high),
   SyncOut (output), TClock (output). Reset is held for ≥ 16 MasterClock cycles; the
   setup and hold times of Reset are tDS and tDH.]
            Low Power Mode (100 MHz model of VR4300 and VR4305 only)
                  The user may set the processor to low power mode by setting the RP bit of the
                  Status register to 1. In RP mode, the processor stalls the pipeline and goes into a
                  quiescent state, with the store buffers empty and all cache misses resolved. However,
                  RP mode operation is guaranteed only when MasterClock is 40 MHz or
                  more. The frequency of PClock drops to 1/4 of the normal level, and the speeds
                  of SClock and TClock also drop to 1/4 of the normal level.
                  This feature reduces the power consumed by the processor chip to 25% of its
                  normal value.
                  Software must guarantee the proper operation of the system when setting or
                  clearing the RP bit.
                  1.   The functions of circuits such as the DRAM refresh counter change if the
                       operating frequency changes. Therefore, write new values to the registers of
                       the external agent that are directly affected by changes in frequency.
                  2.   Put the system interface in the inactive state. For example, execute a read
                       instruction to an uncached area and make sure the write buffer is empty before
                       the instruction completes. Then the RP bit can be set or cleared.
                  3.   Make sure that the eight instructions before and after the MTC0 instruction
                       that sets or clears the RP bit do not generate exceptions such as a cache miss
                       or a TLB miss.
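                  The effect of RP mode on the clock frequencies can be sketched in Python
                  as follows. The function name and the example PClock-to-MasterClock ratio
                  are assumptions made for illustration; the ratios actually selectable with
                  the DivMode pins are those of Table 10-1.

```python
RP_DIVISOR = 4                # PClock, SClock and TClock drop to 1/4 in RP mode
RP_MIN_MASTERCLOCK_HZ = 40e6  # RP mode is guaranteed only at 40 MHz or more


def rp_mode_clocks(masterclock_hz, pclock_ratio):
    """Return (PClock, SClock) in Hz while the RP bit is set.

    pclock_ratio is the normal PClock:MasterClock ratio (hypothetical input here).
    SClock and TClock normally run at the MasterClock frequency.
    """
    if masterclock_hz < RP_MIN_MASTERCLOCK_HZ:
        raise ValueError("RP mode is not guaranteed below 40 MHz MasterClock")
    pclock = masterclock_hz * pclock_ratio / RP_DIVISOR
    sclock = masterclock_hz / RP_DIVISOR
    return pclock, sclock


if __name__ == "__main__":
    print(rp_mode_clocks(50e6, 2))  # hypothetical 50 MHz MasterClock, ratio 2
```

                  Because SClock changes, any external-agent timer that counts SClock or
                  TClock cycles (item 1 above) has to be reprogrammed accordingly.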
10
This chapter describes the clock signals (“clocks”) used in the VR4300 processor.
[Figure: signal transitions and clock-to-Q delay. Labels: High-to-Low Transition,
Low-to-High Transition, Clock Input, Data In, Data Out (Q), Clock-to-Q Delay]
        MasterClock
                 The internal and external (system interface) clocks of the VR4300 are generated
                 and operate based on the MasterClock.
        SyncIn/SyncOut
                 The VR4300 processor generates SyncOut at the same frequency as MasterClock
                 and aligns SyncIn with MasterClock.
                 SyncOut must be connected to SyncIn either directly, or through an external
                 buffer. The processor can compensate for both output driver and input buffer
                 delays when aligning SyncIn with MasterClock. When SyncOut is connected
                 to SyncIn through an external buffer as illustrated in Figure 10-7, delay caused by
                 external buffers connected to clock outputs can also be compensated.
        PClock
                 The PClock frequency is selected by setting the frequency ratio between PClock
                 and MasterClock.
                 This ratio is set by the DivMode pins at power-on. Table 10-1 indicates
                 the selectable frequency ratios. For details of the DivMode pin settings, refer to
                 Table 2-2 Clock/Control Interface Signals.
                 When the low power mode (100 MHz model of the VR4300 and the VR4305 only)
                 is set by setting the RP bit of the Status register, the frequency of PClock
                 decreases to 1/4 of the normal level.
                 All the internal registers and latches use PClock.
*1. Selectable with the 100 MHz model only (With the 133 MHz model, this setting is reserved.)
 2. Selectable with the 133 MHz model only (With the 100 MHz model, this setting is reserved.)
 3. Selectable with the 167 MHz model only (With the 133 MHz model, this setting is reserved.)
         SClock
                  The frequency of the system interface clock (SClock) is equal to that of
                  MasterClock, and SClock is synchronized with MasterClock. Because SClock
                  is generated from PClock, the frequency of SClock also drops to 1/4 of the
                  normal level, like the frequency of PClock, when the low power mode (100 MHz
                  model of the VR4300 and the VR4305 only) is set. The outputs of the VR4300 are
                  driven at the edge of SClock.
                  SClock rises in synchronization with the first rising edge of MasterClock
                  immediately after ColdReset is deasserted.
         TClock
                  TClock (transfer/receive clock) is the reference clock of the output and input
                  registers of the external agent. It can also be used as the global clock of the
                  external agent, supplying a clock to all the logic circuits in the external agent.
                  TClock has the same frequency as SClock, and its edge is accurately
                  synchronized with that of SClock. When SyncIn is connected to SyncOut,
                  TClock can also be synchronized with MasterClock.
[Timing diagrams (two), cycles 1 to 4: MasterClock (input) with tMCkHigh, tMCkLow,
and tMCkP; PClock (internal); SClock (internal); TClock (output); SysAD(31:0) driven
by the processor, with output delay tDO; SysAD(31:0) received by the processor, with
setup time tDS and hold time tDH]
                 Figure 10-5 shows a block diagram of a phase-locked system using the VR4300
                 processor.
[Figure 10-5: block diagram. Signals shown: MasterClock, SysCmd(4:0), SysAD(31:0),
SyncOut, SyncIn, TClock]
[Block diagram: VR4300 and a gate array containing input registers and output
registers (with CE inputs). Signals shown: MasterClock, SysCmd(4:0), SysAD(31:0),
SyncOut, SyncIn, TClock]
Figure 10-6 Gate-Array System without Phase Lock, Using the VR4300 Processor
             Figure 10-7 is a block diagram of a system without phase lock, employing the
             VR4300 processor and an external agent composed of both a gate array and
             CMOS discrete devices.
[Block diagram: VR4300, a control gate array, an input register and an output
register (with CE inputs), and memory. Signals shown: MasterClock, SysCmd(4:0),
SysAD(31:0), SyncOut, SyncIn, TClock]
Figure 10-7 Gate-Array and CMOS System without Phase Lock, Using the VR4300 Processor
The transmission time for a signal from an external agent composed of CMOS
discrete devices can be calculated from the following equation:
Transmission Time = (1 TClock period) – (tDS for VR4300)
         – (Maximum External Output Register Clock-to-Q Delay)
         – (Maximum External Clock Buffer Delay Mismatch)
         – (Maximum Clock Jitter for VR4300 Internal Clocks)
         – (Maximum Clock Jitter for TClock)
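The transmission time equation can be evaluated directly. The following Python
sketch subtracts each term from one TClock period. The function name, the parameter
names, and every number in the example are assumptions made for illustration; actual
values come from the data sheets of the VR4300 and of the external devices.

```python
def transmission_time(tclock_period, t_ds, clk_to_q_max,
                      buffer_mismatch_max, internal_jitter_max, tclock_jitter_max):
    """Time left for signal transmission; all arguments share one time unit."""
    return (tclock_period
            - t_ds                 # tDS for the VR4300
            - clk_to_q_max         # max external output register clock-to-Q delay
            - buffer_mismatch_max  # max external clock buffer delay mismatch
            - internal_jitter_max  # max clock jitter for VR4300 internal clocks
            - tclock_jitter_max)   # max clock jitter for TClock


if __name__ == "__main__":
    # Hypothetical values in nanoseconds, chosen only to show the arithmetic.
    print(transmission_time(20.0, 3.5, 4.0, 1.0, 0.5, 0.5))
```

A negative result would mean that the chosen devices cannot meet the VR4300 setup
time at that TClock period.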
In this clocking methodology, the hold time of data driven from the processor to
an external input register is an important parameter. To guarantee hold time, the
minimum output delay of the processor, tDO, must be greater than the sum of:
         Minimum Hold Time for the External Input Register
         + Maximum Clock Jitter for VR4300 Internal Clocks
         + Maximum Clock Jitter for TClock
         + Maximum Delay Mismatch of the External Clock Buffers
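The hold time condition above can be checked in the same way. In this Python
sketch the function and parameter names, and the example values, are assumptions made
for illustration.

```python
def hold_time_met(t_do_min, ext_hold_min, internal_jitter_max,
                  tclock_jitter_max, buffer_mismatch_max):
    """Return True if the minimum output delay tDO exceeds the required sum."""
    required = (ext_hold_min            # min hold time of the external input register
                + internal_jitter_max   # max clock jitter for VR4300 internal clocks
                + tclock_jitter_max     # max clock jitter for TClock
                + buffer_mismatch_max)  # max delay mismatch of external clock buffers
    return t_do_min > required


if __name__ == "__main__":
    # Hypothetical values in nanoseconds.
    print(hold_time_met(2.0, 0.5, 0.5, 0.5, 0.25))
```

The comparison is strict because the text requires tDO to be greater than the sum,
not equal to it.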
11
       This chapter describes the cache memory in detail: its place in the VR4300
       memory organization, and the organization of each cache.
       This chapter uses the following terminology:
           •   The data cache may also be referred to as the D-cache.
           •   The instruction cache may also be referred to as the I-cache.
       These terms are used interchangeably throughout this manual.
[Figure: memory hierarchy. VR4300 CPU: registers, then the caches (I-cache and
D-cache), then memory, then disk, CD-ROM, tape, etc.]
             The VR4300 processor has two on-chip caches: one holds instructions (the
             instruction cache), the other holds data (the data cache). The instruction and data
             caches can be read in one PClock cycle.
             Data writes take two PClock cycles. In the first cycle, the store address is
             generated and the tag is checked; in the second cycle, the data is written into the
             data RAM.
[Figure: VR4300 on-chip caches (I-cache and D-cache)]
Cache Sizes
                The VR4300 instruction cache is 16 KB; the data cache is 8 KB.
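  Instruction cache line format:
    Tag    :   V (1 bit, bit 20), PTag (20 bits, bits 19:0)
    Data   :   256 bits (bits 255:0)
  Data cache line format:
    Tag    :   V (1 bit, bit 21), D (1 bit, bit 20), PTag (20 bits, bits 19:0)
    Data   :   128 bits (bits 127:0)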
  V      :   Valid bit
  D      :   Dirty bit (refer to 11.4 Cache States)
  PTag   :   Physical tag (bits 31:12 of the physical address)
  Data   :   D-cache data
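  The figures above imply a few derived quantities: the line size in bytes, the
  number of lines in each cache, and the value of PTag for a given physical address.
  The following Python sketch computes them. The constant and function names, and the
  example address, are assumptions made for illustration.

```python
ICACHE_BYTES = 16 * 1024       # instruction cache size
DCACHE_BYTES = 8 * 1024        # data cache size
ICACHE_LINE_BYTES = 256 // 8   # 256 data bits per I-cache line
DCACHE_LINE_BYTES = 128 // 8   # 128 data bits per D-cache line


def line_count(cache_bytes, line_bytes):
    """Number of cache lines of line_bytes each in a cache of cache_bytes."""
    return cache_bytes // line_bytes


def ptag(physical_address):
    """PTag is bits 31:12 of the physical address (a 20-bit value)."""
    return (physical_address >> 12) & 0xFFFFF


if __name__ == "__main__":
    print(line_count(ICACHE_BYTES, ICACHE_LINE_BYTES))
    print(line_count(DCACHE_BYTES, DCACHE_LINE_BYTES))
    print(hex(ptag(0x1FC00000)))  # hypothetical physical address
```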
[Figure: tag and data arrays; each entry holds V, Tag, D, and Data]
        Number of Cycles   Operation
        ----------------   ---------------------------------------------------------
        1                  DC stage stall
        1                  Transfer address to write buffer and wait for the pipeline
                           start signal
        1 to 2             Synchronize with SClock and transfer address to internal
                           SysAD bus
        2                  Transfer to external SysAD bus
        M                  Time needed to access memory, measured in PClock cycles
        2                  Transfer the cache line from memory to the SysAD bus
        1                  Transfer the cache line from the external to internal bus
                           and to D-cache bus
        0                  Restart the DC stage
        Number of Cycles   Operation
        ----------------   ---------------------------------------------------------
        1                  RF stage stall
        1                  Transfer address to write buffer and wait for the pipeline
                           start signal
        1 to 2             Synchronize with SClock and transfer address to internal
                           SysAD bus
        2                  Transfer to external SysAD bus
        M                  Time needed to access memory, measured in PClock cycles
        8                  Transfer the cache line from memory to the SysAD bus
        1                  Transfer the cache line from the external to internal bus
                           and to I-cache bus
        0                  Restart the RF stage
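        The rows of the two tables above can be summed to give a total miss penalty as
        a function of the memory access time M. The following Python sketch does so,
        assuming that every row is counted in the same cycles as M. The names are
        assumptions made for illustration.

```python
# (minimum, maximum) cycles per row, excluding the memory access time M.
DCACHE_MISS_STEPS = [
    (1, 1),  # DC stage stall
    (1, 1),  # address to write buffer, wait for the pipeline start signal
    (1, 2),  # synchronize with SClock, address to internal SysAD bus
    (2, 2),  # transfer to external SysAD bus
    (2, 2),  # cache line from memory to the SysAD bus
    (1, 1),  # cache line from external to internal bus and to D-cache bus
    (0, 0),  # restart the DC stage
]
ICACHE_MISS_STEPS = [
    (1, 1),  # RF stage stall
    (1, 1),  # address to write buffer, wait for the pipeline start signal
    (1, 2),  # synchronize with SClock, address to internal SysAD bus
    (2, 2),  # transfer to external SysAD bus
    (8, 8),  # cache line from memory to the SysAD bus
    (1, 1),  # cache line from external to internal bus and to I-cache bus
    (0, 0),  # restart the RF stage
]


def miss_cycles(steps, m):
    """Return (minimum, maximum) total cycles for memory access time m."""
    if m < 0:
        raise ValueError("memory access time cannot be negative")
    return (sum(lo for lo, _ in steps) + m, sum(hi for _, hi in steps) + m)


if __name__ == "__main__":
    m = 10  # hypothetical memory access time
    print(miss_cycles(DCACHE_MISS_STEPS, m), miss_cycles(ICACHE_MISS_STEPS, m))
```

        The one-cycle spread between minimum and maximum comes only from the
        synchronization with SClock.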
     Data Cache
            The data cache supports three cache states:
                  •   invalid
                  •   clean
                  •   dirty
     Instruction Cache
            The instruction cache supports two cache states:
                  •   invalid
                  •   valid
            The state of a cache line that contains valid information may be changed when
            the processor executes a CACHE instruction. For the CACHE instruction, refer to
            Chapter 16 CPU Instruction Set Details.
[Figure: cache state diagrams. States shown: Dirty, Clean, Valid, Invalid.
Transition labels: Read(1), Read(2), Write(1), Write(2), Write Back, CACHE instruction]
12
         The System interface allows the processor to access the external resources needed
         to process cache misses and accesses to uncached areas, while permitting an
         external agent to access some of the internal resources of the processor.
         This chapter describes the System interface between the processor and the
         external agent.
         The VR4300 uses a subset of the System interface provided by the VR4400 and
         VR4200.
12.1 Terminology
             The following terms are used in this chapter:
                 •    An external agent is any device connected to the processor, over the
                      System interface, that processes requests issued by the processor.
                 •    A system event is an event that occurs within the processor and
                      requires access to external resources. System events include: an
                      instruction fetch that misses in the instruction cache; a load/store
                      instruction that misses in the data cache; an uncached load or store
                      instruction; execution of a CACHE instruction.
                 •    Sequence refers to the series of requests that a processor generates to
                      process a system event.
                 •    Protocol refers to the cycle-by-cycle signal transitions that occur on
                      the System interface pins to issue a processor request or an external
                      request.
                 •    Syntax refers to the definition of bit patterns on encoded buses, such
                      as the command bus.
                 •    Block indicates any data transfer of 8 bytes or longer across the
                      System interface.
                 •    Single indicates any data transfer of 7 bytes or shorter across the
                      System interface.
                 •    Fetch refers to the read of information from the instruction cache.
                 •    Load refers to the read of information from the data cache.
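[Figure: transfer sequence with sequential ordering. Words W0 to W7 are labeled with
transfer order 1 to 8.]
[Figure: transfer sequence with subblock ordering. Words W0, W1, W2, W3 are labeled
with transfer order 3, 4, 1, 2.]
[Figure: System interface buses SysAD(31:0) and SysCmd(4:0)]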
      Processor Request
             There are two types of processor issuance cycles:
                 •    issuance cycle of a processor read request
                 •    issuance cycle of a processor write request
             The issuance cycle of a processor read/write request is determined by the state
             of the EOK signal. The issuance cycle is the address cycle of a processor request
             in which the request becomes valid. Each processor request has exactly one
             issuance cycle.
             For an address cycle to become the issuance cycle, the external agent must assert
             the EOK signal one cycle before that address cycle of the processor read/write
             request, as shown in Figure 12-4, and must not deassert the EOK signal until
             the address cycle has started.
[Timing diagram, SCycles 1 to 6: SClock (internal), SysAD(31:0) (I/O) carrying Addr,
EOK (input); the issuance cycle is marked.]
             The processor repeats the address cycle until the address cycle of the
             processor request becomes the issuance cycle. With the VR4300, therefore, the
             address cycle that follows the cycle in which the EOK signal is asserted is the
             issuance cycle, and the address cycle is repeated up to that cycle. Figure 12-5
             illustrates how the address cycle is extended by the EOK signal.
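             The rule above can be sketched as a small Python function that finds the
             issuance cycle from a cycle-by-cycle record of the EOK signal. This is a
             simplified model under stated assumptions: it only checks that EOK was
             asserted in the cycle before an address cycle, and it does not model any
             requirement on EOK during the address cycle itself. The function name and
             the example waveforms are hypothetical.

```python
def issuance_cycle(eok_active, first_address_cycle):
    """Return the 1-based SCycle in which the request is issued, or None.

    eok_active[i] is True when EOK is asserted in SCycle i + 1.
    first_address_cycle is the SCycle in which the processor first drives
    the address; the address cycle is repeated until the request is issued.
    """
    if first_address_cycle < 2:
        raise ValueError("an earlier cycle is needed in which to sample EOK")
    for cycle in range(first_address_cycle, len(eok_active) + 1):
        if eok_active[cycle - 2]:  # EOK state in the preceding cycle
            return cycle
    return None


if __name__ == "__main__":
    on_time = [False, False, True, True, False, False]
    delayed = [False, False, False, False, True, True, False]
    print(issuance_cycle(on_time, 4), issuance_cycle(delayed, 4))
```

             In the second waveform the address cycle is repeated until EOK has been seen
             asserted, which is the extension that the EOK signal causes.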
[Timing diagram, SCycles 1 to 7: SClock (internal), SysAD(31:0) (I/O) carrying Addr,
EOK (input); the issuance cycle is marked.]
      EOK Signal
             This signal is used by the external agent to indicate whether it can accept a new
             read or write transaction.
[Figure: VR4300 output data and input data relative to SClock]
When an external agent receives a read request, it accesses the specified resource
and returns the response data as a read response, which may be returned at any
time after the read request is completed.
A processor read request is completed after the last response data has been
received from the external agent. A processor write request is completed after the
last word of data has been transferred.
The processor will not issue another request while a read request is pending
(before receiving the response data after issuing the read request).
System events and requests are shown in Figure 12-7.
[Figure 12-7 content]
         Processor Requests
            • Read
            • Write
         External Requests
            • Read response
            • Write
         System Events
            • Fetch Miss
            • Load Miss
            • Store Miss
            • Load/Store to Uncached area
            • CACHE instructions
         Outline of Requests
                A read request asks for a block, word, or partial word of data either from main
                memory or from another system resource.
                A write request provides a block, word, or partial word of data to be written either
                to main memory or to another system resource.
         Request Issuance
                The processor issues requests in a strict sequential order; that is, the processor is
                only allowed to have one request pending at any time. For example, the processor
                issues a read request and waits for a read response before issuing any subsequent
                requests. The processor issues a write request only if there are no read requests
                pending.
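                The rule that only one request may be pending can be sketched as a small
                Python model. The class and method names are hypothetical and the model
                ignores bus cycles and arbitration; it only tracks whether a read request
                is awaiting its response.

```python
class RequestTracker:
    """Tracks the single outstanding processor read request."""

    def __init__(self):
        self.read_pending = False

    def can_issue(self):
        # No request of either kind may be issued while a read is pending.
        return not self.read_pending

    def issue_read(self):
        if not self.can_issue():
            raise RuntimeError("a read request is already pending")
        self.read_pending = True

    def issue_write(self):
        if not self.can_issue():
            raise RuntimeError("cannot write while a read request is pending")

    def read_response(self):
        if not self.read_pending:
            raise RuntimeError("no read request is pending")
        self.read_pending = False


if __name__ == "__main__":
    t = RequestTracker()
    t.issue_read()
    print(t.can_issue())  # a write must wait for the read response
    t.read_response()
    t.issue_write()
```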
         Request Control
                The processor has the input signal EOK to allow an external agent to control the
                flow of processor requests.
                The processor request cycle sequence is shown in Figure 12-8.
         Outline of Requests
                A read response returns data in response to a processor read request.
                A write request provides data to be written to an internal resource of the processor.
         Request Control
                The processor controls the flow of external requests through the arbitration signals
                EReq and PMaster, as shown in Figure 12-9. The external agent must acquire
                mastership of the System interface before it issues an external request; the external
                agent acquires mastership of the System interface by asserting the EReq signal and
                then waiting for the processor to deassert the PMaster signal for one cycle.
                Mastership of the System interface always returns to the processor when the EReq
                signal becomes inactive after an external request is issued. The processor does not
                accept a subsequent external request until it has completed the current request.
         Request Issuance
                If there are no processor requests pending, the processor decides, based on its
                internal state, whether to accept the external request or to issue a new processor
                request. The processor can issue a new processor request even if the external
                agent is requesting access to the System interface.
                The external agent asserts the EReq signal to indicate that it wishes to begin an
                external request. The processor releases mastership of the System interface by
                deasserting the PMaster signal. An external request can be accepted based on the
                criteria listed below.
1. Read request
2. Read response
Once the processor enters slave state (starting at cycle 5 in Figure 12-11), the
external agent can return the requested data through a read response. The read
response either returns the requested data or, if the requested data could not be
successfully retrieved, indicates on the SysCmd(4:0) bus that the returned data is
erroneous. If the returned data is erroneous, the processor generates a bus error
exception.
Figure 12-11 illustrates a processor read request, coupled with an uncompelled
change to slave state, that occurs as the read request is issued. Figure 12-12 shows
the processor read request delayed by the EOK signal.
The following sequence describes the protocol for a processor read request (the
numbered steps below correspond to Figures 12-11 and 12-12).
1.   The processor is in master state. It outputs a read command to
     SysCmd(4:0) and a read address to SysAD(31:0) to issue a read request.
     After the read request is issued, the processor enters the pending state. Only
     one read request can be pending at a time.
2.   The processor asserts the PValid signal to indicate that the current data of
     SysCmd(4:0) and SysAD(31:0) are valid.
3.   The external agent asserts the EOK signal for two consecutive cycles to
     enable issuance of a processor read request. If the EOK signal is deasserted,
     the issuance cycle of the read request is delayed.
4.   The processor deasserts the PMaster signal in the first cycle after the read
     request is accepted, and makes an uncompelled change to slave state.
5.   The processor releases SysCmd(4:0) and SysAD(31:0) at the same time as
     the PMaster signal is deasserted.
6.   An external agent can drive SysCmd(4:0) and SysAD(31:0) from the first
     cycle after the PMaster signal is deasserted.
[Timing diagrams (two) for the processor read request, SCycles 1 to 12, with the
processor changing from Master to Slave: SClock (internal); SysAD(31:0) (I/O) carrying
Addr, then Hi-Z; SysCmd(4:0) (I/O) carrying Read, then Hi-Z; PValid (output);
EValid (input, held high); PMaster (output); EOK (input). The callouts 1. to 6. mark
the steps listed above. In the second diagram the address cycle occurs later.]
[Timing diagram, SCycles 1 to 12, processor in Master state: SClock (internal);
SysAD(31:0) (I/O) carrying Addr, Data0, Data1, Data2, Data3; SysCmd(4:0) (I/O)
carrying Write, Data, Data, Data, EOD; PValid (output); PMaster (output, held low);
EOK (input). Callouts 1. to 6. are marked.]
[Timing diagram, SCycles 1 to 12, processor in Master state: SClock (internal);
SysAD(31:0) (I/O) carrying Addr, Data0, Data1; SysCmd(4:0) (I/O) carrying Write,
Data, EOD; PValid (output); PMaster (output, held low); EOK (input). Callouts 1. to
6. are marked.]
Figure 12-14 Processor Block Write Request (Write Data Pattern: Dxx)
[Timing diagram, SCycles 1 to 12: SClock (internal); SysAD(31:0) (I/O) carrying Addr,
then Hi-Z; SysCmd(4:0) (I/O) carrying Read, then Hi-Z; PValid (output); EOK (input);
PMaster (output). Callouts 1. and 2. are marked.]
[Timing diagram, SCycles 1 to 12: SClock (internal); SysAD(31:0) (I/O) carrying Addr,
Data, then Addr, Data; SysCmd(4:0) (I/O) carrying Write, EOD, then Write, EOD;
PValid (output); EOK (input); PMaster (output, held low). Callouts 1. and 2. are
marked.]
                   1.   The external agent keeps the EReq signal asserted to issue an
                        external request.
                   2.   When the processor is ready to process the external request, it deasserts the
                        PMaster signal.
                   3.   The processor sets SysAD(31:0) and SysCmd(4:0) to the high-impedance
                        state.
                   4.   The external agent should drive SysAD(31:0) and SysCmd(4:0) one cycle
                        after the PMaster signal has been deasserted.
                   5.   The external agent should deassert the EReq signal in the last cycle
                        of the external request (2 cycles before the external agent enters the slave
                        status), unless it executes another external request.
                   6.   The external agent should set SysAD(31:0) and SysCmd(4:0) to the high-
                        impedance state on completion of the external request.
                   If the external agent has entered the master status as the result of a processor
                   read request, it must always return the read response data. If the external
                   agent has entered the master status by using the EReq signal, it can issue any
                   command and data in accordance with the arbitration process. This means that the
                   processor always satisfies any request from the external agent.
                     Interrupt processing is the only processing that the processor can perform
                     in response to an external write request.
                 Figure 12-22 shows the case where an external write request is issued following a
                 read response to a processor single read request. The following sequence
                 describes the protocol (the numbers in the following description correspond to the
                 numbers in Figure 12-22).
                 1.   The external agent returns response data to the processor single read request.
                 2.   To issue an external request following the read response, assert the EReq
                      signal in the cycle in which EOD is returned. In this case, the PMaster
                      signal is still inactive two cycles after EOD.
                 3.   Because the external agent is in the master status, it can issue the external
                      write request.
                 4.   Deassert the EReq signal by the data cycle of the external write
                      request. In this case, the PMaster signal is asserted two cycles after
                      EOD, and the bus mastership is returned to the processor.
          Figure 12-23    When External Write Request Takes Precedence While Processor
                          Read Request is Pending
                   As shown in this figure, even if the external request interrupts the processor read
                   request, the processor remains in the slave status until the read response data is
                   returned.
Figure 12-25 Successive Single Write Requests (Write Data Pattern: Dxx)
[Timing diagram: processor single write, wait, processor single write.]
      Figure 12-27   Processor Single Read Request Followed by Block Write Request
                     (Write Data Pattern: D)
      Figure 12-28         Successive Processor Write Requests Followed by External Write Request
                           (Write Data Pattern: D)
                   6.   Because the EOK signal is active in one cycle (cycle 9) before the write
                        request of the second Data1, this cycle is the issuance cycle.
                   7.   Because the EOK signal is active in the write request cycle (cycle 10) of the
                        second Data1, the next cycle is a normal data cycle.
         Read Response
               An external agent may transfer data to the processor at the maximum data rate of
               the System interface. The external agent controls the rate at which data is
               transferred to the processor by asserting the EValid signal in each cycle in which
               data is transferred. The processor accepts a cycle as valid only when the EValid signal
               is asserted and the SysCmd(4:0) bus contains a data identifier; thereafter, the
               processor continues to accept data until it receives the data word tagged as the last
               one.
               The EOD data identifier must be attached to the last data word. If it is not, the
               System interface hangs because of a protocol error. The protocol error state is
               indicated by the PReq signal, which toggles with a period of two SClock cycles in
               synchronization with MasterClock. In this case, the processor should be
               reset and initialized.
         Write Request
               The rate at which the processor transfers data to an external agent is
               programmable through the EP bit of the Config register (the setting at reset is D).
               Data patterns are defined using the letters D and x, where D indicates a
               data cycle and x indicates an unused cycle. For example, a Dxx data pattern
               indicates a data rate of one word every three cycles.
               The VR4300 has two data transfer rates: D and Dxx. During an x cycle, while the
               processor is in the master status, it continues to output the data of the
               immediately preceding D cycle.
               A processor block write request with a Dxx data pattern (one word every three
               cycles) is shown in Figure 12-14.
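
               The effect of the data pattern on the timing of a block write can be sketched
               in C as follows. This is an illustration only: the type and function names are
               hypothetical, only data cycles are counted (the address cycle and any wait for
               EOK are ignored), and the unused cycles after the last word are assumed not to
               count toward the transfer.

```c
/* Hypothetical names; the manual defines only the patterns D and Dxx.
 * The enum value is the number of cycles per data word. */
typedef enum { PATTERN_D = 1, PATTERN_DXX = 3 } data_pattern_t;

/* Cycle offset (0-based, relative to the first data cycle) at which
 * word `index` of a block write is first driven. */
unsigned data_cycle_offset(data_pattern_t pattern, unsigned index)
{
    return index * (unsigned)pattern;
}

/* Number of cycles from the first data word through the last data
 * word, inclusive. Unused (x) cycles after the last word are not
 * counted, which is an assumption of this sketch. */
unsigned data_span_cycles(data_pattern_t pattern, unsigned words)
{
    if (words == 0)
        return 0;
    return (words - 1) * (unsigned)pattern + 1;
}
```

               With these definitions, a two-word block occupies two cycles with the D pattern
               and four cycles with the Dxx pattern.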
                      Bit                                     Meaning
                   SysCmd4            Attributes.
                                       0: Command (address)
                                       1: Data identifier
             SysCmd(2:0) are specific to each type of request and are defined in each of the
             following sections.
Tables 12-4 through 12-6 list the encodings of the SysCmd(2:0) attribute bits for
read requests.
          Bit                                    Meaning
 SysCmd2                  Read Attributes.
                           0: Single Read
                           1: Block Read
          Bit                                    Meaning
 SysCmd(1:0)              Read Block Size.
                           0: 2 words
                           1: 4 words (D-cache only)
                           2: 8 words (I-cache only)
                           3: Reserved
          Bit                                    Meaning
 SysCmd(1:0)              Read Data Size.
                           0: 1 byte valid (Byte)
                           1: 2 bytes valid (Halfword)
                           2: 3 bytes valid
                           3: 4 bytes valid (Word)
                       Bit                                     Meaning
              SysCmd2                  Write Attributes.
                                        0: Single Write
                                        1: Block Write
                       Bit                                     Meaning
              SysCmd(1:0)              Write Block Size.
                                        0: 2 words
                                        1: 4 words (for D-cache only)
                                        2: 8 words (for I-cache only) (for test)
                                        3: Reserved
                       Bit                                     Meaning
              SysCmd(1:0)              Write Data Size.
                                        0: 1 byte valid (Byte)
                                        1: 2 bytes valid (Halfword)
                                        2: 3 bytes valid
                                        3: 4 bytes valid (Word)
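
             The SysCmd(2:0) encodings tabulated above for read and write commands can be
             decoded as in the following sketch. The structure and function names are
             hypothetical. Only SysCmd(2:0) is examined; the caller is assumed to have
             already established that the value on SysCmd(4:0) is a command.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    int      is_block;  /* SysCmd2: 0 = single, 1 = block               */
    unsigned words;     /* block size in words, 0 for a single access   */
    unsigned bytes;     /* valid bytes of a single access, 0 for block  */
} request_attr_t;

/* Decodes SysCmd(2:0) of a read or write command.
 * Returns 0 on success, -1 for the reserved block size encoding. */
int decode_request_attr(uint8_t syscmd, request_attr_t *out)
{
    unsigned size = syscmd & 0x3u;            /* SysCmd(1:0) */

    out->is_block = (syscmd >> 2) & 0x1u;     /* SysCmd2     */
    out->words = 0;
    out->bytes = 0;

    if (out->is_block) {
        if (size == 3)
            return -1;                        /* 3: Reserved */
        out->words = 2u << size;              /* 0: 2, 1: 4, 2: 8 words */
    } else {
        out->bytes = size + 1;                /* 0: 1 byte ... 3: 4 bytes */
    }
    return 0;
}
```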
                 Bit                                  Meaning
             SysCmd3       Last Data Element Indication.
                            0: Last data element, or data element on single transfer
                            1: Not the last data element
             SysCmd2       Reserved
             SysCmd1       Reserved: Error Data Indication.
                           The processor outputs 0 (error free).
             SysCmd0       Reserved: Data check enabled
                           Processor outputs 1 (data check disabled).
                 Bit                                  Meaning
             SysCmd3       Last Data Element Indication.
                            0: Last data element or data element on single transfer
                            1: Not the last data element
             SysCmd2       Response Data Indication.
                            0: Data is response data
                            1: Data is not response data
             SysCmd1       Error Data Indication.
                            0: Data is error free
                            1: Data is erroneous
             SysCmd0       Reserved: Data Checking Enable.
                           Processor ignores this bit. (external agent transfers 1)
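
             As an illustration, an external-agent data identifier could be decoded as
             follows. The structure and function names are hypothetical; the bit
             assignments are those of the table above, and SysCmd4 = 1 is used to
             distinguish a data identifier from a command.

```c
#include <stdint.h>

/* Hypothetical view of a data identifier driven by the external agent. */
typedef struct {
    int last;      /* SysCmd3 == 0: last data element (or single transfer) */
    int response;  /* SysCmd2 == 0: data is response data                  */
    int error;     /* SysCmd1 == 1: data is erroneous                      */
} data_id_t;

/* Returns 0 and fills *out if syscmd is a data identifier (SysCmd4 == 1),
 * or -1 if it is a command. SysCmd0 is ignored, as the processor
 * ignores it. */
int decode_data_id(uint8_t syscmd, data_id_t *out)
{
    if (((syscmd >> 4) & 1u) == 0)
        return -1;
    out->last     = ((syscmd >> 3) & 1u) == 0;
    out->response = ((syscmd >> 2) & 1u) == 0;
    out->error    = ((syscmd >> 1) & 1u) == 1;
    return 0;
}
```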
      Sequential Ordering
             An instruction cache read request returns data in sequential order, starting with the
             first word (DW0) of the 8-word block, no matter which word is requested.
      Subblock Ordering
             When a read request is issued to the data cache, data is returned in the
             following order: first the low-order word of the doubleword that contains the
             word required by the CPU, then the high-order word of that doubleword, then the
             low-order word of the remaining doubleword, and finally its high-order word
             (for details, refer to 12.2.1 Physical Addresses).
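
             The two orderings can be sketched as follows. The function names are
             hypothetical, and "low-order word" is taken here to mean the even-numbered
             word of a doubleword, which is an assumption of this sketch.

```c
/* Fills order[0..3] with the word indices (0 to 3 within the 4-word
 * data cache block) in the order in which they are returned for a
 * request that targets word `requested` (subblock ordering). */
void dcache_return_order(unsigned requested, unsigned order[4])
{
    unsigned first_dw = (requested >> 1) & 1u;  /* doubleword holding the word */
    unsigned other_dw = first_dw ^ 1u;          /* the remaining doubleword    */

    order[0] = 2 * first_dw;        /* low-order word of the first doubleword  */
    order[1] = 2 * first_dw + 1;    /* high-order word of the first doubleword */
    order[2] = 2 * other_dw;        /* low-order word of the other doubleword  */
    order[3] = 2 * other_dw + 1;    /* high-order word of the other doubleword */
}

/* Instruction cache reads return the 8 words sequentially, starting
 * with the first word, whichever word was requested. */
void icache_return_order(unsigned order[8])
{
    for (unsigned i = 0; i < 8; i++)
        order[i] = i;
}
```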
13
Caution When the JTAG interface is not used, keep the JTCK signal low.
[Block diagram: JTDI, JTDO, JTMS, and JTCK pins; TAP controller; Instruction register (bits 2 to 0); Bypass register (bit 0); Boundary-scan register (bits 56 to 0).]
             The Instruction register has two stages: a shift register and a parallel output latch.
             Refer to 13.3.7 Controller States for details. Figure 13-3 shows the format of the
             Instruction register.
             OE1 (jSysADEn) is the JTAG output enable bit for all outputs of the processor.
             Output is enabled when this bit is set to 1 (default state).
             The remaining 57 bits correspond to 57 signal pads. Outputs are enabled when
             this bit is set to 1.
             Table 13-2 lists the scan order of these scan bits.
                     The JTDI and JTMS signals are sampled in synchronization with the rising edge
                     of the JTCK signal. The state of the JTDO signal changes in synchronization with
                     the falling edge of the JTCK signal.
         Capture IR State
                The value 0x4 is loaded into the shift register stage.
         Shift IR State
                Data is loaded serially into the shift register stage of the Instruction register from
                the JTDI input pin, and the MSB of the Instruction register’s shift register stage
                is shifted out to the JTDO pin.
      Update IR State
                The current data in the shift register stage is loaded into the parallel output latch.
                     The VR4200 generates the update function at the next rising edge. In
                     other words, it is 1/2 JTCK cycle later than the VR4300.
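
                The Capture IR, Shift IR, and Update IR states can be modeled as in the
                following sketch. The type and function names are hypothetical. The sketch
                follows the text in shifting the MSB out to JTDO; that the incoming JTDI bit
                enters at the opposite (LSB) end is an assumption, and clock edge timing is
                not modeled.

```c
#include <stdint.h>

/* Hypothetical model of the 3-bit Instruction register. */
typedef struct {
    uint8_t shift;  /* shift register stage (bits 2:0) */
    uint8_t latch;  /* parallel output latch           */
} jtag_ir_t;

/* Capture IR: the value 0x4 is loaded into the shift register stage. */
void ir_capture(jtag_ir_t *ir)
{
    ir->shift = 0x4;
}

/* Shift IR: one bit enters from JTDI and the MSB leaves on JTDO.
 * Returns the bit presented on JTDO. */
int ir_shift(jtag_ir_t *ir, int jtdi)
{
    int jtdo = (ir->shift >> 2) & 1;
    ir->shift = (uint8_t)(((ir->shift << 1) | (jtdi & 1)) & 0x7);
    return jtdo;
}

/* Update IR: the shift register stage is copied to the output latch. */
void ir_update(jtag_ir_t *ir)
{
    ir->latch = ir->shift;
}
```

                In this model, three Shift IR steps after Capture IR present the bits 1, 0, 0
                on JTDO, which is the captured value 0x4 read MSB first.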
14
[Block diagrams of the interrupt signals. Labels: NMI; Interrupt Request Register (6); SClock; SysAD(4:0) Interrupt Set Value; SysAD(20:16) Write Enables; SysAD6 Nonmaskable Interrupt; SysAD22; Interrupt Register (4:0), Int4 to Int0; Cause Register (15:8), IP0 to IP7 (software interrupts, external normal interrupts, timer interrupt); Status Register SR(15:8), IM0 to IM7; Status Register SR0, IE; VR4300 Interrupt. The diagrams refer to Figures 14-1, 14-3, and 14-4.]
15
       One of the objectives of the design of the VR4300 processor is to minimize power
       consumption in order to make the processor suitable for use in battery-operated
       systems, as well as in environments where low power consumption and heat
       dissipation are desirable.
       To accomplish this, the VR4300 has power management features that dynamically
       reduce power consumption. These features are described in this chapter.
15.1 Features
             The VR4300 has three processor-level operation modes: normal, low power (100
             MHz model of the VR4300 and the VR4305 only), and power off.
             These modes allow processor power consumption to be managed by system logic.
             Generally a notebook system has many different levels of power management. It
             is the responsibility of system logic to switch the processor between the three
             available modes in order to reflect the power management state of the system.
16
         This chapter provides a detailed description of the function of each VR4300 CPU
         instruction in both 32- and 64-bit modes. The instructions are listed in
         alphabetical order.
         For details of the FPU instruction set, refer to Chapter 17 FPU Instruction Set
         Details.
   Symbol                                           Meaning
     ←          Assignment.
      ||        Bit string concatenation.
     x^y        Replication of bit value x into a y-bit string. x is always a single-bit value.
    x_y...z     Selection of bits y through z of bit string x.
                Little-endian bit notation is always used. If y is less than z, this expression is
                an empty (zero length) bit string.
     +          2’s complement or floating-point addition.
     –          2’s complement or floating-point subtraction.
     *          2’s complement or floating-point multiplication.
    div         2’s complement integer division.
    mod         2’s complement remainder.
      /         Floating-point division.
     <          2’s complement less than comparison.
    and         Bit-wise logical AND.
     or         Bit-wise logical OR.
    xor         Bit-wise logical XOR.
    nor         Bit-wise logical NOR.
   GPR[x]       General Purpose Register x. The content of GPR[0] is always zero.
                Attempts to alter the content of GPR[0] have no effect.
   CPR[z,x]     Coprocessor unit z, general purpose register x.
   CCR[z,x]     Coprocessor unit z, control register x.
   COC[z]       Coprocessor unit z, condition signal.
BigEndianMem    Endian mode as configured at reset (0 → Little, 1 → Big).
                Specifies the endianness of the memory interface (see LoadMemory and
                StoreMemory), and the endianness of Kernel and Supervisor modes.
ReverseEndian   Signal to reverse the endianness of load and store instructions.
                This feature is available in User mode only, and is effected by setting the RE
                bit of the Status register. Thus, ReverseEndian is set to 1 only when the RE
                bit is set in User mode.
BigEndianCPU    The endianness for load and store instructions (0 → Little, 1 → Big).
                In User mode, this endianness is reversed by setting the RE bit. Thus,
                BigEndianCPU is calculated as BigEndianMem XOR ReverseEndian.
    LLbit       Bit indicating the synchronization state of instructions. Set by the LL
                instruction, cleared by the ERET instruction, and read by the SC instruction.
    T+i:        Indicates the time steps between operations. The statements within a time
                step are defined to be executed in sequential order (the execution
                order may be changed by conditional branches and loops).
                Operations marked T+i: are executed at instruction cycle i from the
                start of execution of the instruction. Thus, an instruction that starts at time j
                executes the operations marked T+i: at time i + j. The order of execution is not
                defined for instructions or operations that execute at the same time.
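
The relationship between BigEndianMem, ReverseEndian, and BigEndianCPU defined above can be
written out as a short C sketch. The function names are hypothetical; the RE bit and the
operating mode are passed in as plain flags rather than read from the Status register.

```c
#include <stdbool.h>

/* ReverseEndian is 1 only when the RE bit is set in User mode. */
bool reverse_endian(bool re_bit, bool user_mode)
{
    return re_bit && user_mode;
}

/* BigEndianCPU = BigEndianMem XOR ReverseEndian. */
bool big_endian_cpu(bool big_endian_mem, bool re_bit, bool user_mode)
{
    return big_endian_mem != reverse_endian(re_bit, user_mode);
}
```

For example, with a big-endian memory interface, setting the RE bit makes User mode loads and
stores little-endian, while Kernel and Supervisor modes remain big-endian.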
Example #1:
Example #2:
(immediate_15)^16 || immediate_15...0
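
The expression above replicates bit 15 of the immediate sixteen times and concatenates the
result with the 16-bit immediate, which is a sign extension to 32 bits. The following sketch
shows one way to express the replication, selection, and concatenation operators in C. The
function names are hypothetical and not part of the notation.

```c
#include <stdint.h>

/* Replication: bit value x (0 or 1) repeated to form a y-bit string,
 * for 1 <= y <= 32. */
uint32_t replicate_bit(unsigned x, unsigned y)
{
    uint32_t ones = (y >= 32) ? 0xFFFFFFFFu : ((1u << y) - 1u);
    return (x & 1u) ? ones : 0u;
}

/* Selection: bits y down to z of x (y >= z), right-aligned. */
uint32_t select_bits(uint32_t x, unsigned y, unsigned z)
{
    unsigned width = y - z + 1;
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (x >> z) & mask;
}

/* Bit 15 of the immediate replicated 16 times, concatenated with
 * bits 15 to 0 of the immediate. */
uint32_t sign_extend16(uint16_t immediate)
{
    uint32_t upper = replicate_bit(select_bits(immediate, 15, 15), 16);
    return (upper << 16) | select_bits(immediate, 15, 0);
}
```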
         Function                                           Meaning
AddressTranslation              Uses the TLB to find the physical address for a virtual address.
                                If the TLB does not contain the requested translation, this
                                function fails, and a TLB miss exception occurs.
LoadMemory                      Searches the cache and main memory for the contents of the
                                specified data length stored at the specified physical address.
                                If the specified data length is less than a word, the contents of the
                                data position determined by the endian mode and reverse endian
                                mode of the processor are loaded. The low-order 3 bits of the
                                address and the access type field determine the data position in
                                a data word. The data is loaded to the cache if the cache is
                                enabled.
StoreMemory                     Searches the cache, write buffer, and main memory to store the
                                contents of the specified data length to the specified physical
                                address. If the specified data length is less than a word, the
                                contents of the data position determined by the endian mode and
                                reverse endian mode of the processor are stored. The low-order
                                3 bits of the address and the access type field determine the
                                data position in a data word.
             The Access Type field indicates the size of the data to be loaded or stored.
             Regardless of access type or byte order (endianness), the address specifies the byte
             which has the smallest byte address in the field accessed. For a big-endian system,
             this is the leftmost byte and contains the sign for a 2’s complement value; for a
             little-endian system, this is the rightmost byte.
             The bytes within the accessed doubleword can be determined directly from the
             access type and the low-order three bits of the address.
      Format:
               ADD rd, rs, rt
      Description:
               The contents of general purpose register rs and the contents of general purpose
               register rt are added, and the result is stored in general purpose register rd. In
               64-bit mode, the operands must be sign-extended 32-bit values.
               An integer overflow exception occurs if the carries out of bits 30 and 31 differ (2’s
               complement overflow). The contents of destination register rd are not modified
               when an integer overflow exception occurs.
      Operation:
      32   T:      GPR[rd] ← GPR[rs] + GPR[rt]
      Exceptions:
               Integer overflow exception
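
               The overflow condition can be illustrated with the following C sketch of the
               32-bit operation. The function name and the use of a boolean return value to
               stand for the exception are assumptions of the sketch; register values are
               treated as 32-bit patterns.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of the 32-bit ADD operation. Returns true and writes *rd on
 * success. Returns false (standing for the integer overflow exception)
 * and leaves *rd unmodified if the carries out of bits 30 and 31 differ. */
bool add32(uint32_t rs, uint32_t rt, uint32_t *rd)
{
    /* Carry out of bit 30: add the low 31 bits and look at bit 31. */
    uint32_t carry30 = ((rs & 0x7FFFFFFFu) + (rt & 0x7FFFFFFFu)) >> 31;
    /* Carry out of bit 31: add in 64 bits and look at bit 32. */
    uint32_t carry31 = (uint32_t)(((uint64_t)rs + rt) >> 32);

    if (carry30 != carry31)
        return false;

    *rd = rs + rt;  /* unsigned addition wraps modulo 2^32 */
    return true;
}
```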
         Format:
                ADDI rt, rs, immediate
         Description:
                The 16-bit immediate is sign-extended and added to the contents of general
                purpose register rs, and the result is stored in general purpose register rt. In
                64-bit mode, the operand must be a sign-extended 32-bit value.
                An integer overflow exception occurs if the carries out of bits 30 and 31 differ (2’s
                complement overflow). The contents of destination register rt are not modified
                when an integer overflow exception occurs.
         Operation:
         32   T:      GPR[rt] ← GPR[rs] + (immediate_15)^16 || immediate_15...0
         Exceptions:
                Integer overflow exception
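
                A minimal C sketch of the 32-bit operation follows. The function name is
                hypothetical, and a false return value stands for the integer overflow
                exception. The sum is formed in 64 bits so that the range check is exact.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of the 32-bit ADDI operation: the 16-bit immediate is
 * sign-extended and added to rs. Returns false on integer overflow,
 * in which case *rt is left unmodified. */
bool addi32(int32_t rs, uint16_t immediate, int32_t *rt)
{
    /* Sign-extend the immediate without relying on narrowing casts. */
    int32_t imm = (immediate & 0x8000u) ? (int32_t)immediate - 0x10000
                                        : (int32_t)immediate;
    int64_t wide = (int64_t)rs + imm;   /* exact in 64 bits */

    if (wide > INT32_MAX || wide < INT32_MIN)
        return false;

    *rt = (int32_t)wide;
    return true;
}
```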
      Format:
              ADDIU rt, rs, immediate
      Description:
              The 16-bit immediate is sign-extended and added to the contents of general
              purpose register rs, and the result is stored in general purpose register rt. No
              integer overflow exception occurs under any circumstance. In 64-bit mode, the
              operand must be a sign-extended 32-bit value.
              The only difference between this instruction and the ADDI instruction is that
              the ADDIU instruction never causes an integer overflow exception.
Operation:
      Exceptions:
              None
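
As an illustration of the ADDIU description, the following Python sketch sign-extends the 16-bit immediate and lets the 32-bit sum wrap without signaling overflow. The helper names sign_extend16 and addiu32 are assumptions introduced for this example only.

```python
def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a 2's complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def addiu32(rs: int, imm: int) -> int:
    """Model a 32-bit ADDIU: sign-extend the immediate, add, wrap silently."""
    return (rs + sign_extend16(imm)) & 0xFFFFFFFF
```

Note that the immediate is sign-extended even though the mnemonic ends in U: an immediate of 0xFFFF subtracts 1 from rs.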
        SPECIAL            rs             rt              rd           0                ADDU
       000000                                                        00000             100001
           6               5               5              5            5                  6
         Format:
                  ADDU rd, rs, rt
         Description:
                  The contents of general purpose register rs and the contents of general purpose
                  register rt are added, and the result is stored in general purpose register rd. No
                  integer overflow exception occurs under any circumstance. In 64-bit mode, the
                  operands must be sign-extended, 32-bit values.
                  The only difference between this instruction and the ADD instruction is that
                  the ADDU instruction never causes an integer overflow exception.
         Operation:
         32   T:      GPR[rd] ← GPR[rs] + GPR[rt]
         Exceptions:
                  None
      SPECIAL           rs             rt             rd           0                 AND
     000000                                                      00000             100100
         6               5              5             5            5                  6
       Format:
                AND rd, rs, rt
       Description:
                The contents of general purpose register rs are combined with the contents of
                general purpose register rt in a bit-wise logical AND operation. The result is
                stored in general purpose register rd.
Operation:
       Exceptions:
                None
        ANDI             rs            rt                        immediate
       001100
         6                5              5                          16
         Format:
                ANDI rt, rs, immediate
         Description:
                The 16-bit immediate is zero-extended and combined with the contents of general
                purpose register rs in a bit-wise logical AND operation. The result is stored in
                general purpose register rt.
Operation:
         Exceptions:
                None
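
The zero-extension described for ANDI contrasts with the sign-extension used by ADDI and ADDIU. A minimal Python sketch follows; the function name andi is an assumption introduced here.

```python
def andi(rs: int, imm: int) -> int:
    """Model ANDI: the 16-bit immediate is zero-extended before the AND,
    so every bit of the result above bit 15 is always 0."""
    return rs & (imm & 0xFFFF)
```

With rs = 0xFFFFFFFF and an immediate of 0x8000, the result is 0x00008000, not 0xFFFF8000, because the upper bits of the extended immediate are zeros.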
     Format:
                BCzF offset
     Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. If CPz’s
                condition signal (CpCond), as sampled during the previous instruction execution,
                is false, then the program branches to the branch address with a delay of one
                instruction.
                Because the condition signal is sampled during the previous instruction execution,
                there must be at least one instruction between this instruction and a coprocessor
                instruction that changes the condition signal.
     Operation:
     32     T–1: condition ← not COC[z]
            T:   target ← (offset_15)^14 || offset || 0^2
            T+1: if condition then
                      PC ← PC + target
                 endif
     64     T–1: condition ← not COC[z]
            T:   target ← (offset_15)^46 || offset || 0^2
            T+1: if condition then
                      PC ← PC + target
                 endif
                * Refer to the table Opcode Bit Encoding on the next page, or 16.7 CPU
                  Instruction Opcode Bit Encoding.
         Exceptions:
                Coprocessor unusable exception
              Bit #  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 ... 0
              BC1F    0  1  0  0  0  1  0  1  0  0  0  0  0  0  0  0
              BC2F    0  1  0  0  1  0  0  1  0  0  0  0  0  0  0  0
BCzFL                  Branch On Coprocessor z False Likely                  BCzFL
31         26 25              21 20         16 15                                                0
     Format:
                BCzFL offset
     Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
                CPz’s condition signal (CpCond), as sampled during the previous instruction
                execution, is false, the program branches to the branch address with a delay of one
                instruction.
                If it does not branch, the instruction in the branch delay slot is discarded.
                Because the condition signal is sampled during the previous instruction execution,
                there must be at least one instruction between this instruction and a coprocessor
                instruction that changes the condition signal.
         Operation:
         32     T–1: condition ← not COC[z]
                T:   target ← (offset_15)^14 || offset || 0^2
                T+1: if condition then
                          PC ← PC + target
                     else
                          NullifyCurrentInstruction
                     endif
         64     T–1: condition ← not COC[z]
                T:   target ← (offset_15)^46 || offset || 0^2
                T+1: if condition then
                          PC ← PC + target
                     else
                          NullifyCurrentInstruction
                     endif
         Exceptions:
                 Coprocessor unusable exception
              Bit #  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 ... 0
              BC0FL   0  1  0  0  0  0  0  1  0  0  0  0  0  0  1  0
              BC1FL   0  1  0  0  0  1  0  1  0  0  0  0  0  0  1  0
              BC2FL   0  1  0  0  1  0  0  1  0  0  0  0  0  0  1  0
              Coprocessor Number: bits 27 and 26
     Format:
                BCzT offset
     Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
                CPz’s condition signal (CpCond) sampled during the previous instruction
                execution is true, then the program branches to the branch address with a delay of
                one instruction.
                Because the condition signal is sampled during the previous instruction execution,
                there must be at least one instruction between this instruction and a coprocessor
                instruction that changes the condition signal.
     Operation:
     32     T–1: condition ← COC[z]
            T:   target ← (offset_15)^14 || offset || 0^2
            T+1: if condition then
                      PC ← PC + target
                 endif
     64     T–1: condition ← COC[z]
            T:   target ← (offset_15)^46 || offset || 0^2
            T+1: if condition then
                      PC ← PC + target
                 endif
         Exceptions:
                Coprocessor unusable exception
              Bit #  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 ... 0
              BC1T    0  1  0  0  0  1  0  1  0  0  0  0  0  0  0  1
              BC2T    0  1  0  0  1  0  0  1  0  0  0  0  0  0  0  1
              Coprocessor Number: bits 27 and 26
     Format:
                BCzTL offset
     Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
                CPz’s condition signal (CpCond), as sampled during the previous instruction
                execution, is true, the program branches to the branch address with a delay of one
                instruction.
                If it does not branch, the instruction in the branch delay slot is discarded.
                Because the condition signal is sampled during the previous instruction execution,
                there must be at least one instruction between this instruction and a coprocessor
                instruction that changes the condition signal.
     Operation:
     32     T–1: condition ← COC[z]
            T:   target ← (offset_15)^14 || offset || 0^2
            T+1: if condition then
                      PC ← PC + target
                 else
                      NullifyCurrentInstruction
                 endif
     64     T–1: condition ← COC[z]
            T:   target ← (offset_15)^46 || offset || 0^2
            T+1: if condition then
                      PC ← PC + target
                 else
                      NullifyCurrentInstruction
                 endif
         Exceptions:
                Coprocessor unusable exception
              Bit #  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 ... 0
              BC0TL   0  1  0  0  0  0  0  1  0  0  0  0  0  0  1  1
              BC1TL   0  1  0  0  0  1  0  1  0  0  0  0  0  0  1  1
              BC2TL   0  1  0  0  1  0  0  1  0  0  0  0  0  0  1  1
              Coprocessor Number: bits 27 and 26
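
The opcode bit encoding tables for BCzF, BCzFL, BCzT, and BCzTL follow one pattern: bits 31..26 hold 0100zz (zz being the coprocessor number), bits 25..21 hold 01000, and bits 20..16 select the branch condition. The Python sketch below assembles an instruction word from those fields. The function name encode_bcz and the dictionary BRANCH_COND are assumptions introduced for illustration; the bit values are read from the tables.

```python
BC_SUBOP = 0b01000  # bits 25..21, common to all BCz branches in the tables

# bits 20..16, one value per branch variant (read from the tables)
BRANCH_COND = {"F": 0b00000, "T": 0b00001, "FL": 0b00010, "TL": 0b00011}

def encode_bcz(z: int, cond: str, offset: int) -> int:
    """Build the 32-bit word for BCzF/BCzT/BCzFL/BCzTL (hypothetical helper)."""
    if z not in (0, 1, 2):
        raise ValueError("coprocessor number must be 0, 1, or 2")
    opcode = 0b010000 | z  # 0100zz: coprocessor number in bits 27..26
    return ((opcode << 26) | (BC_SUBOP << 21)
            | (BRANCH_COND[cond] << 16) | (offset & 0xFFFF))
```

For instance, the upper halfword produced for BC1F is 0x4500 and for BC2TL is 0x4903, matching the bit rows listed for those instructions.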
       BEQ            rs             rt                             offset
     000100
        6              5              5                             16
       Format:
              BEQ rs, rt, offset
       Description:
              A branch address is calculated from the sum of the address of the instruction in the
              delay slot and the 16-bit offset, shifted two bits left and sign-extended. The
              contents of general purpose register rs and the contents of general purpose register
              rt are compared. If the two registers are equal, then the program branches to the
              branch address with a delay of one instruction.
       Operation:
       32   T:   target ← (offset_15)^14 || offset || 0^2
                 condition ← (GPR[rs] = GPR[rt])
            T+1: if condition then
                      PC ← PC + target
                 endif
       64   T:   target ← (offset_15)^46 || offset || 0^2
                 condition ← (GPR[rs] = GPR[rt])
            T+1: if condition then
                      PC ← PC + target
                 endif
       Exceptions:
              None
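
The branch address calculation shared by the branch instructions (delay slot address plus the sign-extended offset shifted left two bits) can be sketched in Python as follows. This models 32-bit mode only; the name branch_target and the choice to pass the address of the branch instruction itself are assumptions made for this example.

```python
def branch_target(branch_pc: int, offset: int) -> int:
    """Return the address reached by a taken branch located at branch_pc."""
    offset &= 0xFFFF
    # Sign-extend the 16-bit offset field.
    signed = offset - 0x10000 if offset & 0x8000 else offset
    delay_slot_pc = branch_pc + 4  # the sum is relative to the delay slot
    return (delay_slot_pc + (signed << 2)) & 0xFFFFFFFF
```

An offset of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself, and an offset of 1 skips the instruction that follows the delay slot.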
        BEQL             rs              rt                            offset
       010100
          6               5               5                            16
         Format:
                 BEQL rs, rt, offset
         Description:
                 A branch address is calculated from the sum of the address of the instruction in the
                 delay slot and the 16-bit offset, shifted two bits left and sign-extended. The
                 contents of general purpose register rs and the contents of general purpose register
                 rt are compared. If the two registers are equal, the program branches to the branch
                 address with a delay of one instruction.
                 If it does not branch, the instruction in the branch delay slot is discarded.
         Operation:
         32     T:   target ← (offset_15)^14 || offset || 0^2
                     condition ← (GPR[rs] = GPR[rt])
                T+1: if condition then
                          PC ← PC + target
                     else
                          NullifyCurrentInstruction
                     endif
         64     T:   target ← (offset_15)^46 || offset || 0^2
                     condition ← (GPR[rs] = GPR[rt])
                T+1: if condition then
                          PC ← PC + target
                     else
                          NullifyCurrentInstruction
                     endif
         Exceptions:
                 None
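
The difference between an ordinary branch and a Likely branch lies only in the treatment of the delay slot when the branch is not taken. The following Python sketch lists the instruction addresses executed after the branch in each case. It is a simplified model under stated assumptions: the function name next_pcs is hypothetical, and pipeline timing is ignored.

```python
def next_pcs(pc: int, target: int, taken: bool, likely: bool) -> list:
    """Addresses executed after a branch at pc (simplified model)."""
    delay_slot = pc + 4
    if taken:
        # Taken: the delay slot executes, then control moves to the target.
        return [delay_slot, target]
    if likely:
        # Likely branch not taken: the delay slot instruction is nullified.
        return [pc + 8]
    # Ordinary branch not taken: the delay slot executes normally.
    return [delay_slot, pc + 8]
```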
      Format:
              BGEZ rs, offset
      Description:
              A branch address is calculated from the sum of the address of the instruction in the
              delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
              contents of general purpose register rs are equal to or larger than 0, then the
              program branches to the branch address with a delay of one instruction.
      Operation:
      32   T:   target ← (offset_15)^14 || offset || 0^2
                condition ← (GPR[rs]_31 = 0)
           T+1: if condition then
                     PC ← PC + target
                endif
      64   T:   target ← (offset_15)^46 || offset || 0^2
                condition ← (GPR[rs]_63 = 0)
           T+1: if condition then
                     PC ← PC + target
                endif
      Exceptions:
              None
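
The BGEZ condition tests only the sign bit of rs: bit 31 in 32-bit mode and bit 63 in 64-bit mode. A minimal Python sketch, with the hypothetical name bgez_condition and a bits parameter standing in for the operating mode:

```python
def bgez_condition(value: int, bits: int = 32) -> bool:
    """True when the register value is >= 0, i.e. its sign bit is 0."""
    value &= (1 << bits) - 1
    sign_bit = (value >> (bits - 1)) & 1
    return sign_bit == 0
```

The same bit pattern can give different results in the two modes: 0x80000000 is negative as a 32-bit value, but as a 64-bit value that has not been sign-extended its bit 63 is 0, so the branch would be taken.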
         Format:
                BGEZAL rs, offset
         Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended.
                Unconditionally, the address of the instruction next to the delay slot is stored in
                the link register, r31. If the contents of general purpose register rs are equal to or
                larger than 0, then the program branches to the branch address, with a delay of one
                instruction.
                 General purpose register r31 should generally not be specified as general purpose
                 register rs, because storing the link address destroys the contents of rs and the
                 instruction may then not be reexecutable. An attempt to execute such an
                 instruction does not cause an exception, however.
         Operation:
         32    T:   target ← (offset_15)^14 || offset || 0^2
                    condition ← (GPR[rs]_31 = 0)
                    GPR[31] ← PC + 8
               T+1: if condition then
                         PC ← PC + target
                    endif
         64    T:   target ← (offset_15)^46 || offset || 0^2
                    condition ← (GPR[rs]_63 = 0)
                    GPR[31] ← PC + 8
               T+1: if condition then
                         PC ← PC + target
                    endif
         Exceptions:
                None
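
The warning against using r31 as rs can be made concrete with a small Python model of BGEZAL in 32-bit mode. This is a sketch under stated assumptions: the function name bgezal, the list-based register file, and passing the address of the branch instruction as pc are all introduced here for illustration.

```python
def bgezal(gpr: list, pc: int, rs: int, offset: int) -> int:
    """Execute BGEZAL at address pc; return the address executed after
    the delay slot. gpr is a 32-entry list of 32-bit register values."""
    offset &= 0xFFFF
    signed = offset - 0x10000 if offset & 0x8000 else offset
    # The condition is evaluated from the old value of rs ...
    condition = ((gpr[rs] >> 31) & 1) == 0
    # ... and the link address is then written unconditionally.
    gpr[31] = (pc + 8) & 0xFFFFFFFF
    if condition:
        return (pc + 4 + (signed << 2)) & 0xFFFFFFFF
    return (pc + 8) & 0xFFFFFFFF
```

If rs is r31 and r31 initially holds a negative value, the first execution does not branch but overwrites r31 with the (here non-negative) link address. Executing the same instruction again, for example when it is restarted after an exception, then takes the branch, which is why the instruction is not reexecutable in that case.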
       Format:
              BGEZALL rs, offset
       Description:
              A branch address is calculated from the sum of the address of the instruction in the
              delay slot and the 16-bit offset, shifted two bits left and sign-extended.
              Unconditionally, the address of the instruction next to the delay slot is stored in
              the link register, r31. If the contents of general purpose register rs are equal to or
              larger than 0, then the program branches to the branch address, with a delay of one
              instruction. If it does not branch, the instruction in the branch delay slot is
              discarded.
              General purpose register r31 should generally not be specified as general purpose
              register rs, because storing the link address destroys the contents of rs and the
              instruction may then not be reexecutable. An attempt to execute such an
              instruction does not cause any exception, however.
       Operation:
       32   T:   target ← (offset_15)^14 || offset || 0^2
                 condition ← (GPR[rs]_31 = 0)
                 GPR[31] ← PC + 8
            T+1: if condition then
                      PC ← PC + target
                 else
                      NullifyCurrentInstruction
                 endif
       64   T:   target ← (offset_15)^46 || offset || 0^2
                 condition ← (GPR[rs]_63 = 0)
                 GPR[31] ← PC + 8
            T+1: if condition then
                      PC ← PC + target
                 else
                      NullifyCurrentInstruction
                 endif
       Exceptions:
              None
BGEZL             Branch On Greater Than Or Equal To Zero Likely             BGEZL
  31           26 25           21 20           16 15                                             0
         Format:
                BGEZL rs, offset
         Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
                contents of general purpose register rs are equal to or larger than 0, then the
                program branches to the branch address, with a delay of one instruction.
                If it does not branch, the instruction in the branch delay slot is discarded.
         Operation:
         32    T:   target ← (offset_15)^14 || offset || 0^2
                    condition ← (GPR[rs]_31 = 0)
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         64    T:   target ← (offset_15)^46 || offset || 0^2
                    condition ← (GPR[rs]_63 = 0)
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         Exceptions:
                None
      BGTZ             rs           0                              offset
     000111                       00000
       6                5            5                             16
      Format:
             BGTZ rs, offset
      Description:
             A branch address is calculated from the sum of the address of the instruction in the
             delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
             contents of general purpose register rs are larger than zero, then the program
             branches to the branch address, with a delay of one instruction.
Operation:
      Exceptions:
             None
BGTZL                  Branch On Greater Than Zero Likely                  BGTZL
  31          26 25           21 20           16 15                                             0
        BGTZL           rs            0                               offset
       010111                       00000
          6              5             5                               16
         Format:
                BGTZL rs, offset
         Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
                contents of general purpose register rs are larger than 0, then the program branches
                to the branch address, with a delay of one instruction.
                If it does not branch, the instruction in the branch delay slot is discarded.
Operation:
         Exceptions:
                None
      BLEZ            rs           0                                offset
     000110                      00000
       6               5            5                               16
      Format:
              BLEZ rs, offset
      Description:
              A branch address is calculated from the sum of the address of the instruction in the
              delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
              contents of general purpose register rs are equal to 0 or smaller than 0, then the
              program branches to the branch address, with a delay of one instruction.
      Operation:
      32   T:   target ← (offset_15)^14 || offset || 0^2
                condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
           T+1: if condition then
                     PC ← PC + target
                endif
      64   T:   target ← (offset_15)^46 || offset || 0^2
                condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
           T+1: if condition then
                     PC ← PC + target
                endif
      Exceptions:
              None
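
The BLEZ description ("equal to 0 or smaller than 0") corresponds to a condition that is true when the sign bit of rs is set or when rs is all zeros. The two terms must be joined with "or": a value whose sign bit is 1 can never also be zero, so joining them with "and" would never be true. A minimal Python sketch, with the hypothetical name blez_condition and a bits parameter standing in for the operating mode:

```python
def blez_condition(value: int, bits: int = 32) -> bool:
    """True when the register value is <= 0 in 2's complement."""
    value &= (1 << bits) - 1
    sign_set = ((value >> (bits - 1)) & 1) == 1
    return sign_set or value == 0
```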
        BLEZL            rs           0                                offset
       010110                       00000
           6              5            5                               16
         Format:
                BLEZL rs, offset
         Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
                contents of general purpose register rs are equal to or smaller than zero, then the
                program branches to the branch address, with a delay of one instruction.
                If it does not branch, the instruction in the branch delay slot is discarded.
         Operation:
         32    T:   target ← (offset_15)^14 || offset || 0^2
                    condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         64    T:   target ← (offset_15)^46 || offset || 0^2
                    condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         Exceptions:
                None
      Format:
             BLTZ rs, offset
      Description:
             A branch address is calculated from the sum of the address of the instruction in the
             delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
             contents of general purpose register rs are smaller than 0, then the program
             branches to the branch address, with a delay of one instruction.
Operation:
      Exceptions:
             None
BLTZAL                  Branch On Less Than Zero And Link                  BLTZAL
  31           26 25           21 20           16 15                                                0
         Format:
                BLTZAL rs, offset
         Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended.
                Unconditionally, the address of the instruction next to the delay slot is stored in
                the link register, r31. If the contents of general purpose register rs are smaller than
                0, then the program branches to the branch address, with a delay of one
                instruction.
                 General purpose register r31 should generally not be specified as general purpose
                 register rs, because storing the link address destroys the contents of rs and the
                 instruction is then not reexecutable. An attempt to execute such an instruction
                 does not generate an exception, however.
Operation:
         Exceptions:
                None
BLTZALL              Branch On Less Than Zero And Link Likely              BLTZALL
31        26 25            21 20           16 15                                                0
     Format:
            BLTZALL rs, offset
     Description:
            A branch address is calculated from the sum of the address of the instruction in the
            delay slot and the 16-bit offset, shifted two bits left and sign-extended.
            Unconditionally, the address of the instruction next to the delay slot is stored in
            the link register, r31. If the contents of general purpose register rs are smaller
            than 0, then the program branches to the branch address, with a delay of one
            instruction.
            If it does not branch, the instruction in the branch delay slot is discarded.
            General purpose register r31 should generally not be specified as general purpose
            register rs, because storing the link address destroys the contents of rs and the
            instruction is then not reexecutable. An attempt to execute such an instruction
            does not cause an exception, however.
     Operation:
     32   T:   target ← (offset_15)^14 || offset || 0^2
               condition ← (GPR[rs]_31 = 1)
               GPR[31] ← PC + 8
          T+1: if condition then
                    PC ← PC + target
               else
                    NullifyCurrentInstruction
               endif
     64   T:   target ← (offset_15)^46 || offset || 0^2
               condition ← (GPR[rs]_63 = 1)
               GPR[31] ← PC + 8
          T+1: if condition then
                    PC ← PC + target
               else
                    NullifyCurrentInstruction
               endif
     Exceptions:
            None
         Format:
                BLTZL rs, offset
         Description:
                 A branch address is calculated from the sum of the address of the instruction in the
                 delay slot and the 16-bit offset, shifted two bits left and sign-extended. If the
                 contents of general purpose register rs are smaller than 0, then the program
                 branches to the branch address, with a delay of one instruction.
                 If it does not branch, the instruction in the branch delay slot is discarded.
         Operation:
         32    T:   target ← (offset_15)^14 || offset || 0^2
                    condition ← (GPR[rs]_31 = 1)
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         64    T:   target ← (offset_15)^46 || offset || 0^2
                    condition ← (GPR[rs]_63 = 1)
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         Exceptions:
                None
       BNE             rs                rt                         offset
     000101
        6               5                 5                         16
      Format:
              BNE rs, rt, offset
      Description:
              A branch address is calculated from the sum of the address of the instruction in the
              delay slot and the 16-bit offset, shifted two bits left and sign-extended. The
              contents of general purpose register rs and the contents of general purpose register
              rt are compared. If the two registers are not equal, then the program branches to
              the branch address, with a delay of one instruction.
Operation:
      Exceptions:
              None
        BNEL             rs             rt                             offset
       010101
          6              5               5                             16
         Format:
                BNEL rs, rt, offset
         Description:
                A branch address is calculated from the sum of the address of the instruction in the
                delay slot and the 16-bit offset, shifted two bits left and sign-extended. The
                contents of general purpose register rs and the contents of general purpose register
                rt are compared. If the two registers are not equal, then the program branches to
                the branch address, with a delay of one instruction.
                If it does not branch, the instruction in the branch delay slot is discarded.
         Operation:
         32    T:   target ← (offset_15)^14 || offset || 0^2
                    condition ← (GPR[rs] ≠ GPR[rt])
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         64    T:   target ← (offset_15)^46 || offset || 0^2
                    condition ← (GPR[rs] ≠ GPR[rt])
               T+1: if condition then
                         PC ← PC + target
                    else
                         NullifyCurrentInstruction
                    endif
         Exceptions:
                None
     Format:
              BREAK
     Description:
              A breakpoint exception occurs after execution of this instruction, transferring
              control to the exception handler.
              The code area of the instruction can be used to pass parameters to the exception
              handler. The exception handler can retrieve the parameter only by loading, as
              data, the contents of the memory word containing the instruction.
Operation:
32, 64 T: BreakpointException
     Exceptions:
              Breakpoint exception
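
Once the exception handler has loaded the memory word containing the BREAK instruction, the parameter can be separated from the opcode bits. The Python sketch below assumes the standard MIPS III BREAK layout (SPECIAL opcode 000000 in bits 31..26, a 20-bit code area in bits 25..6, and function field 001101 in bits 5..0); that layout is not shown in this section and is stated here as an assumption. The name break_code is hypothetical.

```python
BREAK_FUNCT = 0b001101  # assumed function field of BREAK (bits 5..0)

def break_code(instr_word: int) -> int:
    """Extract the code area from a BREAK instruction word loaded as data."""
    instr_word &= 0xFFFFFFFF
    if (instr_word >> 26) != 0 or (instr_word & 0x3F) != BREAK_FUNCT:
        raise ValueError("not a BREAK instruction")
    return (instr_word >> 6) & 0xFFFFF  # assumed code area: bits 25..6
```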
         Format:
                CACHE op, offset(base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The virtual address is translated to a
                physical address using the TLB, and the 5-bit sub-opcode op specifies the cache
                operation to be performed on that address.
                If the CP0 enable bit CU0 in the Status register is cleared in User or Supervisor
                mode, CP0 is not usable and execution of this instruction causes a coprocessor
                unusable exception. The operation of this instruction on any cache/operation
                combination not listed below, or on a secondary cache, which the VR4300 does
                not provide, is undefined. The operation of this instruction on an uncached
                area is also undefined.
                The Index operation uses part of the virtual address to specify a cache block. For
                example, for a cache of 2^CACHEBITS bytes with 2^LINEBITS bytes per tag,
                vAddr_CACHEBITS...LINEBITS specifies the block.
                The Hit operation accesses the cache as a normal data reference does, and performs
                the specified cache operation only if the cache contains valid data for the
                specified physical address (a hit). If the data is not in the cache (a miss), the
                cache operation is not executed.
    Write back from a cache goes to the main memory. The address in the main
    memory to be written is the address in the cache tag and not the physical address
    translated by using TLB.
    A TLB miss exception or a TLB invalid exception may occur when any cache
    operation is performed. To prevent TLB exceptions, execute the Index* operation
    on an address in the unmapped area. The Index operation never generates the
    TLB modification exception. Bits 17 and 16 of the instruction code indicate the
    cache subject to the operation as follows.
    * Although a physical address is used to index the cache, it does not have to
      coincide with the cache tag.
    Bits 20:18 of this instruction specify the contents of the cache operation. For
    details, refer to the following pages.
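As a small illustration of the field split described above, the following Python sketch separates the 5-bit op field of a CACHE instruction word into its two parts. The function name `decode_cache_op` is hypothetical, and the sketch deliberately does not interpret the resulting codes, since their meanings are given by the tables of this entry.

```python
def decode_cache_op(instruction):
    """Split the op field (bits 20..16) of a CACHE instruction word.

    Returns (cache_select, operation): bits 17..16 select the cache,
    bits 20..18 select the operation performed on it.
    """
    cache_select = (instruction >> 16) & 0x3
    operation = (instruction >> 18) & 0x7
    return cache_select, operation


# Example: an op field of binary 10101 placed at bits 20..16.
fields = decode_cache_op(0b10101 << 16)  # fields == (1, 5)
```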
 Operation:
 32, 64   T:     vAddr ← ((offset_15)^48 || offset_15...0) + GPR[base]
                 (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                 CacheOp (op, vAddr, pAddr)
 Exceptions:
          Coprocessor unusable exception
          TLB invalid exception
          TLB miss exception
          Bus error exception
          Address error exception
         COPz              CF                     rt               rd                 0
      0 1 0 0 x x*        00010                                                 000 0000 0000
            6               5                     5                5                  11
          Format:
                     CFCz rt, rd
          Description:
                     The contents of coprocessor control register rd of CPz are loaded to general
                     purpose register rt.
                     This instruction is not valid for CP0.
          Operation:
         32           T:   data ← CCR[z, rd]
                      T+1: GPR[rt] ← data
         64           T:   data ← (CCR[z, rd]_31)^32 || CCR[z, rd]
                      T+1: GPR[rt] ← data
          Exceptions:
                     Coprocessor unusable exception
                Bit #  31 30 29 28 27 26 25 24 23 22 21 ... 0
                CFC2    0  1  0  0  1  0  0  0  0  1  0
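The 64-bit form of the CFCz operation replicates bit 31 of the 32-bit control register value into the upper 32 bits of the destination. A minimal Python sketch of that sign extension, with the hypothetical helper name `cfc_result_64`, might look like this:

```python
def cfc_result_64(ccr_value):
    """Model the 64-bit-mode CFCz result: bit 31 replicated 32 times || CCR."""
    ccr_value &= 0xFFFFFFFF
    upper = 0xFFFFFFFF if ccr_value & 0x80000000 else 0
    return (upper << 32) | ccr_value


negative = cfc_result_64(0x80000001)  # upper word becomes all ones
positive = cfc_result_64(0x7FFFFFFF)  # upper word stays zero
```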
       COPz       CO                                       cofun
     0 1 0 0 x x* 1
         6         1                                         25
       Format:
                 COPz cofun
       Description:
                 A coprocessor operation is performed. The operation may specify and reference
                 internal coprocessor registers, and may change the state of the coprocessor
                 condition line, but does not modify state within the processor or the cache/main
                 memory. For details of coprocessor operations, refer to Chapter 17 FPU
                 Instruction Set Details.
       Operation:
       Exceptions:
                 Coprocessor unusable exception
                 Floating-point exception (CP1 only)
             Bit #  31 30 29 28 27 26 25 ... 0
             COP1    0  1  0  0  0  1  1
             COP2    0  1  0  0  1  0  1
       COPz           CT                      rt                   rd                     0
      0100xx*        00110                                                          000 0000 0000
         6             5                       5                   5                      11
         Format:
                  CTCz rt, rd
         Description:
                  The contents of general purpose register rt are loaded into coprocessor control
                  register rd of CPz. This instruction is not valid for CP0.
         Operation:
         32, 64     T:     data ← GPR[rt]
                    T + 1: CCR[z, rd] ← data
         Exceptions:
                  Coprocessor unusable exception
                 Bit #  31 30 29 28 27 26 25 24 23 22 21 ... 0
                 CTC2    0  1  0  0  1  0  0  0  1  1  0
      Format:
               DADD rd, rs, rt
      Description:
               The contents of general purpose register rs and the contents of general purpose
               register rt are added, and the result is stored in general purpose register rd. An
               integer overflow exception occurs if the carries out of bits 62 and 63 differ (2’s
               complement overflow). The contents of the destination register rd are not
               modified when an integer overflow exception occurs.
               This operation is only defined for the VR4300 operating in 64-bit mode and in 32-
               bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
               causes a reserved instruction exception.
Operation:
      Exceptions:
               Integer overflow exception
               Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
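The overflow rule for DADD (the carries out of bits 62 and 63 differ) can be checked directly. The Python sketch below is only a software model under the assumption that register values are passed as unsigned 64-bit integers; the name `dadd` and the `(result, overflow)` return convention are illustrative choices.

```python
MASK64 = (1 << 64) - 1
MASK63 = (1 << 63) - 1


def dadd(rs_value, rt_value):
    """Model DADD on two 64-bit register values.

    Returns (result, overflow). On overflow the result is None, mirroring
    the rule that rd is not modified when the exception occurs.
    """
    rs_value &= MASK64
    rt_value &= MASK64
    # Carry out of bit 62: add the low 63 bits and inspect bit 63 of the sum.
    carry62 = ((rs_value & MASK63) + (rt_value & MASK63)) >> 63
    # Carry out of bit 63: add all 64 bits and inspect bit 64 of the sum.
    total = rs_value + rt_value
    carry63 = total >> 64
    if carry62 != carry63:
        return None, True
    return total & MASK64, False
```

For example, adding 1 to the largest positive value 0x7FFFFFFFFFFFFFFF produces a carry out of bit 62 but none out of bit 63, so the model reports an overflow, whereas adding 1 to 0xFFFFFFFFFFFFFFFF (that is, -1) carries out of both bits and yields 0 without overflow.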
         Format:
                   DADDI rt, rs, immediate
         Description:
                   The 16-bit immediate is sign-extended and added to the contents of general
                   purpose register rs, and the result is stored in general purpose register rt. An
                   integer overflow exception occurs if carries out of bits 62 and 63 differ (2’s
                   complement overflow). The contents of the destination register rt are not
                   modified when an integer overflow exception occurs.
                   This operation is only defined for the VR4300 operating in 64-bit mode and in 32-
                   bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                   causes a reserved instruction exception.
Operation:
         Exceptions:
                   Integer overflow exception
                   Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
                                Doubleword Add
DADDIU                        Immediate Unsigned                          DADDIU
31         26 25              21 20         16 15                                               0
      DADDIU           rs              rt                        immediate
     011001
        6               5               5                           16
      Format:
               DADDIU rt, rs, immediate
      Description:
               The 16-bit immediate is sign-extended and added to the contents of general
               purpose register rs, and the result is stored in general purpose register rt.
               This operation is only defined for the VR4300 operating in 64-bit mode and in 32-
               bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
               causes a reserved instruction exception.
               The only difference between this instruction and the DADDI instruction is that
               DADDIU instruction never causes an integer overflow exception.
Operation:
      Exceptions:
               Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
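To illustrate the sign extension of the immediate and the absence of an overflow exception, here is a hedged Python model of DADDIU. The function name `daddiu` is hypothetical, and register values are assumed to be represented as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1


def daddiu(rs_value, immediate):
    """Model DADDIU: add the sign-extended 16-bit immediate, wrapping at 64 bits."""
    imm = immediate & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000  # sign-extend: 0xFFFF becomes -1
    return (rs_value + imm) & MASK64


wrapped = daddiu(0x7FFFFFFFFFFFFFFF, 1)  # wraps silently instead of trapping
minus_one = daddiu(0, 0xFFFF)            # immediate 0xFFFF acts as -1
```

Despite the "unsigned" in the mnemonic, the immediate is still sign-extended; only the overflow check is dropped.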
        SPECIAL            rs              rt             rd           0              DADDU
       000000                                                        00000           101101
           6                5               5             5            5                6
         Format:
                   DADDU rd, rs, rt
         Description:
                   The contents of general purpose register rs and the contents of general purpose
                   register rt are added, and the result is stored in general purpose register rd.
                   This operation is only defined for the VR4300 operating in 64-bit mode and in 32-
                   bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                   causes a reserved instruction exception.
                   The only difference between this instruction and the DADD instruction is that
                   DADDU instruction never causes an integer overflow exception.
Operation:
         Exceptions:
                   Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
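The mode restriction stated for the doubleword instructions (defined in 64-bit mode and in 32-bit Kernel mode, reserved instruction exception in 32-bit User or Supervisor mode) can be written as a simple predicate. This is a sketch only; the function name and the string mode labels are assumptions made for the illustration.

```python
def doubleword_op_allowed(mode, is_64bit_mode):
    """Return True if a doubleword instruction is defined in this state.

    mode is one of "kernel", "supervisor", "user" (labels are illustrative).
    A False result corresponds to a reserved instruction exception.
    """
    if mode not in ("kernel", "supervisor", "user"):
        raise ValueError("unknown mode: %r" % (mode,))
    return is_64bit_mode or mode == "kernel"
```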
     SPECIAL            rs               rt                 0                          DDIV
     000000                                           00 0000 0000                    011110
        6               5                 5                10                            6
      Format:
               DDIV rs, rt
      Description:
               The contents of general purpose register rs are divided by the contents of general
               purpose register rt, treating both operands as signed integers. An integer overflow
               exception never occurs, and the result of this operation is undefined when the
               divisor is zero.
               This instruction is usually used together with additional instructions that check
               for a zero divisor and for overflow.
               When the operation completes, the quotient (doubleword) is stored into special
               register LO, and the remainder (doubleword) is stored into special register HI.
               If either of the two preceding instructions is MFHI or MFLO, the results of those
               instructions are undefined. To obtain the correct result, insert two or more
               additional instructions between the MFHI or MFLO and DDIV instruction.
               This operation is only defined for the VR4300 operating in 64-bit mode and in 32-
               bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
               causes a reserved instruction exception.
      Operation:
      64         T–2:          LO         ← undefined
                               HI         ← undefined
                 T–1:          LO         ← undefined
                               HI         ← undefined
                 T:            LO         ← GPR[rs] div GPR[rt]
                               HI         ← GPR[rs] mod GPR[rt]
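The following Python sketch models the DDIV result registers. It rests on two assumptions that the text above does not spell out: the quotient is truncated toward zero (so the remainder takes the sign of the dividend), and the zero-divisor case is reported as `None` because the result is undefined. The name `ddiv` and the `(LO, HI)` return order are illustrative.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret a 64-bit register value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def ddiv(rs_value, rt_value):
    """Model DDIV. Returns (LO, HI), or None when the divisor is zero."""
    dividend = to_signed64(rs_value)
    divisor = to_signed64(rt_value)
    if divisor == 0:
        return None  # result undefined; software is expected to check
    # Python's // floors, so divide magnitudes and restore the sign
    # to obtain a quotient truncated toward zero (assumption).
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient & MASK64, remainder & MASK64
```

Under these assumptions, dividing -7 by 2 gives a quotient of -3 in LO and a remainder of -1 in HI. The overflow case that software is expected to check for is the most negative value divided by -1, whose true quotient does not fit in 64 bits.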
       SPECIAL            rs               rt                 0                         DDIVU
       000000                                            0000000000                    011111
          6                5                5                10                           6
         Format:
                 DDIVU rs, rt
         Description:
                 The contents of general purpose register rs are divided by the contents of general
                 purpose register rt, treating both operands as unsigned integers. An integer
                 overflow exception never occurs, and the result of this operation is undefined
                 when the divisor is zero.
                 This instruction is usually used together with instructions that check for a zero divisor.
                 When the operation completes, the quotient (doubleword) is stored into special
                 register LO, and the remainder (doubleword) is stored into special register HI.
                 If either of the two preceding instructions is MFHI or MFLO, the results of those
                 instructions are undefined. To obtain the correct result, insert two or more
                 instructions in between the MFHI or MFLO and DDIVU instructions.
                 This operation is only defined for the VR4300 operating in 64-bit mode and in 32-
                 bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                 causes a reserved instruction exception.
         Operation:
        64         T–2:          LO          ← undefined
                                 HI          ← undefined
                   T–1:          LO          ← undefined
                                 HI          ← undefined
                   T:            LO          ← (0 || GPR[rs]) div (0 || GPR[rt])
                                 HI          ← (0 || GPR[rs]) mod (0 || GPR[rt])
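For comparison with the signed form, a minimal Python model of DDIVU treats both register values as unsigned 64-bit integers. The name `ddivu` is hypothetical, and `None` stands in for the undefined zero-divisor result.

```python
MASK64 = (1 << 64) - 1


def ddivu(rs_value, rt_value):
    """Model DDIVU. Returns (LO, HI), or None when the divisor is zero."""
    dividend = rs_value & MASK64
    divisor = rt_value & MASK64
    if divisor == 0:
        return None  # result undefined; software is expected to check
    return dividend // divisor, dividend % divisor


# The all-ones pattern is a large positive number here, not -1.
lo, hi = ddivu(0xFFFFFFFFFFFFFFFF, 2)
```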
      Format:
               DIV rs, rt
      Description:
               The contents of general purpose register rs are divided by the contents of general
               purpose register rt, treating both operands as signed integers. An integer overflow
               exception never occurs, and the result of this operation is undefined when the
               divisor is zero. In 64-bit mode, the operands must be sign-extended, 32-bit values.
               This instruction is usually used together with instructions that check for a zero
               divisor and for overflow.
               When the operation completes, the quotient (word) is stored into special
               register LO, and the remainder (word) is stored into special register HI. In
               64-bit mode, both results are sign-extended to 64 bits.
               If either of the two preceding instructions is MFHI or MFLO, the results of those
               instructions are undefined. To obtain the correct result, insert two or more
               additional instructions in between the MFHI or MFLO and DIV instructions.
Operation:
        32     T–2:    LO            ← undefined
                       HI            ← undefined
               T–1:    LO            ← undefined
                       HI            ← undefined
               T:      LO            ← GPR[rs] div GPR[rt]
                       HI            ← GPR[rs] mod GPR[rt]
         64    T–2:    LO            ← undefined
                       HI            ← undefined
               T–1:    LO            ← undefined
                       HI            ← undefined
               T:      q             ← GPR[rs]_31...0 div GPR[rt]_31...0
                       r             ← GPR[rs]_31...0 mod GPR[rt]_31...0
                       LO            ← (q_31)^32 || q_31...0
                       HI            ← (r_31)^32 || r_31...0
         Exceptions:
               None
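The 64-bit-mode operation of DIV works on the low 32 bits of each operand and sign-extends the 32-bit quotient and remainder. The Python sketch below models that behavior; the helper names are hypothetical, and truncation of the quotient toward zero is an assumption not stated in the text above.

```python
def low32_signed(value):
    """Interpret bits 31..0 of a register value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def sign_extend32_to_64(value):
    """Replicate bit 31 of a 32-bit value into bits 63..32."""
    value &= 0xFFFFFFFF
    return (value | 0xFFFFFFFF00000000) if value & 0x80000000 else value


def div_64bit_mode(rs_value, rt_value):
    """Model DIV in 64-bit mode. Returns (LO, HI), or None for a zero divisor."""
    dividend = low32_signed(rs_value)
    divisor = low32_signed(rt_value)
    if divisor == 0:
        return None  # result undefined
    quotient = abs(dividend) // abs(divisor)  # truncate toward zero (assumption)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return sign_extend32_to_64(quotient), sign_extend32_to_64(remainder)
```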
      Format:
               DIVU rs, rt
      Description:
               The contents of general purpose register rs are divided by the contents of general
               purpose register rt, treating both operands as unsigned integers. An integer
               overflow exception never occurs, and the result of this operation is undefined
               when the divisor is zero. In 64-bit mode, the operands must be sign-extended, 32-bit
               values.
               This instruction is usually used together with instructions that check for a zero divisor.
               When the operation completes, the quotient (word) is stored into special
               register LO, and the remainder (word) is stored into special register HI. In
               64-bit mode, both results are sign-extended to 64 bits.
               If either of the two preceding instructions is MFHI or MFLO, the results of those
               instructions are undefined. To obtain the correct result, insert two or more
               additional instructions in between the MFHI or MFLO and DIVU instructions.
         Operation:
         32    T–2:    LO           ← undefined
                       HI           ← undefined
               T–1:    LO           ← undefined
                       HI           ← undefined
               T:      LO           ← (0 || GPR[rs]) div (0 || GPR[rt])
                       HI           ← (0 || GPR[rs]) mod (0 || GPR[rt])
         64    T–2:    LO           ← undefined
                       HI           ← undefined
               T–1:    LO           ← undefined
                       HI           ← undefined
               T:      q            ← (0 || GPR[rs]_31...0) div (0 || GPR[rt]_31...0)
                       r            ← (0 || GPR[rs]_31...0) mod (0 || GPR[rt]_31...0)
                       LO           ← (q_31)^32 || q_31...0
                       HI           ← (r_31)^32 || r_31...0
         Exceptions:
                None
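One consequence of the 64-bit-mode DIVU operation is that the division is unsigned on the low 32 bits, yet the 32-bit results are still sign-extended when written to LO and HI. A hedged Python model (function names are illustrative, `None` marks the undefined zero-divisor case):

```python
def sign_extend32_to_64(value):
    """Replicate bit 31 of a 32-bit value into bits 63..32."""
    value &= 0xFFFFFFFF
    return (value | 0xFFFFFFFF00000000) if value & 0x80000000 else value


def divu_64bit_mode(rs_value, rt_value):
    """Model DIVU in 64-bit mode. Returns (LO, HI), or None for a zero divisor."""
    dividend = rs_value & 0xFFFFFFFF  # bits 31..0, zero-extended
    divisor = rt_value & 0xFFFFFFFF
    if divisor == 0:
        return None  # result undefined
    quotient = dividend // divisor
    remainder = dividend % divisor
    return sign_extend32_to_64(quotient), sign_extend32_to_64(remainder)
```

For instance, dividing 0xFFFFFFFF by 1 gives the 32-bit quotient 0xFFFFFFFF, which appears in LO as 0xFFFFFFFFFFFFFFFF after sign extension.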
      COP0          DMF              rt             rd                  0
     010000        00001                                          000 0000 0000
       6             5                5              5                  11
      Format:
              DMFC0 rt, rd
      Description:
              The contents of coprocessor register rd of the CP0 are loaded into general purpose
              register rt.
              This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
              Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
              causes a reserved instruction exception. The contents of the source coprocessor
              register rd are written to the 64-bit destination general purpose register rt. The
              operation of DMFC0 instruction on a 32-bit register of the CP0 is undefined.
      Operation:
      64           T:   data ← CPR[0, rd]
                   T+1: GPR[rt] ← data
      Exceptions:
              Coprocessor unusable exception       (VR4300 in 64-/32-bit User mode and
                                                   Supervisor mode if CP0 is disabled)
              Reserved instruction exception       (VR4300 in 32-bit User or Supervisor mode)
                            Doubleword Move To
DMTC0                    System Control Coprocessor                            DMTC0
 31              26 25          21 20          16 15         11 10                               0
       COP0           DMT                rt             rd                0
      010000         00101                                           00000000000
             6             5              5              5                    11
         Format:
                  DMTC0 rt, rd
         Description:
                  The contents of general purpose register rt are loaded into coprocessor register rd
                  of the CP0.
                  This operation is defined for the VR4300 operating in 64-bit mode or in 32-bit
                  Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                  causes a reserved instruction exception.
                  The contents of the source general purpose register rt are written to the 64-bit
                  destination coprocessor register rd. The operation of the DMTC0 instruction on a 32-
                  bit register of the CP0 is undefined.
                  Because the state of the virtual address translation system may be altered by this
                  instruction, the operation of load instructions, store instructions, and TLB
                  operations immediately prior to and after this instruction is undefined.
Operation:
        64            T:   data ← GPR[rt]
                      T+1: CPR[0, rd] ← data
         Exceptions:
                  Coprocessor unusable exception       (VR4300 in 64-/32-bit User and Supervisor
                                                       mode if CP0 is disabled)
                  Reserved instruction exception       (VR4300 in 32-bit User or Supervisor mode)
  SPECIAL           rs              rt                 0                      DMULT
 000000                                          00 0000 0000                011100
     6               5               5                10                        6
     Format:
            DMULT rs, rt
     Description:
            The contents of general purpose registers rs and rt are multiplied, treating both
            operands as signed integers. An integer overflow exception never occurs.
            When the operation completes, the low-order doubleword is stored into special
            register LO, and the high-order doubleword is stored into special register HI.
            If either of the two preceding instructions is MFHI or MFLO, the results of these
            instructions are undefined. To obtain the correct result, insert two or more other
            instructions in between the MFHI or MFLO and DMULT instructions.
            This operation is only defined for the VR4300 operating in 64-bit mode and in 32-
            bit Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
            causes a reserved instruction exception.
     Operation:
     64        T–2: LO           ← undefined
                    HI           ← undefined
               T–1: LO           ← undefined
                    HI           ← undefined
               T:   t            ← GPR[rs] * GPR[rt]
                    LO           ← t_63...0
                    HI           ← t_127...64
     Exceptions:
            Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
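The split of the 128-bit signed product into LO and HI can be modeled in Python, whose integers are arbitrary precision. This is a sketch under the assumption that register values are passed as unsigned 64-bit integers; `dmult` and `to_signed64` are illustrative names.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def to_signed64(value):
    """Interpret a 64-bit register value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def dmult(rs_value, rt_value):
    """Model DMULT. Returns (LO, HI) of the 128-bit signed product."""
    product = (to_signed64(rs_value) * to_signed64(rt_value)) & MASK128
    return product & MASK64, product >> 64
```

Multiplying -1 (0xFFFFFFFFFFFFFFFF) by 2 gives -2 as a 128-bit value, so HI holds all ones and LO holds 0xFFFFFFFFFFFFFFFE.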
                                 Doubleword Multiply
DMULTU                                Unsigned                             DMULTU
  31          26 25              21 20          16 15                        6   5              0
       SPECIAL            rs               rt                 0                   DMULTU
       000000                                            00 0000 0000            011101
          6               5                 5                 10                     6
         Format:
                 DMULTU rs, rt
         Description:
                 The contents of general purpose register rs and the contents of general purpose
                 register rt are multiplied, treating both operands as unsigned integers. An
                 integer overflow exception never occurs.
                 When the operation completes, the low-order doubleword is stored into special
                 register LO, and the high-order doubleword is stored into special register HI.
                 If either of the two preceding instructions is MFHI or MFLO, the results of these
                 instructions are undefined. To obtain the correct result, insert two or more other
                 instructions in between the MFHI or MFLO and DMULTU instructions.
                 This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                 Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                 causes a reserved instruction exception.
         Operation:
        64         T–2:        LO ← undefined
                               HI ← undefined
                   T–1:        LO ← undefined
                               HI ← undefined
                   T:          t ← (0 || GPR[rs]) * (0 || GPR[rt])
                               LO ← t_63...0
                               HI ← t_127...64
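A matching Python model of DMULTU shows how the same bit patterns yield a different high-order doubleword when the operands are treated as unsigned. The name `dmultu` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def dmultu(rs_value, rt_value):
    """Model DMULTU. Returns (LO, HI) of the 128-bit unsigned product."""
    product = (rs_value & MASK64) * (rt_value & MASK64)
    return product & MASK64, product >> 64


# 0xFFFFFFFFFFFFFFFF is 2**64 - 1 here, so doubling it carries 1 into HI.
lo, hi = dmultu(0xFFFFFFFFFFFFFFFF, 2)
```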
      SPECIAL         0                 rt             rd              sa             DSLL
     000000         00000                                                           111000
         6            5                  5             5               5               6
       Format:
                DSLL rd, rt, sa
       Description:
                The contents of general purpose register rt are shifted left by sa bits, inserting
                zeros into the low-order bits. The result is stored in general purpose register rd.
                This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                causes a reserved instruction exception.
Operation:
        64         T:     s ← 0 || sa
                          GPR[rd] ← GPR[rt]_(63–s)...0 || 0^s
       Exceptions:
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
        SPECIAL            rs              rt             rd           0               DSLLV
       000000                                                        00000            010100
           6               5                5             5            5                6
         Format:
                  DSLLV rd, rt, rs
         Description:
                  The contents of general purpose register rt are shifted left by the number of bits
                  specified by the low-order six bits contained in general purpose register rs,
                  inserting zeros into the low-order bits. The result is stored in general purpose
                  register rd.
                  This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                  Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                  causes a reserved instruction exception.
         Operation:
         64           T:        s ← GPR[rs]_5...0
                                GPR[rd] ← GPR[rt]_(63–s)...0 || 0^s
         Exceptions:
                  Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
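Because DSLLV uses only the low-order six bits of rs, shift amounts of 64 or more wrap around. A minimal Python sketch of this behavior (the name `dsllv` is illustrative, and register values are assumed to be unsigned 64-bit integers):

```python
MASK64 = (1 << 64) - 1


def dsllv(rt_value, rs_value):
    """Model DSLLV: shift rt left by the low six bits of rs, zero-filling."""
    shift = rs_value & 0x3F
    return (rt_value << shift) & MASK64


unchanged = dsllv(1, 64)    # 64 & 0x3F == 0, so the value is not shifted
top_bit = dsllv(1, 63)      # 1 moves to bit 63
```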
      SPECIAL         0                  rt             rd              sa              DSLL32
     000000         00000                                                               111100
         6            5                   5             5                5                6
       Format:
                DSLL32 rd, rt, sa
       Description:
                The contents of general purpose register rt are shifted left by 32+sa bits, inserting
                zeros into the low-order bits. The result is stored in general purpose register rd.
                This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                causes a reserved instruction exception.
       Operation:
       64           T:     s ← 1 || sa
                           GPR[rd] ← GPR[rt]_(63–s)...0 || 0^s
       Exceptions:
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
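The expression s = 1 || sa prepends a 1 bit to the 5-bit sa field, which is the same as 32 + sa. The Python sketch below assembles a DSLL32 instruction word from the field layout shown above (6/5/5/5/5/6 bits, SPECIAL = 000000, function code 111100) and models the shift. The function names are hypothetical, and the register numbers in the example are arbitrary.

```python
MASK64 = (1 << 64) - 1
DSLL32_FUNCT = 0b111100


def encode_dsll32(rd, rt, sa):
    """Build the 32-bit DSLL32 word: SPECIAL | 0 | rt | rd | sa | funct."""
    return ((rt & 0x1F) << 16) | ((rd & 0x1F) << 11) | ((sa & 0x1F) << 6) | DSLL32_FUNCT


def dsll32(rt_value, sa):
    """Model DSLL32: shift left by 32 + sa bits, zero-filling."""
    shift = 0b100000 | (sa & 0x1F)  # 1 || sa
    return (rt_value << shift) & MASK64


word = encode_dsll32(rd=2, rt=3, sa=4)  # word == 0x0003113C
```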
                                    Doubleword
DSRA                           Shift Right Arithmetic                              DSRA
  31          26 25             21 20           16 15          11 10          6   5               0
        SPECIAL          0                 rt            rd              sa           DSRA
       000000          00000                                                        111011
           6             5                  5            5              5              6
         Format:
                  DSRA rd, rt, sa
         Description:
                  The contents of general purpose register rt are shifted right by sa bits, sign-
                  extending the high-order bits. The result is stored in general purpose register rd.
                  This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                  Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                  causes a reserved instruction exception.
         Operation:
         64           T:     s ← 0 || sa
                             GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_63...s
         Exceptions:
                  Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
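The arithmetic right shift fills the vacated high-order bits with copies of bit 63. A short Python model, assuming register values are unsigned 64-bit integers and using the illustrative name `dsra`:

```python
MASK64 = (1 << 64) - 1


def dsra(rt_value, sa):
    """Model DSRA: shift right by sa (0..31), replicating bit 63 into the top."""
    shift = sa & 0x1F
    rt_value &= MASK64
    sign = rt_value >> 63
    # Build `shift` copies of the sign bit positioned at the top of the word.
    fill = (((1 << shift) - 1) << (64 - shift)) if sign else 0
    return fill | (rt_value >> shift)


negative = dsra(0x8000000000000000, 4)  # sign bit is copied into bits 63..60
positive = dsra(0x7000000000000000, 4)  # zeros are shifted in
```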
      SPECIAL           rs              rt              rd           0            DSRAV
     000000                                                        00000         010111
         6               5               5              5            5              6
       Format:
                DSRAV rd, rt, rs
       Description:
                The contents of general purpose register rt are shifted right by the number of bits
                specified by the low-order six bits of general purpose register rs, sign-extending
                the high-order bits. The result is stored in general purpose register rd.
                This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                causes a reserved instruction exception.
       Operation:
       64         T:     s ← GPR[rs]_5...0
                         GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_63...s
       Exceptions:
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
        SPECIAL            0              rt             rd              sa          DSRA32
       000000            00000                                                      111111
           6               5               5             5              5              6
         Format:
                  DSRA32 rd, rt, sa
         Description:
                  The contents of general purpose register rt are shifted right by 32+sa bits, sign-
                  extending the high-order bits. The result is stored in general purpose register rd.
                  This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                  Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                  causes a reserved instruction exception.
         Operation:
         64         T:     s ← 1 || sa
                           GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_63...s
         Exceptions:
                  Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
                                     Doubleword
DSRL                              Shift Right Logical                              DSRL
31          26 25             21 20          16 15           11 10            6    5               0
      SPECIAL            0              rt              rd             sa             DSRL
     000000            00000                                                        111010
         6               5               5             5               5               6
       Format:
                DSRL rd, rt, sa
       Description:
                The contents of general purpose register rt are shifted right by sa bits, inserting
                zeros into the high-order bits. The result is stored in general purpose register rd.
                This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                causes a reserved instruction exception.
Operation:
       64         T:     s ← 0 || sa
                         GPR[rd] ← 0^s || GPR[rt]_63...s
       Exceptions:
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
        SPECIAL            rs              rt             rd           0            DSRLV
       000000                                                        00000         010110
           6               5                5             5            5              6
         Format:
                  DSRLV rd, rt, rs
         Description:
                  The contents of general purpose register rt are shifted right by the number of bits
                  specified by the low-order six bits of general purpose register rs, inserting zeros
                  into the high-order bits. The result is stored in general purpose register rd.
                  This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                  Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                  causes a reserved instruction exception.
Operation:
         64           T:        s ← GPR[rs]_5...0
                                GPR[rd] ← 0^s || GPR[rt]_63...s
         Exceptions:
                  Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
      SPECIAL         0                  rt            rd             sa            DSRL32
     000000         00000                                                          111110
         6            5                   5            5              5               6
       Format:
                DSRL32 rd, rt, sa
       Description:
                The contents of general purpose register rt are shifted right by 32+sa bits,
                inserting zeros into the high-order bits. The result is stored in general purpose
                register rd.
                This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                causes a reserved instruction exception.
       Operation:
       64           T:     s ← 1 || sa
                           GPR[rd] ← 0^s || GPR[rt]_{63...s}
       Exceptions:
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
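
       The three logical right shifts differ only in how the shift amount s is formed.
       The following C sketch models that difference on a host machine. The function
       names are illustrative and not part of the VR4300 documentation, and the
       reserved instruction exception is not modelled.

```c
#include <stdint.h>

/* DSRL: s = 0 || sa, so the shift amount is 0..31. */
static uint64_t dsrl(uint64_t rt, unsigned sa)
{
    return rt >> (sa & 0x1Fu);
}

/* DSRLV: s = low-order six bits of rs, so the shift amount is 0..63. */
static uint64_t dsrlv(uint64_t rt, uint64_t rs)
{
    return rt >> (rs & 0x3Fu);
}

/* DSRL32: s = 1 || sa, so the shift amount is 32..63. */
static uint64_t dsrl32(uint64_t rt, unsigned sa)
{
    return rt >> (32u + (sa & 0x1Fu));
}
```

       Because the operands are unsigned, the C shift inserts zeros into the
       high-order bits, matching the 0^s term in the Operation listings.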
        SPECIAL            rs              rt             rd            0              DSUB
       000000                                                         00000          101110
           6               5                5             5             5               6
         Format:
                  DSUB rd, rs, rt
         Description:
                  The contents of general purpose register rt are subtracted from the contents of
                  general purpose register rs, and the result is stored in general purpose register rd.
                  An integer overflow exception takes place if the carries out of bits 62 and 63 differ
                  (2’s complement overflow). The contents of destination register rd are not
                  modified when an integer overflow exception occurs.
                  This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
                  Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
                  causes a reserved instruction exception.
Operation:
         Exceptions:
                  Integer overflow exception
                  Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
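
         The overflow rule for DSUB can be sketched in C as follows. This is a host-side
         model under stated assumptions, not the processor's implementation: the name
         dsub and the boolean return value (false standing in for the integer overflow
         exception) are hypothetical, and the sign-based test used here is the usual
         two's complement equivalent of comparing the carries out of bits 62 and 63.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true and writes *rd on success.
 * Returns false on overflow and leaves *rd unmodified, as the text requires. */
static bool dsub(int64_t rs, int64_t rt, int64_t *rd)
{
    /* Wrapping difference, computed in unsigned arithmetic (modulo 2^64). */
    uint64_t diff = (uint64_t)rs - (uint64_t)rt;

    /* Overflow iff the operands differ in sign and the result's sign
     * differs from the sign of rs. */
    uint64_t ovf = ((uint64_t)rs ^ (uint64_t)rt) & ((uint64_t)rs ^ diff);
    if (ovf >> 63)
        return false;

    *rd = rs - rt;   /* safe: the check above rules out signed overflow */
    return true;
}
```

         The wrapping value held in diff is the result that DSUBU would deliver, since
         DSUBU performs the same subtraction without the overflow check.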
     SPECIAL            rs              rt             rd           0                DSUBU
     000000                                                       00000             101111
        6               5                5             5            5                  6
      Format:
             DSUBU rd, rs, rt
      Description:
             The contents of general purpose register rt are subtracted from the contents of
             general purpose register rs, and the result is stored in general purpose register rd.
             The only difference between this instruction and the DSUB instruction is that
             the DSUBU instruction never causes an integer overflow exception.
             This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
             Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
             causes a reserved instruction exception.
Operation:
      Exceptions:
             Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
        COP0           CO                  0                                            ERET
       010000           1       000 0000 0000 0000 0000                                011000
          6             1                  19                                             6
         Format:
                  ERET
         Description:
                  ERET is the VR4300 instruction for returning from an interrupt, exception, or
                  error exception. Unlike a branch or jump instruction, ERET has no delay slot:
                  the instruction that follows it is not executed.
                  The ERET instruction must not itself be placed in a branch delay slot.
                  If the ERL bit of the Status register is set (SR_2 = 1), the contents of the
                  ErrorEPC register are loaded into the PC and the ERL bit is cleared to zero.
                  Otherwise (SR_2 = 0), the contents of the EPC register are loaded into the PC
                  and the EXL bit of the Status register is cleared to zero (SR_1 = 0).
                  An ERET instruction executed between an LL instruction and an SC instruction
                  also causes the SC instruction to fail, because ERET clears the LL bit to zero.
         Operation:
         32, 64        T: if SR_2 = 1 then
                              PC ← ErrorEPC
                              SR ← SR_{31...3} || 0 || SR_{1...0}
                          else
                              PC ← EPC
                              SR ← SR_{31...2} || 0 || SR_0
                          endif
                          LLbit ← 0
         Exceptions:
                  Coprocessor unusable exception
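
         The state changes made by ERET can be summarised with a small C model. The
         structure cpu_state and its field names are hypothetical and reduced to the
         state that ERET touches. Only the bit positions of ERL (bit 2) and EXL (bit 1)
         are taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SR_EXL (1u << 1)   /* Status register bit 1 */
#define SR_ERL (1u << 2)   /* Status register bit 2 */

/* Hypothetical, minimal CPU state for illustration only. */
struct cpu_state {
    uint64_t pc;
    uint64_t epc;
    uint64_t error_epc;
    uint32_t sr;
    bool     llbit;
};

static void eret(struct cpu_state *c)
{
    if (c->sr & SR_ERL) {
        c->pc  = c->error_epc;   /* return from error exception */
        c->sr &= ~SR_ERL;
    } else {
        c->pc  = c->epc;         /* return from ordinary exception */
        c->sr &= ~SR_EXL;
    }
    c->llbit = false;            /* a pending LL/SC sequence will fail */
}
```

         Note that when both ERL and EXL are set, one ERET clears only ERL. A second
         ERET is needed to clear EXL.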
J                                         Jump                                                 J
31          26 25                                                                              0
        J                                   target
     000010
        6                                        26
       Format:
              J target
       Description:
              The 26-bit target is shifted left two bits and combined with the high-order four bits
              of the address of the delay slot to calculate the target address. The program
              unconditionally jumps to this calculated address with a delay of one instruction.
       Operation:
       32            T:   temp ← target
                     T+1: PC ← PC_{31...28} || temp || 0^2
       64            T:   temp ← target
                     T+1: PC ← PC_{63...28} || temp || 0^2
       Exceptions:
              None
         JAL                                   target
       000011
          6                                          26
         Format:
                JAL target
         Description:
                The 26-bit target is shifted left two bits and combined with the high-order four bits
                of the address of the delay slot to calculate the address. The program
                unconditionally jumps to this calculated address with a delay of one instruction.
                The address of the instruction after the delay slot is placed in the link register, r31.
         Operation:
         32    T:   temp ← target
                    GPR[31] ← PC + 8
               T+1: PC ← PC_{31...28} || temp || 0^2
         64    T:   temp ← target
                    GPR[31] ← PC + 8
               T+1: PC ← PC_{63...28} || temp || 0^2
         Exceptions:
                None
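
         The target and link address calculations for J and JAL can be sketched in C
         as shown below. The function names are illustrative. The sketch holds the PC
         in a 64-bit variable and keeps bits 63...28 of the delay slot address, which
         corresponds to the 64-bit Operation listing. Treating a 32-bit mode address
         as its 64-bit extension is an assumption of this model.

```c
#include <stdint.h>

/* pc is the address of the J or JAL instruction itself.
 * The delay slot is the next instruction, at pc + 4. */
static uint64_t jump_target(uint64_t pc, uint32_t target26)
{
    uint64_t delay_slot = pc + 4;
    uint64_t region     = delay_slot & ~(uint64_t)0x0FFFFFFF;   /* bits 63...28 */
    uint64_t low28      = (uint64_t)(target26 & 0x03FFFFFFu) << 2; /* target || 00 */
    return region | low28;
}

/* JAL only: address of the instruction after the delay slot. */
static uint64_t jal_link(uint64_t pc)
{
    return pc + 8;
}
```

         Because the high-order bits come from the delay slot and not from the jump
         itself, a jump placed in the last word of a 256 MB region reaches targets in
         the following region.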
      SPECIAL             rs           0                  rd           0                JALR
     000000                          00000                           00000             001001
         6                 5            5                5             5                 6
       Format:
                 JALR rs
                 JALR rd, rs
       Description:
                 The program unconditionally jumps to the address contained in general purpose
                 register rs, with a delay of one instruction. The address of the instruction after the
                 delay slot is stored in general purpose register rd. The default value of rd, if
                 omitted in the assembly language instruction, is 31.
                 Register numbers rs and rd should not be equal, because such an instruction does
                 not have the same effect when re-executed: storing the link address destroys
                 the contents of rs. Executing such an instruction does not cause an exception,
                 but the result is undefined.
                 Since instructions must be word-aligned, a Jump and Link Register instruction
                 must specify a target register (rs) which contains an address whose low-order two
                 bits are zero. If these low-order two bits are not zero, an address error
                 exception occurs when the jump target instruction is fetched.
Operation:
       Exceptions:
                 None
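
       The JALR rules above can be illustrated with a short C model. The function
       jalr, its parameters, and the use of a false return value to flag the
       undefined rs = rd case are all hypothetical. The processor itself raises no
       exception in that case.

```c
#include <stdbool.h>
#include <stdint.h>

/* pc is the address of the JALR instruction.
 * Returns false for rs == rd, where the architectural result is undefined. */
static bool jalr(uint64_t gpr[32], unsigned rd, unsigned rs,
                 uint64_t pc, uint64_t *next_pc)
{
    if (rs == rd)
        return false;

    uint64_t target = gpr[rs];   /* read the target before writing the link */
    if (rd != 0)                 /* register 0 always reads as zero on MIPS */
        gpr[rd] = pc + 8;        /* instruction after the delay slot */
    *next_pc = target;
    return true;
}

/* The target must be word-aligned, otherwise the fetch at the target
 * address takes an address error exception. */
static bool is_word_aligned(uint64_t addr)
{
    return (addr & 3u) == 0;
}
```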
JR                                      Jump Register                                       JR
  31               26        25        21 20                                  65                 0
        SPECIAL                   rs                 0                               JR
       000000                                000 0000 0000 0000                    001000
           6                      5                  15                               6
         Format:
                  JR rs
         Description:
                  The program unconditionally jumps to the address contained in general purpose
                  register rs, with a delay of one instruction.
                  Since instructions must be word-aligned, a Jump Register instruction must
                  specify a target register (rs) which contains an address whose low-order two bits
                  are zero. If these low-order two bits are not zero, an address error exception
                  occurs when the jump target instruction is fetched.
Operation:
         Exceptions:
                  None
LB                                   Load Byte                                            LB
31         26 25             21 20          16 15                                             0
       LB             base            rt                          offset
     100000
        6               5               5                         16
      Format:
              LB rt, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The contents of the byte at the memory
              location specified by the address are sign-extended and loaded into general
              purpose register rt.
Operation:
      Exceptions:
              TLB miss exception
              TLB invalid exception
              Bus error exception
              Address error exception
         Format:
                LBU rt, offset(base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The contents of the byte at the memory
                location specified by the address are zero-extended and loaded into general
                purpose register rt.
         Operation:
         32     T:       vAddr ← ((offset_15)^16 || offset_{15...0}) + GPR[base]
                         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                         pAddr ← pAddr_{PSIZE-1...3} || (pAddr_{2...0} xor ReverseEndian^3)
                         mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
                         byte ← vAddr_{2...0} xor BigEndianCPU^3
                         GPR[rt] ← 0^24 || mem_{7+8*byte...8*byte}
         64     T:       vAddr ← ((offset_15)^48 || offset_{15...0}) + GPR[base]
                         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                         pAddr ← pAddr_{PSIZE-1...3} || (pAddr_{2...0} xor ReverseEndian^3)
                         mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
                         byte ← vAddr_{2...0} xor BigEndianCPU^3
                         GPR[rt] ← 0^56 || mem_{7+8*byte...8*byte}
         Exceptions:
                TLB miss exception           TLB invalid exception
                Bus error exception          Address error exception
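
         LB and LBU form the virtual address in the same way and differ only in how
         the loaded byte is widened. The C sketch below shows both steps for a 64-bit
         register. The helper names are illustrative, and address translation, byte
         lane selection, and exceptions are deliberately left out.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of value (valid for 1 <= bits <= 63). */
static int64_t sign_extend(uint64_t value, unsigned bits)
{
    uint64_t sign = 1ULL << (bits - 1);
    value &= (sign << 1) - 1;                        /* keep only the low bits */
    return (int64_t)(value ^ sign) - (int64_t)sign;  /* remove the sign weight */
}

/* vAddr = sign-extended 16-bit offset + GPR[base] */
static uint64_t effective_address(uint64_t base, uint16_t offset)
{
    return base + (uint64_t)sign_extend(offset, 16);
}

/* LB: the loaded byte is sign-extended into the register. */
static uint64_t lb_extend(uint8_t byte)
{
    return (uint64_t)sign_extend(byte, 8);
}

/* LBU: the loaded byte is zero-extended into the register. */
static uint64_t lbu_extend(uint8_t byte)
{
    return (uint64_t)byte;
}
```

         For a byte value of 0x80 the two instructions therefore leave different
         contents in rt, while for 0x7F they agree.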
LD                              Load Doubleword                                            LD
31          26 25            21 20          16 15                                               0
       LD            base              rt                           offset
     110111
        6              5                5                           16
      Format:
             LD rt, offset(base)
      Description:
             The 16-bit offset is sign-extended and added to the contents of general purpose
             register base to form a virtual address. The contents of the 64-bit doubleword at
             the memory location specified by the address are loaded into general purpose
             register rt.
             If any of the low-order three bits of the address are not zero, an address error
             exception occurs.
             This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
             Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
             causes a reserved instruction exception.
      Operation:
 64    T:   vAddr ← ((offset_15)^48 || offset_{15...0}) + GPR[base]
            (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
            mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
            GPR[rt] ← mem
             Remark        In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                           virtual address creation.
      Exceptions:
             TLB miss exception
             TLB invalid exception
             Bus error exception
             Address error exception
             Reserved instruction exception        (VR4300 in 32-bit User or Supervisor mode)
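
       The two checks that apply to LD before memory is accessed, the operating mode
       and the doubleword alignment, can be sketched in C. All names below are
       hypothetical. Testing the mode before the alignment is an assumption of this
       sketch, not something stated in the description above. The remark about
       ignoring the high-order 32 bits in 32-bit Kernel mode is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

enum op_mode   { MODE_USER, MODE_SUPERVISOR, MODE_KERNEL };
enum ld_result { LD_OK, LD_RESERVED_INSTRUCTION, LD_ADDRESS_ERROR };

static enum ld_result ld_check(bool mode64, enum op_mode mode,
                               uint64_t base, uint16_t offset,
                               uint64_t *vaddr)
{
    /* Defined in 64-bit mode and in 32-bit Kernel mode only. */
    if (!mode64 && mode != MODE_KERNEL)
        return LD_RESERVED_INSTRUCTION;

    /* Sign-extend the 16-bit offset and add it to the base register. */
    int64_t  disp = (int64_t)(offset ^ 0x8000u) - 0x8000;
    uint64_t addr = base + (uint64_t)disp;

    /* All three low-order address bits must be zero. */
    if (addr & 7u)
        return LD_ADDRESS_ERROR;

    *vaddr = addr;
    return LD_OK;
}
```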
          Format:
                  LDCz rt, offset(base)
          Description:
                  The 16-bit offset is sign-extended and added to the contents of general purpose
                  register base to form a virtual address. The processor loads a doubleword from
                  the addressed memory location to CPz. The manner in which each coprocessor
                  uses the data is defined by the individual coprocessor specifications.
                  If any of the low-order three bits of the address are not zero, an address error
                  exception takes place.
                  This instruction is not valid for use with CP0.
                  When CP1 is specified and the FR bit of the Status register is zero, the operation
                  of the instruction is undefined if the least-significant bit of the rt field is not
                  zero. If the FR bit is one, rt may specify either an odd or an even register.
Operation:
     Exceptions:
            TLB miss exception
            TLB invalid exception
            Bus error exception
            Address error exception
            Coprocessor unusable exception
           Bit # 31 30 29 28 27 26                                                      0
          LDC2 1    1   0   1     1   0
         Format:
                  LDL rt, offset(base)
         Description:
                  This instruction is used in combination with the LDR instruction to load doubleword
                  data that is not aligned on a doubleword boundary into general purpose register rt.
                  The LDL instruction loads the high-order portion of the data into the register,
                  while the LDR instruction loads the low-order portion.
                  The 16-bit offset is sign-extended and added to the contents of general purpose
                  register base to generate a virtual address that can specify any byte. Of the
                  doubleword in memory whose most-significant byte is specified by the generated
                  address, only the bytes that lie within the same aligned doubleword as the
                  addressed byte are loaded into the high-order portion of general purpose register
                  rt. The remaining portion of the register is not affected. Depending on the
                  address specified, from 1 to 8 bytes are loaded.
                  In other words, the addressed byte is first loaded into the most-significant byte
                  position of general purpose register rt. Then, as long as a lower-order byte
                  remains within the same aligned doubleword, that byte is loaded into the next
                  lower byte position of rt. The remaining low-order bytes of rt are not affected.
                  memory (big-endian)
                  address 8:         8   9  10  11  12  13  14  15
                  address 0:         0   1   2   3   4   5   6   7

                  register $24
                  before loading:    A   B   C   D   E   F   G   H
                                     LDL $24,3($0)
                  after loading:     3   4   5   6   7   F   G   H
             The contents of general purpose register rt are internally bypassed within the
             processor so that no NOP instruction is needed between an immediately preceding
             load instruction which targets general purpose register rt and a subsequent LDL
             (or LDR) instruction.
             The address error exception does not occur even if the specified address is not at
             the doubleword boundary.
             This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
             Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
             causes a reserved instruction exception.
Operation:
             Remark     In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                        virtual address creation.
                  LDL
                  Register   A B C D E F G H
                  Memory     I J K L M N O P

                                |         BigEndianCPU = 0          |         BigEndianCPU = 1
                                |                         offset    |                         offset
                  vAddr_{2...0} | destination      type  LEM  BEM   | destination      type  LEM  BEM
                        0       | P B C D E F G H   0     0    7    | I J K L M N O P   7     0    0
                        1       | O P C D E F G H   1     0    6    | J K L M N O P H   6     0    1
                        2       | N O P D E F G H   2     0    5    | K L M N O P G H   5     0    2
                        3       | M N O P E F G H   3     0    4    | L M N O P F G H   4     0    3
                        4       | L M N O P F G H   4     0    3    | M N O P E F G H   3     0    4
                        5       | K L M N O P G H   5     0    2    | N O P D E F G H   2     0    5
                        6       | J K L M N O P H   6     0    1    | O P C D E F G H   1     0    6
                        7       | I J K L M N O P   7     0    0    | P B C D E F G H   0     0    7

                  Remark  Type:   access type output to memory (Refer to Figure 3-2 Byte
                                  Access within a Doubleword.)
                          Offset: pAddr_{2...0} output to memory
                                  LEM: Little-endian memory (BigEndianMem = 0)
                                  BEM: Big-endian memory (BigEndianMem = 1)
          Exceptions:
                      TLB miss exception
                      TLB invalid exception
                      Bus error exception
                      Address error exception
                      Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
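
          The byte merging performed by LDL on a big-endian CPU (BigEndianCPU = 1) can
          be modelled in C as shown below. This is a minimal sketch: the function name
          ldl_be and the flat byte array standing in for memory are assumptions, and
          address translation and exceptions are not modelled.

```c
#include <stdint.h>

/* mem:   byte-addressed memory image (big-endian byte order)
 * vaddr: address of the most-significant byte of the unaligned doubleword
 * reg:   current contents of rt; bytes that are not loaded are preserved */
static uint64_t ldl_be(const uint8_t *mem, uint64_t vaddr, uint64_t reg)
{
    unsigned first = (unsigned)(vaddr & 7u);          /* position in the aligned doubleword */
    unsigned count = 8u - first;                      /* 1 to 8 bytes are loaded */
    const uint8_t *dw = mem + (vaddr & ~(uint64_t)7); /* start of the aligned doubleword */

    for (unsigned i = 0; i < count; i++) {
        unsigned shift = 56u - 8u * i;                /* fill rt from the top byte down */
        reg &= ~((uint64_t)0xFF << shift);
        reg |= (uint64_t)dw[first + i] << shift;
    }
    return reg;
}
```

          With memory bytes 0 through 15 holding the values 0 through 15, calling
          ldl_be(mem, 3, reg) replaces the five high-order bytes of reg with
          3, 4, 5, 6, 7, which corresponds to the LDL $24,3($0) example above.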
      Format:
               LDR rt, offset(base)
      Description:
               This instruction is used in combination with the LDL instruction to load doubleword
               data that is not aligned on a doubleword boundary into general purpose register rt.
               The LDL instruction loads the high-order portion of the data into the register, while
               the LDR instruction loads the low-order portion.
               The 16-bit offset is sign-extended and added to the contents of general purpose
               register base to generate a virtual address that can specify any byte. Of the
               doubleword in memory whose least-significant byte is specified by the generated
               address, only the bytes that lie within the same aligned doubleword as the addressed
               byte are loaded into the low-order portion of general purpose register rt. The
               remaining portion of the register is not affected. Depending on the address
               specified, from 1 to 8 bytes are loaded.
               In other words, the addressed byte is first loaded into the least-significant byte
               position of general purpose register rt. Then, as long as a higher-order byte remains
               within the same aligned doubleword, that byte is loaded into the next higher byte
               position of rt. The remaining high-order bytes of rt are not affected.
               memory (big-endian)
               address 8:         8   9  10  11  12  13  14  15
               address 0:         0   1   2   3   4   5   6   7

               register $24
               before loading:    A   B   C   D   E   F   G   H
                                  LDR $24,4($0)
               after loading:     A   B   C   0   1   2   3   4
               The contents of general purpose register rt are bypassed within the processor so
               that no NOP instruction is needed between an immediately preceding load
               instruction which targets general purpose register rt and a subsequent LDR (or
               LDL) instruction.
               The address error exception does not occur even if the specified address is not
               located at the doubleword boundary.
               This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
               Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
               causes a reserved instruction exception.
Operation:
               Remark     In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                          virtual address creation.
                    The relationship between the address given to the LDR instruction and the result
                    (bytes for registers) is shown below:
                LDR
                Register   A B C D E F G H
                Memory     I J K L M N O P

                              |         BigEndianCPU = 0          |         BigEndianCPU = 1
                              |                         offset    |                         offset
                vAddr_{2...0} | destination      type  LEM  BEM   | destination      type  LEM  BEM
                      0       | I J K L M N O P   7     0    0    | A B C D E F G I   0     7    0
                      1       | A I J K L M N O   6     1    0    | A B C D E F I J   1     6    0
                      2       | A B I J K L M N   5     2    0    | A B C D E I J K   2     5    0
                      3       | A B C I J K L M   4     3    0    | A B C D I J K L   3     4    0
                      4       | A B C D I J K L   3     4    0    | A B C I J K L M   4     3    0
                      5       | A B C D E I J K   2     5    0    | A B I J K L M N   5     2    0
                      6       | A B C D E F I J   1     6    0    | A I J K L M N O   6     1    0
                      7       | A B C D E F G I   0     7    0    | I J K L M N O P   7     0    0

                Remark  Type:   access type output to memory (Refer to Figure 3-2 Byte
                                Access within a Doubleword.)
                        Offset: pAddr_{2...0} output to memory
                                LEM: Little-endian memory (BigEndianMem = 0)
                                BEM: Big-endian memory (BigEndianMem = 1)
        Exceptions:
                    TLB miss exception
                    TLB invalid exception
                    Bus error exception
                    Address error exception
                    Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
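
        The byte merging performed by LDR on a big-endian CPU (BigEndianCPU = 1) can be
        modelled in C as shown below. This is a minimal sketch: the function name
        ldr_be and the flat byte array standing in for memory are assumptions, and
        address translation and exceptions are not modelled.

```c
#include <stdint.h>

/* mem:   byte-addressed memory image (big-endian byte order)
 * vaddr: address of the least-significant byte of the unaligned doubleword
 * reg:   current contents of rt; bytes that are not loaded are preserved */
static uint64_t ldr_be(const uint8_t *mem, uint64_t vaddr, uint64_t reg)
{
    unsigned last  = (unsigned)(vaddr & 7u);          /* position in the aligned doubleword */
    unsigned count = last + 1u;                       /* 1 to 8 bytes are loaded */
    const uint8_t *dw = mem + (vaddr & ~(uint64_t)7); /* start of the aligned doubleword */

    for (unsigned i = 0; i < count; i++) {
        unsigned shift = 8u * i;                      /* fill rt from the bottom byte up */
        reg &= ~((uint64_t)0xFF << shift);
        reg |= (uint64_t)dw[last - i] << shift;
    }
    return reg;
}
```

        On a big-endian CPU, an unaligned doubleword that starts at address n is
        assembled by executing LDL with address n followed by LDR with address n + 7.
        The first loads the high-order bytes and the second fills in the rest.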
LH                                       Load Halfword                                           LH
  31              26 25          21 20           16 15                                               0
         LH               base             rt                             offset
       100001
          6                  5               5                            16
           Format:
                   LH rt, offset(base)
           Description:
                   The 16-bit offset is sign-extended and added to the contents of general purpose
                   register base to form a virtual address. The contents of the halfword at the
                   memory location specified by the address are sign-extended and loaded into
                   general purpose register rt.
                   If the least-significant bit of the address is not zero, an address error exception
                   occurs.
           Operation:
      32     T:    vAddr ← ((offset_15)^16 || offset_{15...0}) + GPR[base]
                   (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                   pAddr ← pAddr_{PSIZE-1...3} || (pAddr_{2...0} xor (ReverseEndian^2 || 0))
                   mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
                   byte ← vAddr_{2...0} xor (BigEndianCPU^2 || 0)
                   GPR[rt] ← (mem_{15+8*byte})^16 || mem_{15+8*byte...8*byte}
      64     T:    vAddr ← ((offset_15)^48 || offset_{15...0}) + GPR[base]
                   (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                   pAddr ← pAddr_{PSIZE-1...3} || (pAddr_{2...0} xor (ReverseEndian^2 || 0))
                   mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
                   byte ← vAddr_{2...0} xor (BigEndianCPU^2 || 0)
                   GPR[rt] ← (mem_{15+8*byte})^48 || mem_{15+8*byte...8*byte}
           Exceptions:
                   TLB miss exception
                   TLB invalid exception
                   Bus error exception
                   Address error exception
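
           The last two steps of the LH operation, selecting the halfword within the
           doubleword returned by LoadMemory and sign-extending it, can be sketched in C.
           The function name lh_extract is illustrative. The sketch sign-extends to the
           full 64-bit register width and assumes the address has already passed the
           halfword alignment check.

```c
#include <stdbool.h>
#include <stdint.h>

/* mem:   aligned doubleword containing the halfword
 * vaddr: halfword-aligned virtual address
 * Returns the sign-extended halfword as it would appear in rt. */
static uint64_t lh_extract(uint64_t mem, uint64_t vaddr, bool big_endian_cpu)
{
    /* byte = vAddr bits 2...0 xor (BigEndianCPU^2 || 0), i.e. xor 6 or 0 */
    unsigned byte = (unsigned)(vaddr & 7u) ^ (big_endian_cpu ? 6u : 0u);

    /* The halfword occupies bits 15+8*byte ... 8*byte of mem. */
    uint16_t half = (uint16_t)(mem >> (8u * byte));

    /* Replicate bit 15 into the high-order bits. */
    int64_t value = (int64_t)(half ^ 0x8000u) - 0x8000;
    return (uint64_t)value;
}
```

           For LHU the final step would instead place zeros in the high-order bits.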
      Format:
              LHU rt, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The contents of the halfword at the
              memory location specified by the address are zero-extended and loaded into
              general purpose register rt.
              If the least-significant bit of the address is not zero, an address error exception
              occurs.
      Operation:
 32     T:    vAddr ← ((offset_15)^16 || offset_{15...0}) + GPR[base]
              (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
              pAddr ← pAddr_{PSIZE-1...3} || (pAddr_{2...0} xor (ReverseEndian^2 || 0))
              mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
              byte ← vAddr_{2...0} xor (BigEndianCPU^2 || 0)
              GPR[rt] ← 0^16 || mem_{15+8*byte...8*byte}
      Exceptions:
              TLB miss exception           TLB invalid exception
              Bus error exception          Address error exception
LL                                     Load Linked                                            LL
  31          26 25            21 20             16 15                                           0
         LL             base                rt                        offset
       110000
         6               5                   5                        16
         Format:
                LL rt, offset(base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The contents of the word at the memory
                location specified by the address are loaded into general purpose register rt. In
                64-bit mode, the loaded word is sign-extended. In addition, the physical address
                of the accessed location is stored in the LLAddr register and LLbit is set to 1.
                Afterward, the processor checks whether the location whose address is held in the
                LLAddr register is written by another processor or device.
                Load Linked (LL) and Store Conditional (SC) instructions can be used to
                atomically update memory:
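                                      L1:
                                             LL      T1, (T0)        # load the word and set LLbit
                                             ADD     T2, T1, 1       # compute the incremented value
                                             SC      T2, (T0)        # store if LLbit is still set; T2 = 1 on success, 0 on failure
                                             BEQ     T2, 0, L1       # retry if the store failed
                                             NOP                     # branch delay slot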
                This atomically increments the word addressed by T0. Changing the ADD
                instruction to an OR instruction changes this to an atomic bit set.
                This instruction is available in User mode, and it is not necessary to enable CP0.
                This instruction is defined to maintain the software compatibility with the
                VR4400.
             If the specified address is in the non-cache area, the operation of the LL instruction
             is undefined. A cache miss that occurs between the LL and SC instructions can
             cause the SC instruction to fail. As a rule, therefore, do not place a load or
             store instruction between the LL and SC instructions; otherwise, the operation of
             the SC instruction is not guaranteed. Exceptions that occur between the LL and SC
             instructions can likewise cause the SC instruction to fail, so if they occur
             frequently they must be disabled temporarily.
             If either of the low-order two bits of the address is not zero, an address error
             exception takes place.
      Operation:
 32    T:   vAddr ← ((offset_15)^16 || offset_{15...0}) + GPR[base]
            (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
            pAddr ← pAddr_{PSIZE-1...3} || (pAddr_{2...0} xor (ReverseEndian || 0^2))
            mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
            byte ← vAddr_{2...0} xor (BigEndianCPU || 0^2)
            GPR[rt] ← mem_{31+8*byte...8*byte}
            LLbit ← 1
            LLAddr ← pAddr
 64    T:   vAddr ← ((offset_15)^48 || offset_{15...0}) + GPR[base]
            (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
            pAddr ← pAddr_{PSIZE-1...3} || (pAddr_{2...0} xor (ReverseEndian || 0^2))
            mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
            byte ← vAddr_{2...0} xor (BigEndianCPU || 0^2)
            GPR[rt] ← (mem_{31+8*byte})^32 || mem_{31+8*byte...8*byte}
            LLbit ← 1
            LLAddr ← pAddr
      Exceptions:
             TLB miss exception
             TLB invalid exception
             Bus error exception
             Address error exception
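
       The interaction between LL, SC, and LLbit can be illustrated with a small C
       model. This is a single-threaded sketch under stated assumptions, not a
       description of the hardware: the structure llsc_cpu, the function names, and
       the use of a host pointer in place of the physical address in LLAddr are all
       hypothetical. The model also leaves LLbit unchanged after SC, which the text
       above does not specify.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state used by LL and SC. */
struct llsc_cpu {
    bool      llbit;
    uint32_t *lladdr;   /* stands in for the physical address in LLAddr */
};

/* LL: load the word, record its address, and set LLbit. */
static uint32_t ll(struct llsc_cpu *c, uint32_t *addr)
{
    c->llbit  = true;
    c->lladdr = addr;
    return *addr;
}

/* SC: store only if LLbit is still set. Returns 1 on success, 0 on failure. */
static int sc(struct llsc_cpu *c, uint32_t *addr, uint32_t value)
{
    if (!c->llbit)
        return 0;
    *addr = value;
    return 1;
}

/* The retry loop from the assembly example, written in C.
 * Unsigned addition wraps here, whereas ADD can raise an overflow exception. */
static uint32_t atomic_increment(struct llsc_cpu *c, uint32_t *addr)
{
    uint32_t t1, t2;
    do {
        t1 = ll(c, addr);
        t2 = t1 + 1;
    } while (!sc(c, addr, t2));
    return t2;
}
```

       Clearing llbit between ll() and sc(), as an ERET or a write by another device
       would, makes sc() return 0 and leaves memory unchanged, which is why the
       assembly sequence branches back to L1.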
         Format:
                LLD rt, offset(base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The contents of the doubleword at the
                memory location specified by the address are loaded into general purpose register
                rt. In addition, the physical address of the memory location is stored in the
                LLAddr register, and LLbit is set to 1. Afterward, the processor checks whether
                the address stored in the LLAddr register has been rewritten by other
                processors or devices.
                Load Linked Doubleword (LLD) instruction and Store Conditional Doubleword
                (SCD) instruction can be used to atomically update the memory:
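                                    L1:
                                           LLD     T1, (T0)        # load doubleword and set LLbit
                                           DADD    T2, T1, 1       # increment the loaded value
                                           SCD     T2, (T0)        # store if LLbit is set; T2 = 1 or 0
                                           BEQ     T2, 0, L1       # retry if the store failed
                                           NOP                     # branch delay slot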
           If the specified address is in the non-cache area, the operation of the LLD
           instruction is undefined. A cache miss that occurs between the LLD and SCD
           instructions can cause the SCD instruction to fail. Therefore, do not place a
           load or store instruction between the LLD and SCD instructions; otherwise, the
           operation of the SCD instruction is not guaranteed. An exception that occurs
           between the two instructions can also cause the SCD instruction to fail, so if
           exceptions occur frequently, they must be disabled temporarily.
           This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
           Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
           causes a reserved instruction exception.
Operation:
           Remark     In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                      virtual address creation.
         Exceptions:
                TLB miss exception
                TLB invalid exception
                Bus error exception
                Address error exception
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
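The retry loop in the LLD description above can be traced with a small software model. The following Python sketch is only an illustration, not VR4300 code: the class LinkedMemory, its methods, and the function atomic_increment are hypothetical names, and the model assumes that a write by another agent to the linked address clears LLbit, as the description states.

```python
class LinkedMemory:
    """Toy model of LLbit/LLAddr. All names are illustrative assumptions."""

    def __init__(self):
        self.mem = {}          # physical address -> stored value
        self.llbit = 0
        self.lladdr = None

    def load_linked(self, addr):
        # LL/LLD: load the value, record the address, set LLbit.
        self.llbit = 1
        self.lladdr = addr
        return self.mem.get(addr, 0)

    def external_write(self, addr, value):
        # A write by another processor or device. In this model a write
        # to the linked address clears LLbit.
        self.mem[addr] = value
        if addr == self.lladdr:
            self.llbit = 0

    def store_conditional(self, addr, value):
        # SC/SCD: store only while LLbit is set; the return value plays
        # the role of rt after the instruction (1 = success, 0 = failure).
        if self.llbit:
            self.mem[addr] = value
            return 1
        return 0


def atomic_increment(m, addr, interfere_once=False):
    # Mirrors the L1: LLD / DADD / SCD / BEQ retry loop.
    while True:
        old = m.load_linked(addr)
        if interfere_once:
            # Simulate another agent writing between the load and the store.
            m.external_write(addr, old + 100)
            interfere_once = False
        if m.store_conditional(addr, old + 1):
            return old + 1
```

With interfere_once=True the first conditional store fails, and the loop retries against the value written by the other agent.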
       LUI           0               rt                         immediate
     001111        00000
        6            5                5                            16
      Format:
              LUI rt, immediate
      Description:
              The 16-bit immediate is shifted left 16 bits and concatenated with 16 bits of
              zeros. The result is placed into general purpose register rt. In 64-bit mode,
              the loaded word is sign-extended to 64 bits.
Operation:
      Exceptions:
              None
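The LUI description can be checked with a short Python sketch. The function name lui and the mode64 flag are assumptions made for illustration; the arithmetic follows the description (immediate concatenated with 16 zero bits, then sign-extended in 64-bit mode).

```python
MASK64 = (1 << 64) - 1

def lui(immediate, mode64=True):
    """Model of LUI: immediate || 16 zero bits, sign-extended in 64-bit mode."""
    word = (immediate & 0xFFFF) << 16          # 32-bit result
    if mode64 and word & 0x80000000:
        word |= 0xFFFFFFFF00000000             # copy bit 31 into bits 63...32
    return word & MASK64
```

For example, an immediate of 0x8000 yields 0x80000000 in 32-bit mode and 0xFFFFFFFF80000000 in 64-bit mode.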
LW                                     Load Word                                          LW
  31          26 25            21 20         16 15                                                 0
         LW             base            rt                            offset
       100011
          6              5               5                            16
         Format:
                LW rt, offset(base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The contents of the word at the memory
                location specified by the address are loaded into general purpose register rt. In 64-
                bit mode, the loaded word is sign-extended to 64 bits.
                If either of the low-order two bits of the address is not zero, an address error
                exception occurs.
         Operation:
         Exceptions:
                TLB miss exception
                TLB invalid exception
                Bus error exception
                Address error exception
        Format:
                LWCz rt, offset(base)
        Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The processor loads a word at the
                addressed memory location to the general purpose register rt of the CPz. The
                manner in which each coprocessor uses the data is defined by the individual
                coprocessor specifications.
                If either of the low-order two bits of the address is not zero, an address error
                exception occurs.
                This instruction is not valid for use with CP0.
         Operation:
 32     T:   vAddr ← ((offset15)^16 || offset15...0) + GPR[base]
             (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
             pAddr ← pAddrPSIZE-1...3 || (pAddr2...0 xor (ReverseEndian || 0^2))
             mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
             byte ← vAddr2...0 xor (BigEndianCPU || 0^2)
             COPzLW (byte, rt, mem)
 64     T:   vAddr ← ((offset15)^48 || offset15...0) + GPR[base]
             (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
             pAddr ← pAddrPSIZE-1...3 || (pAddr2...0 xor (ReverseEndian || 0^2))
             mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
             byte ← vAddr2...0 xor (BigEndianCPU || 0^2)
             COPzLW (byte, rt, mem)
         Exceptions:
                TLB miss exception
                TLB invalid exception
                Bus error exception
                Address error exception
                Coprocessor unusable exception
              Bit # 31 30 29 28 27 26                                             0
             LWC2 1     1   0   0     1   0
     Format:
            LWL rt, offset(base)
     Description:
            This instruction is used in combination with the LWR instruction to load the word
            data in the memory that is not at the word boundary to general purpose register rt.
            The LWL instruction loads the high-order portion of the data to the register, while
            the LWR instruction loads the low-order portion.
            The 16-bit offset is sign-extended and added to the contents of general purpose
            register base to generate a virtual address that can specify any byte. Of the word
            data in the memory whose most-significant byte is specified by the generated
            address, only the data at the same word boundary as the target address is loaded
            and stored to the high-order portion of general purpose register rt. The remaining
            portion of the register is not affected. Depending on the address specified, the
            number of bytes to be loaded changes from 1 to 4.
            In other words, first the addressed byte is stored to the most-significant byte
            position of general purpose register rt. If there is data of the low-order byte that
            follows the same word boundary, the operation to store this data to the next byte
            of general purpose register rt is repeated.
            The remaining low-order byte is not affected.
                    memory
                   (big-endian)                                   register
address 4      4      5      6      7      before loading    A     B     C     D     $24
address 0      0      1      2      3
                              LWL $24,1($0)
                                           after loading     1     2     3     D     $24
                  The contents of general purpose register rt are bypassed within the processor so
                  that no NOP instruction is needed between an immediately preceding load
                  instruction which targets general purpose register rt and a subsequent LWL (or
                  LWR) instruction.
                  The address error exception does not occur even if the specified address is not
                  located at the word boundary.
Operation:
                     The relationship between the address given to the LWL instruction and the result
                     (bytes for registers) is shown below:
                 LWL
                 Register   A  B  C  D  E  F  G  H
                 Memory     I  J  K  L  M  N  O  P
                       BigEndianCPU = 0                       BigEndianCPU = 1
                                         offset                                 offset
vAddr2...0     destination       type   LEM  BEM      destination       type   LEM  BEM
    0        S S S S P F G H      0      0    7     S S S S I J K L      3      4    0
    1        S S S S O P G H      1      0    6     S S S S J K L H      2      4    1
    2        S S S S N O P H      2      0    5     S S S S K L G H      1      4    2
    3        S S S S M N O P      3      0    4     S S S S L F G H      0      4    3
    4        S S S S L F G H      0      4    3     S S S S M N O P      3      0    4
    5        S S S S K L G H      1      4    2     S S S S N O P H      2      0    5
    6        S S S S J K L H      2      4    1     S S S S O P G H      1      0    6
    7        S S S S I J K L      3      4    0     S S S S P F G H      0      0    7
                     Remark Type:   access type output to memory (Refer to Figure 3-2 Byte
                                    Access within a Doubleword.)
                            Offset: pAddr2...0 output to memory
                                    LEM Little-endian memory (BigEndianMem = 0)
                                    BEM Big-endian memory (BigEndianMem = 1)
                            S:      sign-extension of destination bit 31
        Exceptions:
                     TLB miss exception
                     TLB invalid exception
                     Bus error exception
                     Address error exception
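The byte merging performed by LWL can be illustrated with a minimal Python sketch. It assumes 32-bit mode and a big-endian CPU (the BigEndianCPU = 1 columns of the table), so the sign-extension columns marked S are not modeled. The function name lwl_big_endian and its arguments are hypothetical.

```python
def lwl_big_endian(reg, memory, addr):
    """32-bit LWL model for a big-endian CPU.

    reg:    current 32-bit value of register rt
    memory: bytes-like object indexed by byte address
    addr:   effective byte address (any alignment)
    """
    word_end = (addr & ~3) + 4
    n = word_end - addr                       # bytes from addr to the word boundary
    loaded = int.from_bytes(memory[addr:word_end], "big")
    shift = 8 * (4 - n)
    keep_mask = (1 << shift) - 1              # low-order bytes of rt left untouched
    return ((loaded << shift) | (reg & keep_mask)) & 0xFFFFFFFF
```

Using memory whose byte at address n holds the value n, and 0x0A0B0C0D as a stand-in for the register contents A B C D, an address of 1 produces 0x0102030D, which corresponds to the 1 2 3 D result in the figure.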
         Format:
                LWR rt, offset(base)
         Description:
                This instruction is used in combination with the LWL instruction to load the word
                data in the memory that is not at the word boundary to general purpose register rt.
                The LWL instruction loads the high-order portion of the data to the register, while
                the LWR instruction loads the low-order portion.
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to generate a virtual address that can specify any byte. Of the word
                data in the memory whose least-significant byte is specified by the generated
                address, only the data at the same word boundary as the target address is loaded
                and stored to the low-order portion of general purpose register rt. The remaining
                portion of the register is not affected. Depending on the address specified, the
                number of bytes to be loaded changes from 1 to 4.
                In other words, first the addressed byte is stored to the least-significant byte
                position of general purpose register rt. If there is data of the high-order byte that
                follows the same word boundary, the operation to store this data to the next byte
                of general purpose register rt is repeated.
                The remaining high-order byte is not affected.
                         memory
                       (big-endian)                                      register
  address 4        4     5       6       7       before loading      A    B       C    D     $24
  address 0        0     1       2       3
                                  LWR $24,4($0)
                                                 after loading       A    B       C    4     $24
             The contents of general purpose register rt are bypassed within the processor so
             that no NOP instruction is needed between an immediately preceding load
             instruction which targets general purpose register rt and a following LWL (or
             LWR) instruction.
             The address error exception does not occur even if the specified address is not
             located at the word boundary.
Operation:
                  LWR
                  Register   A  B  C  D  E  F  G  H
                  Memory     I  J  K  L  M  N  O  P
                       BigEndianCPU = 0                       BigEndianCPU = 1
                                         offset                                 offset
vAddr2...0     destination       type   LEM  BEM      destination       type   LEM  BEM
    0        S S S S M N O P      3      0    4     X X X X E F G I      0      7    0
    1        X X X X E M N O      2      1    4     X X X X E F I J      1      6    0
    2        X X X X E F M N      1      2    4     X X X X E I J K      2      5    0
    3        X X X X E F G M      0      3    4     S S S S I J K L      3      4    0
    4        S S S S I J K L      3      4    0     X X X X E F G M      0      3    4
    5        X X X X E I J K      2      5    0     X X X X E F M N      1      2    4
    6        X X X X E F I J      1      6    0     X X X X E M N O      2      1    4
    7        X X X X E F G I      0      7    0     S S S S M N O P      3      0    4
                      Remark Type:   access type output to memory (Refer to Figure 3-2 Byte
                                     Access within a Doubleword.)
                             Offset: pAddr2...0 output to memory
                                     LEM Little-endian memory (BigEndianMem = 0)
                                     BEM Big-endian memory (BigEndianMem = 1)
                             S:      Sign-extension of destination bit 31
                             X:      Not affected (in 32-bit mode)
                                     Sign-extension of destination bit 31 (in 64-bit mode)
          Exceptions:
                      TLB miss exception
                      TLB invalid exception
                      Bus error exception
                      Address error exception
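The low-order merge performed by LWR can be sketched in the same way. This Python model assumes 32-bit mode and a big-endian CPU (the BigEndianCPU = 1 columns of the table, with X meaning "not affected"). The function name lwr_big_endian and its arguments are hypothetical.

```python
def lwr_big_endian(reg, memory, addr):
    """32-bit LWR model for a big-endian CPU.

    reg:    current 32-bit value of register rt
    memory: bytes-like object indexed by byte address
    addr:   effective byte address (any alignment)
    """
    word_base = addr & ~3
    n = addr - word_base + 1                  # bytes from the word boundary through addr
    loaded = int.from_bytes(memory[word_base:addr + 1], "big")
    keep_mask = (0xFFFFFFFF << (8 * n)) & 0xFFFFFFFF   # high-order bytes of rt left untouched
    return (reg & keep_mask) | loaded
```

With memory whose byte at address n holds the value n and 0x0A0B0C0D standing in for A B C D, an address of 4 gives 0x0A0B0C04, matching the A B C 4 result in the figure. Applied to a register that already holds 1 2 3 D from an LWL at address 1, the same LWR at address 4 completes the unaligned word 1 2 3 4.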
      Format:
             LWU rt, offset(base)
      Description:
             The 16-bit offset is sign-extended and added to the contents of general purpose
             register base to form a virtual address. The contents of the word at the memory
             location specified by the address are loaded into general purpose register rt. The
             loaded word is zero-extended in 64-bit mode.
             If either of the low-order two bits of the effective address is not zero, an address
             error exception occurs.
             This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
             Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
             causes a reserved instruction exception.
Operation:
             Remark        In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                           virtual address creation.
         Exceptions:
                TLB miss exception
                TLB invalid exception
                Bus error exception
                Address error exception
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
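The difference between LW and LWU in 64-bit mode is only the extension of the loaded word. The following Python sketch illustrates it under stated assumptions: big-endian byte order, a bytes-like memory, and a hypothetical function name load_word_64. The address error exception is represented by a Python ValueError.

```python
def load_word_64(memory, addr, unsigned=False):
    """Model of LW (unsigned=False) and LWU (unsigned=True) in 64-bit mode."""
    if addr & 3:
        # Either of the low-order two address bits is nonzero.
        raise ValueError("address error exception: address not word-aligned")
    word = int.from_bytes(memory[addr:addr + 4], "big")
    if not unsigned and word & 0x80000000:
        word |= 0xFFFFFFFF00000000            # LW: sign-extend bit 31
    return word                               # LWU: zero-extended
```

A word with bit 31 clear loads identically with either instruction; the results differ only when bit 31 is set.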
MFC0            Move From System Control Coprocessor            MFC0
31         26 25            21 20         16 15          11 10                              0
      COP0           MF              rt            rd                  0
     010000        00000                                         000 0000 0000
       6              5               5              5                 11
      Format:
              MFC0 rt, rd
      Description:
              The contents of general purpose register rd of the CP0 are loaded into general
              purpose register rt.
      Operation:
      32   T:      data ← CPR[0,rd]
           T+1: GPR[rt] ← data
      64   T:      data ← CPR[0,rd]
           T+1: GPR[rt] ← (data31)^32 || data31...0
      Exceptions:
              Coprocessor unusable exception      (VR4300 in 64-/32-bit User and Supervisor
                                                  mode if CP0 is disabled)
         COPz           MF                rt              rd               0
       0 1 0 0 x x*    00000                                         000 0000 0000
            6            5                 5              5                11
         Format:
                  MFCz rt, rd
         Description:
                  The contents of general purpose register rd of CPz are loaded into general purpose
                  register rt.
Operation:
         32           T:     data ← CPR[z,rd]
                      T+1: GPR[rt] ← data
         64           T:     if rd0 = 0 then
                                     data ← CPR[z, rd4...1 || 0]31...0
                             else
                                     data ← CPR[z, rd4...1 || 0]63...32
                             endif
                      T+1: GPR[rt] ← (data31)^32 || data
          Exceptions:
                  Coprocessor unusable exception
        Bit # 31 30 29 28 27 26 25 24 23 22 21                                            0
       MFC1 0    1    0   0     0   1   0   0   0   0    0
        Bit # 31 30 29 28 27 26 25 24 23 22 21                                            0
       MFC2 0    1    0   0     1   0   0   0   0   0    0
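The 64-bit case of the MFCz operation selects one half of an even-numbered coprocessor register according to bit 0 of rd. A minimal Python sketch of that selection follows. The function name mfcz_64 and the use of a 32-entry list to hold the coprocessor registers are assumptions made for illustration.

```python
def mfcz_64(cpr, rd):
    """Model of MFCz in 64-bit mode.

    cpr: list of 32 coprocessor register values, each a 64-bit integer
    rd:  5-bit register number from the instruction
    """
    even = rd & ~1                            # rd4...1 || 0
    if rd & 1 == 0:
        data = cpr[even] & 0xFFFFFFFF         # bits 31...0
    else:
        data = (cpr[even] >> 32) & 0xFFFFFFFF # bits 63...32
    if data & 0x80000000:
        data |= 0xFFFFFFFF00000000            # sign-extend bit 31 of data
    return data
```

With register 2 holding 0x89ABCDEF01234567, rd = 2 returns the low half and rd = 3 returns the sign-extended high half.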
       SPECIAL              0                       rd                 0                MFHI
       000000         00 0000 0000                                  00000              010000
          6                10                       5                 5                   6
         Format:
                   MFHI rd
         Description:
                   The contents of special register HI are loaded into general purpose register rd.
                   To ensure proper operation in the event of interruptions, the two instructions
                   which follow a MFHI instruction may not be any of the instructions which modify
                   the HI register: MULT, MULTU, DIV, DIVU, MTHI, DMULT, DMULTU,
                   DDIV, DDIVU.
Operation:
32, 64 T: GPR[rd] ← HI
         Exceptions:
                   None
     SPECIAL            0                        rd                0                 MFLO
     000000       00 0000 0000                                   00000              010010
        6              10                        5                 5                   6
      Format:
                MFLO rd
      Description:
                The contents of special register LO are loaded into general purpose register rd.
                To ensure proper operation in the event of interruptions, the two instructions
                which follow a MFLO instruction may not be any of the instructions which
                modify the LO register: MULT, MULTU, DIV, DIVU, MTLO, DMULT,
                DMULTU, DDIV, DDIVU.
Operation:
32, 64 T: GPR[rd] ← LO
      Exceptions:
                None
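The spacing rule stated for MFHI and MFLO (neither of the two following instructions may modify the register that was read) lends itself to a mechanical check. The Python sketch below scans a list of mnemonics. The function name find_hilo_hazards and the list-of-strings input format are assumptions for illustration; the instruction sets are taken from the two descriptions above.

```python
HI_WRITERS = {"MULT", "MULTU", "DIV", "DIVU", "MTHI",
              "DMULT", "DMULTU", "DDIV", "DDIVU"}
LO_WRITERS = (HI_WRITERS - {"MTHI"}) | {"MTLO"}

def find_hilo_hazards(mnemonics):
    """Return (move_index, writer_index) pairs that violate the rule."""
    hazards = []
    for i, op in enumerate(mnemonics):
        if op == "MFHI":
            writers = HI_WRITERS
        elif op == "MFLO":
            writers = LO_WRITERS
        else:
            continue
        # Only the two instructions after the move are restricted.
        for j in (i + 1, i + 2):
            if j < len(mnemonics) and mnemonics[j] in writers:
                hazards.append((i, j))
    return hazards
```

For the sequence MULT, MFLO, NOP, MULT the check reports the pair (1, 3), because the second MULT is only two instructions after the MFLO.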
MTC0             Move To System Control Coprocessor             MTC0
  31          26 25             21 20           16 15          11 10                                0
        COP0            MT                rt              rd                  0
       010000         00100                                             000 0000 0000
         6               5                 5               5                 11
         Format:
                  MTC0 rt, rd
         Description:
                  The contents of general purpose register rt are loaded into general purpose register
                  rd of CP0.
                  Because the contents of the TLB may be altered by this instruction, the behavior
                  of load instructions, store instructions, and TLB operations immediately before
                  and after this instruction is undefined.
                  If the register manipulated by this instruction is used by an instruction before or
                  after this instruction, place that instruction at an appropriate position by referring
                  to Chapter 19 Coprocessor 0 Hazards.
Operation:
         Exceptions:
                  Coprocessor unusable exception        (VR4300 in 64-/32-bit User and Supervisor
                                                        mode if CP0 is disabled)
       COPz            MT              rt              rd                0
     0 1 0 0 x x*     00100                                        000 0000 0000
          6             5               5              5                  11
       Format:
               MTCz rt, rd
       Description:
               The contents of general purpose register rt are loaded into general purpose register
               rd of CPz.
       Operation:
       32    T:   data ← GPR[rt]
             T+1: CPR[z, rd] ← data
       64    T:   data ← GPR[rt]31...0
             T+1: if rd0 = 0
                       CPR[z, rd4...1 || 0] ← CPR[z, rd4...1 || 0]63...32 || data
                  else
                       CPR[z, rd4...1 || 0] ← data || CPR[z, rd4...1 || 0]31...0
                  endif
       Exceptions:
               Coprocessor unusable exception
              Bit # 31 30 29 28 27 26 25 24 23 22 21                                 0
             MTC1 0    1    0   0     0   1   0   0   1   0   0
              Bit # 31 30 29 28 27 26 25 24 23 22 21                                 0
             MTC2 0    1    0   0     1   0   0   0   1   0   0
                      Opcode
                                                            Coprocessor Sub-opcode
                           Coprocessor Number
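The 64-bit case of the MTCz operation replaces one half of an even-numbered coprocessor register and preserves the other half. The following Python sketch illustrates this. The function name mtcz_64 and the 32-entry list used as the register file are assumptions, not part of the architecture definition.

```python
def mtcz_64(cpr, rd, gpr_rt):
    """Model of MTCz in 64-bit mode; updates cpr in place.

    cpr:    list of 32 coprocessor register values, each a 64-bit integer
    rd:     5-bit register number from the instruction
    gpr_rt: value of general purpose register rt
    """
    data = gpr_rt & 0xFFFFFFFF                # GPR[rt]31...0
    even = rd & ~1                            # rd4...1 || 0
    if rd & 1 == 0:
        # Keep bits 63...32, replace bits 31...0.
        cpr[even] = (cpr[even] & 0xFFFFFFFF00000000) | data
    else:
        # Replace bits 63...32, keep bits 31...0.
        cpr[even] = (data << 32) | (cpr[even] & 0xFFFFFFFF)
```

Writing through rd = 4 and then rd = 5 updates the low and high halves of register 4 in turn.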
     SPECIAL                      rs                0                          MTHI
     000000                                 000 000000000000                  010001
       6                           5                15                           6
      Format:
             MTHI rs
      Description:
             The contents of general purpose register rs are loaded into special register HI.
             If the MTHI instruction is executed following the MULT, MULTU, DIV, or
             DIVU instruction, the operation is performed normally. However, if the MFLO,
             MFHI, MTLO, or MTHI instruction is executed following the MTHI instruction,
             the contents of special register LO are undefined.
Operation:
      Exceptions:
             None
         SPECIAL                   rs                   0                         MTLO
        000000                                   000000000000000                 010011
            6                      5                    15                          6
         Format:
                MTLO rs
         Description:
                The contents of general purpose register rs are loaded into special register LO.
                If the MTLO instruction is executed following the MULT, MULTU, DIV, or
                DIVU instruction, the operation is performed normally. However, if the MFLO,
                MFHI, MTLO, or MTHI instruction is executed following the MTLO instruction,
                the contents of special register HI are undefined.
         Operation:
             32,64       T–2: LO ← undefined
                         T–1: LO ← undefined
                         T:        LO ← GPR[rs]
         Exceptions:
                None
     Format:
            MULT rs, rt
     Description:
            The contents of general purpose registers rs and rt are multiplied, treating both
            operands as 32-bit signed integers. An integer overflow exception never occurs.
            In 64-bit mode, the operands must be valid 32-bit, sign-extended values.
            When the operation completes, the low-order word of the doubleword result is
            loaded into special register LO, and the high-order word of the doubleword result
            is loaded into special register HI. In 64-bit mode, the respective results are
            sign-extended and stored.
            If either of the two instructions immediately preceding this instruction is an
            MFHI or MFLO instruction, the execution result of that transfer instruction is
            undefined. To obtain the correct result, insert two or more other instructions
            between the MFHI or MFLO instruction and the MULT instruction.
         Operation:
             32     T–2: LO        ← undefined
                         HI        ← undefined
                    T–1: LO        ← undefined
                         HI        ← undefined
                    T:   t         ← GPR[rs] * GPR[rt]
                         LO        ← t31...0
                         HI        ← t63...32
             64     T–2: LO        ← undefined
                         HI        ← undefined
                    T–1: LO        ← undefined
                         HI        ← undefined
                    T:   t         ← GPR[rs]31...0 * GPR[rt]31...0
                         LO        ← (t31)^32 || t31...0
                         HI        ← (t63)^32 || t63...32
         Exceptions:
                  None
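The 64-bit MULT operation above can be illustrated with a short Python sketch. The names sign_extend_32 and mult are hypothetical; the sketch returns HI and LO as 64-bit register images, each sign-extended from a 32-bit half of the product.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def sign_extend_32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def mult(rs, rt):
    """Return (HI, LO) after MULT in 64-bit mode."""
    t = (sign_extend_32(rs) * sign_extend_32(rt)) & MASK64   # 64-bit product
    lo = sign_extend_32(t) & MASK64           # low word, sign-extended
    hi = sign_extend_32(t >> 32) & MASK64     # high word, sign-extended
    return hi, lo
```

Note that LO is sign-extended from bit 31 of the product even when the product itself is positive: 0x7FFFFFFF multiplied by 2 gives HI = 0 and LO = 0xFFFFFFFFFFFFFFFE.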
     SPECIAL            rs              rt                 0                        MULTU
     000000                                           00 0000 0000                 011001
        6                5               5                 10                         6
      Format:
               MULTU rs, rt
      Description:
               The contents of general purpose register rs and the contents of general purpose
               register rt are multiplied, treating both operands as 32-bit unsigned values. An
               overflow exception never occurs.
               In 64-bit mode, the operands must be valid 32-bit, sign-extended values.
               When the operation completes, the low-order word of the doubleword result is
               loaded into special register LO, and the high-order word of the doubleword result
               is loaded into special register HI. In 64-bit mode, these results are sign-extended
               and loaded.
               If either of the two preceding instructions is MFHI or MFLO, the execution result
               of that transfer instruction is undefined. To obtain the correct result, insert two
               or more additional instructions between the MFHI or MFLO instruction and the
               MULTU instruction.
Operation:
         32   T–2: LO   ← undefined
                   HI   ← undefined
              T–1: LO   ← undefined
                   HI   ← undefined
              T:   t    ← (0 || GPR[rs]) * (0 || GPR[rt])
                   LO   ← t31...0
                   HI   ← t63...32
         64   T–2: LO   ← undefined
                   HI   ← undefined
              T–1: LO   ← undefined
                   HI   ← undefined
              T:   t    ← (0 || GPR[rs]31...0) * (0 || GPR[rt]31...0)
                   LO   ← (t31)^32 || t31...0
                   HI   ← (t63)^32 || t63...32
         Exceptions:
                None
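The 64-bit MULTU operation differs from MULT only in treating the low 32 bits of each operand as unsigned; the two result halves are still sign-extended when placed in HI and LO. A minimal Python sketch, with the hypothetical function name multu:

```python
def multu(rs, rt):
    """Return (HI, LO) after MULTU in 64-bit mode."""

    def sext(word):
        # Sign-extend a 32-bit word to a 64-bit register image.
        word &= 0xFFFFFFFF
        return (word | 0xFFFFFFFF00000000) if word & 0x80000000 else word

    t = (rs & 0xFFFFFFFF) * (rt & 0xFFFFFFFF)  # unsigned 64-bit product
    return sext(t >> 32), sext(t)
```

For operands whose low words are 0xFFFFFFFF and 2, the product is 0x1FFFFFFFE, so HI receives 1 and LO receives the sign-extended word 0xFFFFFFFFFFFFFFFE.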
      SPECIAL             rs              rt             rd            0             NOR
     000000                                                          00000          100111
         6                 5               5             5             5               6
       Format:
                 NOR rd, rs, rt
       Description:
                 A logical NOR operation applied between the contents of general purpose
                 registers rs and rt is executed in bit units. The result is stored in general purpose
                 register rd.
Operation:
       Exceptions:
                 None
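As a brief illustration of the bitwise NOR described above, the following Python sketch computes the result for a given register width. The function name nor and the width parameter are assumptions for illustration only.

```python
def nor(rs, rt, width=64):
    """Bitwise NOR of two register values, truncated to width bits."""
    mask = (1 << width) - 1
    return ~(rs | rt) & mask
```

When one operand is zero, the result is the bitwise complement of the other operand.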
OR                                                Or                                            OR
  31              26 25            21 20         16 15           11 10          6    5              0
        SPECIAL             rs             rt              rd            0              OR
       000000                                                          00000          100101
           6                 5              5              5             5               6
         Format:
                   OR rd, rs, rt
         Description:
                   A logical OR operation applied between the contents of general purpose registers
                   rs and rt is executed in bit units. The result is stored in general purpose register
                   rd.
Operation:
         Exceptions:
                   None
       ORI             rs             rt                          immediate
     001101
        6              5               5                             16
      Format:
              ORI rt, rs, immediate
      Description:
              A logical OR operation applied between 16-bit zero-extended immediate and the
              contents of general purpose register rs is executed in bit units. The result is stored
              in general purpose register rt.
Operation:
      Exceptions:
              None
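Because the ORI immediate is zero-extended rather than sign-extended, ORI affects only the low-order 16 bits of the result. The Python sketch below shows this, together with the common pairing of LUI and ORI to form a 32-bit constant. The function names ori and load_const32 are hypothetical, and the pairing is shown in a 32-bit view, without the 64-bit sign extension performed by LUI.

```python
def ori(rs, immediate):
    """Model of ORI: OR rs with the zero-extended 16-bit immediate."""
    return rs | (immediate & 0xFFFF)

def load_const32(value):
    """Build a 32-bit constant as LUI rt, upper followed by ORI rt, rt, lower."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    rt = upper << 16                          # LUI: upper half, low half zero
    return ori(rt, lower)                     # ORI: fill in the low half
```

An immediate with bit 15 set, such as 0x8034, sets only bits within the low-order halfword of the result.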
SB                                      Store Byte                                           SB
  31          26 25            21 20          16 15                                              0
         SB             base            rt                             offset
       101000
          6               5              5                             16
         Format:
                SB rt, offset(base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The least-significant byte of register rt is
                stored in the memory specified by the address.
Operation:
         Exceptions:
                TLB miss exception
                TLB invalid exception
                TLB modification exception
                Bus error exception
                Address error exception
SC                             Store Conditional                                            SC
31         26 25            21 20           16 15                                                0
       SC            base             rt                             offset
     111000
        6               5              5                             16
      Format:
              SC rt, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The contents of general purpose register rt
              are stored at the memory location specified by the address only when the LL bit is
              set.
              If another processor or device modifies the target physical address after the
              preceding LL instruction has been executed, or if an ERET instruction is executed
              between the LL and SC instructions, the register contents are not stored to the
              memory, and the store fails.
              The success or failure of the SC operation is indicated by the contents of general
              purpose register rt after execution of the instruction. A successful SC instruction
              sets the contents of general purpose register rt to 1; an unsuccessful SC instruction
              sets it to 0.
              The operation of SC is undefined when the address is different from the address
              used in the last LL instruction.
              This instruction is available in User mode; it is not necessary for CP0 to be
              enabled.
              If either of the low-order two bits of the address is not zero, an address error
              exception takes place.
              If this instruction both fails and causes an exception, the exception takes
              precedence.
              This instruction is defined to maintain software compatibility with the VR4400.
           Operation:
      32     T:   vAddr ← ((offset_15)^16 || offset_15...0) + GPR[base]
                  (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                  data ← GPR[rt]_31...0
                  if LLbit then
                       StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
                  endif
                  GPR[rt] ← 0^31 || LLbit
      64     T:   vAddr ← ((offset_15)^48 || offset_15...0) + GPR[base]
                  (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                  data ← GPR[rt]_31...0
                  if LLbit then
                       StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
                  endif
                  GPR[rt] ← 0^63 || LLbit
           Exceptions:
                  TLB miss exception
                  TLB invalid exception
                  TLB modification exception
                  Bus error exception
                  Address error exception
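The interaction between the LL bit and SC can be sketched in C as follows. This is a simplified model under stated assumptions: the type `sc_cpu_t` and the functions `ll_model`, `sc_model`, and `sc_break_link` are hypothetical names, a plain C variable stands in for the memory word, and address translation and exceptions are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool ll_bit;   /* models the LL bit */
} sc_cpu_t;

/* LL: load the word and set the LL bit. */
uint32_t ll_model(sc_cpu_t *cpu, const uint32_t *word)
{
    assert(cpu && word);
    cpu->ll_bit = true;
    return *word;
}

/* Hypothetical hook: an intervening ERET, or a write to the location by
   another processor or device, causes the following SC to fail. */
void sc_break_link(sc_cpu_t *cpu)
{
    assert(cpu);
    cpu->ll_bit = false;
}

/* SC: store rt only when the LL bit is set. The return value is what the
   instruction writes back to rt: 1 on success, 0 on failure. */
uint32_t sc_model(sc_cpu_t *cpu, uint32_t *word, uint32_t rt)
{
    assert(cpu && word);
    if (cpu->ll_bit) {
        *word = rt;
    }
    return cpu->ll_bit ? 1u : 0u;
}
```

Software typically retries the LL/SC sequence in a loop until the value returned in rt is 1.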
      Format:
              SCD rt, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The contents of general purpose register rt
              are stored at the memory location specified by the address only when the LL bit is
              set.
              If another processor or device modifies the target address after the preceding LLD
              instruction has been executed, or if an ERET instruction is executed between the
              LLD and SCD instructions, the register contents are not stored to the memory, and
              the store fails.
              The success or failure of the SCD operation is indicated by the contents of general
              purpose register rt after execution of the instruction. A successful SCD
              instruction sets the contents of general purpose register rt to 1; an unsuccessful
              SCD instruction sets it to 0.
              The operation of SCD is undefined when the address is different from the address
              used in the last LLD.
              This instruction is available in User mode; it is not necessary for CP0 to be
              enabled.
              If any of the low-order three bits of the address is not zero, an address error
              exception takes place.
              If this instruction both fails and causes an exception, the exception takes
              precedence.
              This instruction is defined in the 64-bit mode and 32-bit Kernel mode. If this
              instruction is executed in the 32-bit User or Supervisor mode, the reserved
              instruction exception occurs.
              This instruction is defined to maintain software compatibility with the VR4400.
Operation:
                Remark     In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                           virtual address creation.
         Exceptions:
                TLB miss exception
                TLB invalid exception
                TLB modification exception
                Bus error exception
                Address error exception
                Reserved instruction exception (32-bit User or Supervisor mode)
SD                             Store Doubleword                                             SD
31          26 25            21 20          16 15                                                0
       SD            base             rt                            offset
     111111
        6              5               5                             16
      Format:
              SD rt, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The contents of general purpose register rt
              are stored at the memory location specified by the address.
              If any of the low-order three bits of the address is not zero, an address error
              exception occurs.
              This operation is defined for the VR4300 operating in 64-bit mode and in 32-bit
              Kernel mode. Execution of this instruction in 32-bit User or Supervisor mode
              causes a reserved instruction exception.
Operation:
              Remark       In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                           virtual address creation.
         Exceptions:
                TLB miss exception
                TLB invalid exception
                TLB modification exception
                Bus error exception
                Address error exception
                Reserved instruction exception (32-bit User or Supervisor mode)
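The mode and alignment conditions for SD can be summarized in a small C sketch. The enum and function names are hypothetical, and the order in which the two conditions are tested is an assumption of this example, not something stated in the text above.

```c
#include <assert.h>   /* for the example checks in the usage note */
#include <stdbool.h>
#include <stdint.h>

typedef enum { SD_MODE_KERNEL, SD_MODE_SUPERVISOR, SD_MODE_USER } sd_mode_t;
typedef enum { SD_OK, SD_RESERVED_INSTRUCTION, SD_ADDRESS_ERROR } sd_check_t;

/* Hypothetical check: which condition (if any) SD would raise. */
sd_check_t sd_check(sd_mode_t mode, bool addressing_64bit, uint64_t vaddr)
{
    /* Defined in 64-bit mode and in 32-bit Kernel mode only. */
    if (!addressing_64bit && mode != SD_MODE_KERNEL) {
        return SD_RESERVED_INSTRUCTION;
    }
    /* The address must be doubleword aligned: low-order three bits zero. */
    if ((vaddr & 0x7u) != 0) {
        return SD_ADDRESS_ERROR;
    }
    return SD_OK;
}
```

For instance, `sd_check(SD_MODE_USER, false, 0)` yields `SD_RESERVED_INSTRUCTION`, while `sd_check(SD_MODE_KERNEL, false, 8)` yields `SD_OK`.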
        Format:
                SDCz rt, offset(base)
        Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. Register rt of coprocessor unit z sources a
                doubleword, which the processor writes to the addressed memory location. The
                stored data is defined by individual coprocessor specifications.
                If any of the low-order three bits of the address is not zero, an address error
                exception takes place.
                This instruction is not valid for use with CP0.
                When the CP1 is specified, the FR bit of the Status register equals 0, and the least-
                significant bit in the rt field is not 0, the operation of this instruction is undefined.
                If the FR bit equals 1, both odd and even registers can be specified by rt.
                                 Store Doubleword
SDCz                            From Coprocessor z                  SDCz
                                    (continued)
Operation:
         Exceptions:
                TLB miss exception
                TLB invalid exception
                TLB modification exception
                Bus error exception
                Address error exception
                Coprocessor unusable exception
        Opcode Bit Encoding:
               Bit #   31  30  29  28  27  26
               SDC1     1   1   1   1   0   1
               SDC2     1   1   1   1   1   0
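The opcode encoding and the FR-bit restriction can be illustrated with a short C sketch. It assumes SDCz uses the same opcode/base/rt/offset field layout as the other store instructions in this chapter; `sdcz_encode` and `sdc1_rt_defined` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an SDCz instruction word: opcode(6) | base(5) | rt(5) | offset(16). */
uint32_t sdcz_encode(unsigned z, unsigned base, unsigned rt, uint16_t offset)
{
    assert(z == 1 || z == 2);          /* not valid for CP0 */
    assert(base < 32 && rt < 32);
    uint32_t opcode = 0x3Cu | z;       /* 111101 for SDC1, 111110 for SDC2 */
    return (opcode << 26) | ((uint32_t)base << 21) |
           ((uint32_t)rt << 16) | offset;
}

/* For CP1: with FR = 0 only even-numbered registers give a defined result. */
bool sdc1_rt_defined(bool fr, unsigned rt)
{
    return fr || (rt & 1u) == 0;
}
```

Under these assumptions, `sdcz_encode(1, 29, 2, 8)` produces 0xF7A20008, which corresponds to SDC1 with rt = 2, base = 29, and offset = 8.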
      Format:
                SDL rt, offset(base)
      Description:
                This instruction is used in combination with the SDR instruction to store the
                doubleword data in the register to the doubleword in the memory that is not at the
                doubleword boundary. The SDL instruction stores the high-order portion of the
                data to the memory, while the SDR instruction stores the low-order portion.
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to generate a virtual address. Of the doubleword data in the memory
                whose most-significant byte is specified by the generated address, only the high-
                order portion of general purpose register rt is stored to the memory at the same
                doubleword boundary as the target address. Depending on the address specified,
                the number of bytes to be stored changes from 1 to 8.
                In other words, first the most-significant byte position of general purpose register
                rt is stored to the bytes in the addressed memory. If there is data of the low-order
                byte that follows the same doubleword boundary, the operation to store this data
                to the next byte of the memory is repeated.
                   memory (big-endian)                              register
   before storing:
     address 8:     8    9   10   11   12   13   14   15
     address 0:     0    1    2    3    4    5    6    7            $24 = A B C D E F G H

                   SDL $24,1($0)

   after storing:
     address 8:     8    9   10   11   12   13   14   15
     address 0:     0    A    B    C    D    E    F    G
                  The address error exception does not occur even if the specified address is not
                  located at the doubleword boundary. This operation is defined in the 64-bit mode
                  and 32-bit Kernel mode. If this instruction is executed in the 32-bit User or
                  Supervisor mode, the reserved instruction exception occurs.
Operation:
                  Remark     In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                             virtual address creation.
                 SDL
                 Register   A B C D E F G H
                 Memory     I J K L M N O P

                       BigEndianCPU = 0                          BigEndianCPU = 1
 vAddr2...0    destination        type   offset          destination        type   offset
                                         LEM  BEM                                  LEM  BEM
     0         I J K L M N O A     0      0    7         A B C D E F G H     7      0    0
     1         I J K L M N A B     1      0    6         I A B C D E F G     6      0    1
     2         I J K L M A B C     2      0    5         I J A B C D E F     5      0    2
     3         I J K L A B C D     3      0    4         I J K A B C D E     4      0    3
     4         I J K A B C D E     4      0    3         I J K L A B C D     3      0    4
     5         I J A B C D E F     5      0    2         I J K L M A B C     2      0    5
     6         I A B C D E F G     6      0    1         I J K L M N A B     1      0    6
     7         A B C D E F G H     7      0    0         I J K L M N O A     0      0    7
                     Remark Type:        access type output to memory (Refer to Figure 3-2 Byte
                                         Access within a Doubleword.)
                                 Offset: pAddr2...0 output to memory
                                         LEM Little-endian memory (BigEndianMem = 0)
                                         BEM Big-endian memory (BigEndianMem = 1)
        Exceptions:
                     TLB miss exception
                     TLB invalid exception
                     TLB modification exception
                     Bus error exception
                     Address error exception
                     Reserved instruction exception (32-bit User or Supervisor mode)
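The byte selection performed by SDL can be sketched in C for the big-endian case (BigEndianCPU = 1). This is an illustrative model only: a flat byte array stands in for memory, the function name `sdl_big_endian` is hypothetical, and translation, mode checks, and exceptions are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical big-endian SDL model: starting at vaddr, store the
   high-order bytes of rt up to the end of the enclosing doubleword. */
void sdl_big_endian(uint8_t *mem, size_t mem_size, uint64_t vaddr, uint64_t rt)
{
    uint64_t dw_base = vaddr & ~(uint64_t)7;   /* enclosing doubleword */
    unsigned first = (unsigned)(vaddr & 7);    /* byte position of vaddr */
    assert(dw_base + 8 <= mem_size);           /* stay inside the toy memory */

    /* Byte 0 of the loop is the most-significant register byte. */
    for (unsigned i = 0; first + i < 8; i++) {
        mem[dw_base + first + i] = (uint8_t)(rt >> (56 - 8 * i));
    }
}
```

With memory bytes 0 to 7 holding the values 0 to 7 and the register holding the bytes A through H, a call with `vaddr` = 1 leaves byte 0 unchanged and writes A through G into bytes 1 to 7, matching the BigEndianCPU = 1 row for vAddr2...0 = 1.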
         Format:
                  SDR rt, offset(base)
         Description:
                  This instruction is used in combination with the SDL instruction to store the
                  doubleword data in the register to the doubleword in the memory that is not at the
                  doubleword boundary. The SDL instruction stores the high-order portion of the
                  data to the memory, while the SDR instruction stores the low-order portion.
                  The 16-bit offset is sign-extended and added to the contents of general purpose
                  register base to generate a virtual address. Of the doubleword data in the memory
                  whose least-significant byte is specified by the generated address, only the low-
                  order portion of general purpose register rt is stored to the memory at the same
                  doubleword boundary as the target address. Depending on the address specified,
                  the number of bytes to be stored changes from 1 to 8.
                  In other words, first the least-significant byte position of general purpose register
                  rt is stored to the bytes in the addressed memory. If there is data of the high-order
                  byte that follows the same doubleword boundary, the operation to store this data
                  to the next byte of the memory is repeated.
                   memory (big-endian)                              register
   before storing:
     address 8:     8    9   10   11   12   13   14   15            $24 = A B C D E F G H
     address 0:     0    1    2    3    4    5    6    7

                   SDR $24,10($0)

   after storing:
     address 8:     8    9   10   11   12   13   14   15
     address 0:     E    F    G    H    4    5    6    7
            The address error exception does not occur even if the specified address is not
            located at the doubleword boundary. This operation is defined in the 64-bit mode
            and 32-bit Kernel mode. If this instruction is executed in the 32-bit User or
            Supervisor mode, the reserved instruction exception occurs.
Operation:
            Remark     In the 32-bit Kernel mode, the high-order 32 bits are ignored during
                       virtual address creation.
                 SDR
                 Register   A B C D E F G H
                 Memory     I J K L M N O P

                       BigEndianCPU = 0                          BigEndianCPU = 1
 vAddr2...0    destination        type   offset          destination        type   offset
                                         LEM  BEM                                  LEM  BEM
     0         A B C D E F G H     7      0    0         H J K L M N O P     0      7    0
     1         B C D E F G H P     6      1    0         G H K L M N O P     1      6    0
     2         C D E F G H O P     5      2    0         F G H L M N O P     2      5    0
     3         D E F G H N O P     4      3    0         E F G H M N O P     3      4    0
     4         E F G H M N O P     3      4    0         D E F G H N O P     4      3    0
     5         F G H L M N O P     2      5    0         C D E F G H O P     5      2    0
     6         G H K L M N O P     1      6    0         B C D E F G H P     6      1    0
     7         H J K L M N O P     0      7    0         A B C D E F G H     7      0    0
                    Remark Type:         access type output to memory (Refer to Figure 3-2 Byte
                                         Access within a Doubleword.)
                                 Offset: pAddr2...0 output to memory
                                         LEM Little-endian memory (BigEndianMem = 0)
                                         BEM Big-endian memory (BigEndianMem = 1)
          Exceptions:
                    TLB miss exception
                    TLB invalid exception
                    TLB modification exception
                    Bus error exception
                    Address error exception
                    Reserved instruction exception (32-bit User or Supervisor mode)
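A matching C sketch for SDR in the big-endian case (BigEndianCPU = 1) is shown below. As before, the flat byte array and the function name `sdr_big_endian` are assumptions for illustration, and translation, mode checks, and exceptions are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical big-endian SDR model: the least-significant register byte
   goes to vaddr, and higher-order bytes fill the lower addresses down to
   the start of the enclosing doubleword. */
void sdr_big_endian(uint8_t *mem, size_t mem_size, uint64_t vaddr, uint64_t rt)
{
    uint64_t dw_base = vaddr & ~(uint64_t)7;   /* enclosing doubleword */
    unsigned last = (unsigned)(vaddr & 7);     /* byte position of vaddr */
    assert(dw_base + 8 <= mem_size);           /* stay inside the toy memory */

    /* Byte 0 of the loop is the least-significant register byte. */
    for (unsigned i = 0; i <= last; i++) {
        mem[dw_base + last - i] = (uint8_t)(rt >> (8 * i));
    }
}
```

With the register holding the bytes A through H, a call with vAddr2...0 = 3 writes E F G H into bytes 0 to 3 of the doubleword and leaves bytes 4 to 7 unchanged, as in the BigEndianCPU = 1 row for vAddr2...0 = 3.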
SH                                  Store Halfword                                          SH
31          26 25           21 20           16 15                                                0
       SH            base             rt                             offset
     101001
        6               5              5                             16
      Format:
              SH rt, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The least-significant halfword of register
              rt is stored in the memory specified by the address.
              If the least-significant bit of the address is not zero, an address error exception
              occurs.
      Operation:
      32    T:     vAddr ← ((offset_15)^16 || offset_15...0) + GPR[base]
                   (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                   pAddr ← pAddr_PSIZE-1...3 || (pAddr_2...0 xor (ReverseEndian^2 || 0))
                   byte ← vAddr_2...0 xor (BigEndianCPU^2 || 0)
                   data ← GPR[rt]_63-8*byte...0 || 0^(8*byte)
                   StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)
      64    T:     vAddr ← ((offset_15)^48 || offset_15...0) + GPR[base]
                   (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                   pAddr ← pAddr_PSIZE-1...3 || (pAddr_2...0 xor (ReverseEndian^2 || 0))
                   byte ← vAddr_2...0 xor (BigEndianCPU^2 || 0)
                   data ← GPR[rt]_63-8*byte...0 || 0^(8*byte)
                   StoreMemory (uncached, HALFWORD, data, pAddr, vAddr, DATA)
         Exceptions:
                TLB miss exception
                TLB invalid exception
                TLB modification exception
                Bus error exception
                Address error exception
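The alignment rule and the halfword selection for SH can be sketched in C. The example assumes a big-endian byte order and a flat byte array in place of memory; `sh_model_be` is a hypothetical name, and a `false` return value merely stands in for the address error exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical big-endian SH model. Returns false where the processor
   would raise an address error (least-significant address bit not zero). */
bool sh_model_be(uint8_t *mem, size_t mem_size, uint64_t vaddr, uint64_t rt)
{
    if (vaddr & 1u) {
        return false;                    /* misaligned halfword address */
    }
    assert(vaddr + 2 <= mem_size);       /* stay inside the toy memory */
    mem[vaddr]     = (uint8_t)(rt >> 8); /* high byte of the low halfword */
    mem[vaddr + 1] = (uint8_t)rt;        /* low byte of the low halfword */
    return true;
}
```

Only bits 15...0 of rt reach memory; storing 0x12345678 at address 2 writes the bytes 0x56 and 0x78.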
      SPECIAL         0                  rt             rd              sa             SLL
     000000         00000                                                            000000
         6            5                   5             5               5               6
       Format:
                SLL rd, rt, sa
       Description:
                The contents of general purpose register rt are shifted left by sa bits, inserting
                zeros into the low-order bits. The result is stored in general purpose register rd.
                In the 64-bit mode, the value resulting from sign-extending the shifted 32-bit
                value is stored as a result. If the shift value is 0, the low-order 32 bits of the
                64-bit value are sign-extended. This instruction can therefore generate a 64-bit
                value that sign-extends a 32-bit value.
Operation:
       64    T:     s ← 0 || sa
                    temp ← GPR[rt]_31-s...0 || 0^s
                    GPR[rd] ← (temp_31)^32 || temp
       Exceptions:
                None
                Caution If the shift value of this instruction is 0, the assembler may treat
                        this instruction as NOP. When using this instruction for sign
                        extension, check the specifications of the assembler.
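The 64-bit mode behavior of SLL, including the sign extension that occurs even for a shift amount of 0, can be modeled in C as follows. The function name `sll_model64` is hypothetical and the sketch covers only the 64-bit operation shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of SLL in 64-bit mode: shift the low 32 bits of rt
   left by sa, then sign-extend the 32-bit result to 64 bits. */
uint64_t sll_model64(uint64_t rt, unsigned sa)
{
    assert(sa < 32);                        /* sa is a 5-bit field */
    uint32_t temp = (uint32_t)rt << sa;     /* bits shifted past bit 31 are lost */
    return (temp & 0x80000000u)
               ? (UINT64_C(0xFFFFFFFF00000000) | temp)
               : (uint64_t)temp;
}
```

With sa = 0 the model returns the sign extension of the low-order 32 bits, for example 0xFFFFFFFF9ABCDEF0 for the input 0x123456789ABCDEF0.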
        SPECIAL            rs              rt               rd            0                 SLLV
       000000                                                           00000              000100
           6                5               5               5             5                  6
         Format:
                  SLLV rd, rt, rs
         Description:
                  The contents of general purpose register rt are shifted left the number of bits
                  specified by the low-order five bits of the contents of the general purpose register
                  rs, inserting zeros into the low-order bits. The result is stored in general purpose
                  register rd. In the 64-bit mode, the value resulting from sign-extending the shifted
                  32-bit value is stored as a result. If the shift value is 0, the low-order 32 bits of the
                  64-bit value are sign-extended. This instruction can therefore generate a 64-bit
                  value that sign-extends a 32-bit value.
Operation:
         32   T:      s ← GPR[rs]_4...0
                      GPR[rd] ← GPR[rt]_(31-s)...0 || 0^s
         64   T:      s ← 0 || GPR[rs]_4...0
                      temp ← GPR[rt]_(31-s)...0 || 0^s
                      GPR[rd] ← (temp_31)^32 || temp
         Exceptions:
                  None
                  Caution If the shift value of this instruction is 0, the assembler may treat
                          this instruction as NOP. When using this instruction for sign
                          extension, check the specifications of the assembler.
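A brief C sketch of the variable shift shows that only the low-order five bits of rs are used as the shift amount. The function name `sllv_model64` is hypothetical and only the 64-bit operation is modeled.

```c
#include <assert.h>   /* for the example checks in the usage note */
#include <stdint.h>

/* Hypothetical model of SLLV in 64-bit mode. */
uint64_t sllv_model64(uint64_t rt, uint64_t rs)
{
    unsigned s = (unsigned)(rs & 0x1Fu);    /* low-order five bits of rs */
    uint32_t temp = (uint32_t)rt << s;      /* 32-bit shift, zeros shifted in */
    return (temp & 0x80000000u)
               ? (UINT64_C(0xFFFFFFFF00000000) | temp)
               : (uint64_t)temp;
}
```

Because of the masking, a value of 33 in rs shifts by 1 and a value of 32 shifts by 0, so `assert(sllv_model64(1, 33) == 2);` holds in this model.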
      SPECIAL            rs              rt             rd           0                  SLT
     000000                                                        00000              101010
         6                5               5             5            5                  6
       Format:
                SLT rd, rs, rt
       Description:
                The contents of general purpose register rt are subtracted from the contents of
                general purpose register rs. Treating these register contents as signed integers,
                if the contents of general purpose register rs are less than the contents of general
                purpose register rt, one is stored in the general purpose register rd; otherwise zero
                is stored in the general purpose register rd.
                An integer overflow exception never occurs. The comparison is valid even if the
                subtraction used during the comparison overflows.
Operation:
       Exceptions:
                None
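The remark that the comparison stays valid when the subtraction overflows can be illustrated in C by comparing the operands directly instead of inspecting the sign of a difference. The function name `slt_model` is hypothetical.

```c
#include <assert.h>   /* for the example checks in the usage note */
#include <stdint.h>

/* Hypothetical model of SLT: signed less-than, result is 1 or 0. */
uint64_t slt_model(int64_t rs, int64_t rt)
{
    /* A direct comparison never overflows, unlike testing the sign of
       rs - rt, which would be wrong for operands such as INT64_MIN and 1. */
    return (rs < rt) ? 1u : 0u;
}
```

For example, `slt_model(INT64_MIN, 1)` is 1 even though the difference of the two operands is not representable as a signed 64-bit value.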
         SLTI             rs             rt                        immediate
       001010
          6                  5            5                           16
         Format:
                SLTI rt, rs, immediate
         Description:
                The 16-bit immediate is sign-extended and subtracted from the contents of general
                purpose register rs. Assuming these values are signed integers, if rs contents are
                less than the sign-extended immediate, one is stored in the general purpose register
                rt; otherwise zero is stored in the general purpose register rt.
                An integer overflow exception never occurs. The comparison is valid even if the
                subtraction overflows.
Operation:
         Exceptions:
                None
      SLTIU           rs             rt                        immediate
     001011
        6              5                5                         16
      Format:
              SLTIU rt, rs, immediate
      Description:
              The 16-bit immediate is sign-extended and subtracted from the contents of general
              purpose register rs. Assuming these values are unsigned integers, if rs contents
              are less than the sign-extended immediate, one is stored in the general purpose
              register rt; otherwise zero is stored in the general purpose register rt.
              An integer overflow exception never occurs. The comparison is valid even if the
              subtraction overflows.
Operation:
      Exceptions:
              None
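SLTIU sign-extends the immediate and then performs an unsigned comparison, a combination that is easy to misread. The following C sketch models it; the names `sltiu_sext16` and `sltiu_model` are hypothetical.

```c
#include <assert.h>   /* for the example checks in the usage note */
#include <stdint.h>

/* Sign-extend a 16-bit immediate to 64 bits, returned as an unsigned value. */
static uint64_t sltiu_sext16(uint16_t v)
{
    return (v & 0x8000u) ? (UINT64_C(0xFFFFFFFFFFFF0000) | v) : (uint64_t)v;
}

/* Hypothetical model of SLTIU: unsigned compare against the
   sign-extended immediate, result is 1 or 0. */
uint64_t sltiu_model(uint64_t rs, uint16_t immediate)
{
    return (rs < sltiu_sext16(immediate)) ? 1u : 0u;
}
```

An immediate of 0xFFFF therefore acts as the largest unsigned value: `sltiu_model(5, 0xFFFF)` is 1, and the result is 0 only when rs itself is 0xFFFFFFFFFFFFFFFF.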
        SPECIAL            rs             rt              rd            0              SLTU
       000000                                                         00000          101011
           6                5              5              5             5               6
         Format:
                  SLTU rd, rs, rt
         Description:
                  The contents of general purpose register rt are subtracted from the contents of
                  general purpose register rs. Assuming these values are unsigned integers, if the
                  contents of general purpose register rs are less than the contents of general
                  purpose register rt, one is stored in the general purpose register rd; otherwise zero
                  is stored in the general purpose register rd.
                  An integer overflow exception never occurs. The comparison is valid even if the
                  subtraction overflows.
Operation:
         Exceptions:
                  None
      SPECIAL          0                 rt             rd             sa            SRA
     000000          00000                                                         000011
         6             5                  5             5              5              6
       Format:
                SRA rd, rt, sa
       Description:
                The contents of general purpose register rt are shifted right by sa bits, inserting
                copies of the sign bit into the high-order bits. The result is stored in the general
                purpose register rd. In 64-bit mode, the sign-extended 32-bit value is stored as the
                result.
Operation:
       64   T:      s ← 0 || sa
                    temp ← (GPR[rt]_31)^s || GPR[rt]_31...s
                    GPR[rd] ← (temp_31)^32 || temp
       Exceptions:
                None
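The arithmetic right shift can be written in portable C without relying on the behavior of `>>` for negative signed values. This is a sketch of the 64-bit operation only, and `sra_model64` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of SRA in 64-bit mode: shift the low 32 bits of rt
   right by sa, filling with copies of bit 31, then sign-extend to 64 bits. */
uint64_t sra_model64(uint64_t rt, unsigned sa)
{
    assert(sa < 32);                                /* sa is a 5-bit field */
    uint32_t low = (uint32_t)rt;
    uint32_t temp = low >> sa;                      /* logical shift first */
    if ((low & 0x80000000u) && sa > 0) {
        temp |= (uint32_t)~(UINT32_C(0xFFFFFFFF) >> sa);  /* top sa bits = 1 */
    }
    return (temp & 0x80000000u)
               ? (UINT64_C(0xFFFFFFFF00000000) | temp)
               : (uint64_t)temp;
}
```

For the input 0x80000000 and sa = 4, the 32-bit intermediate is 0xF8000000 and the stored 64-bit result is 0xFFFFFFFFF8000000.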
                                       Shift Right
SRAV                              Arithmetic Variable                              SRAV
  31          26 25             21 20          16 15           11 10          6   5               0
        SPECIAL           rs             rt              rd            0              SRAV
       000000                                                        00000          000111
           6               5              5              5             5               6
         Format:
                  SRAV rd, rt, rs
         Description:
                  The contents of general purpose register rt are shifted right by the number of bits
                  specified by the low-order five bits of general purpose register rs, sign-extending
                  the high-order bits. The result is stored in the general purpose register rd. In 64-
                  bit mode, the sign-extended 32-bit value is stored as the result.
Operation:
         32   T:      s ← GPR[rs]_4...0
                      GPR[rd] ← (GPR[rt]_31)^s || GPR[rt]_31...s
         64   T:      s ← GPR[rs]_4...0
                      temp ← (GPR[rt]_31)^s || GPR[rt]_31...s
                      GPR[rd] ← (temp_31)^32 || temp
         Exceptions:
                  None
      SPECIAL         0                  rt             rd             sa             SRL
     000000         00000                                                           000010
         6               5                5             5              5               6
       Format:
                SRL rd, rt, sa
       Description:
                The contents of general purpose register rt are shifted right by sa bits, inserting
                zeros into the high-order bits. The result is stored in the general purpose register
                rd. In 64-bit mode, the sign-extended 32-bit value is stored as the result.
Operation:
       32    T:     GPR[rd] ← 0^sa || GPR[rt]_31...sa
       64    T:     s ← 0 || sa
                    temp ← 0^s || GPR[rt]_31...s
                    GPR[rd] ← (temp_31)^32 || temp
       Exceptions:
                None
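Although SRL inserts zeros, the 64-bit operation still sign-extends the 32-bit result, which matters when the shift amount is 0. A minimal C sketch, with the hypothetical name `srl_model64`:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of SRL in 64-bit mode: logical right shift of the
   low 32 bits of rt, then sign extension of the 32-bit result. */
uint64_t srl_model64(uint64_t rt, unsigned sa)
{
    assert(sa < 32);                        /* sa is a 5-bit field */
    uint32_t temp = (uint32_t)rt >> sa;     /* zeros enter from the left */
    return (temp & 0x80000000u)
               ? (UINT64_C(0xFFFFFFFF00000000) | temp)
               : (uint64_t)temp;
}
```

Bit 31 of the intermediate can be 1 only when sa is 0, so `srl_model64(0x80000000u, 0)` is 0xFFFFFFFF80000000 while `srl_model64(0x80000000u, 1)` is 0x40000000.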
        SPECIAL           rs              rt              rd            0              SRLV
       000000                                                         00000          000110
           6               5               5              5             5               6
         Format:
                  SRLV rd, rt, rs
         Description:
                  The contents of general purpose register rt are shifted right by the number of bits
                  specified by the low-order five bits of general purpose register rs, inserting zeros
                  into the high-order bits. The result is stored in the general purpose register rd. In
                  64-bit mode, the sign-extended 32-bit value is stored as the result.
Operation:
         32    T:     s ← GPR[rs]_4...0
                      GPR[rd] ← 0^s || GPR[rt]_31...s
         64    T:     s ← GPR[rs]_4...0
                      temp ← 0^s || GPR[rt]_31...s
                      GPR[rd] ← (temp_31)^32 || temp
         Exceptions:
                  None
      SPECIAL            rs              rt             rd            0               SUB
     000000                                                         00000           100010
         6               5                5             5             5                6
       Format:
                SUB rd, rs, rt
       Description:
                The contents of general purpose register rt are subtracted from the contents of
                general purpose register rs, and the result is stored in general purpose register rd.
                In 64-bit mode, the sign-extended 32-bit value is stored as the result.
                An integer overflow exception occurs if the carries out of bits 30 and 31 differ (2’s
                complement overflow). The destination register rd is not modified when an
                integer overflow exception occurs.
       Operation:
       32    T:     GPR[rd] ← GPR[rs] – GPR[rt]
       Exceptions:
                Integer overflow exception
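The overflow rule for SUB can be sketched in C by performing the subtraction in a wider type and checking the 32-bit signed range. This model assumes the operation uses the low-order 32 bits of each operand; `sub_sext32` and `sub_model` are hypothetical names, and a `false` return value stands in for the integer overflow exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret the low 32 bits of v as a signed value. */
static int64_t sub_sext32(uint64_t v)
{
    uint32_t lo = (uint32_t)v;
    return (lo & 0x80000000u) ? (int64_t)lo - INT64_C(0x100000000) : (int64_t)lo;
}

/* Hypothetical model of SUB. On overflow, returns false and leaves *rd
   unmodified; otherwise writes the sign-extended 32-bit difference. */
bool sub_model(uint64_t rs, uint64_t rt, uint64_t *rd)
{
    assert(rd != NULL);
    int64_t diff = sub_sext32(rs) - sub_sext32(rt);  /* cannot overflow int64 */
    if (diff < INT32_MIN || diff > INT32_MAX) {
        return false;                                /* 2's complement overflow */
    }
    *rd = (uint64_t)diff;                            /* sign-extended result */
    return true;
}
```

Subtracting 1 from 0x80000000 (the most negative 32-bit value) reports overflow and leaves the destination untouched, while 5 minus 7 stores 0xFFFFFFFFFFFFFFFE.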
       SPECIAL           rs             rt              rd           0                SUBU
       000000                                                      00000             100011
          6               5              5              5            5                 6
         Format:
                SUBU rd, rs, rt
         Description:
                The contents of general purpose register rt are subtracted from the contents of
                general purpose register rs and the result is stored in general purpose register rd.
                 In 64-bit mode, the sign-extended 32-bit value is stored as the result.
                The only difference between this instruction and the SUB instruction is that
                SUBU never causes an integer overflow exception.
         Operation:
         32    T:     GPR[rd] ← GPR[rs] – GPR[rt]
         Exceptions:
                None
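For contrast with SUB, the wrap-around behavior of SUBU can be modeled in C with unsigned 32-bit arithmetic. The function name `subu_model64` is hypothetical, and the sketch assumes the operation uses the low-order 32 bits of each operand and sign-extends the result as described for 64-bit mode.

```c
#include <assert.h>   /* for the example checks in the usage note */
#include <stdint.h>

/* Hypothetical model of SUBU: subtraction modulo 2^32, never trapping,
   with the 32-bit result sign-extended to 64 bits. */
uint64_t subu_model64(uint64_t rs, uint64_t rt)
{
    uint32_t temp = (uint32_t)((uint32_t)rs - (uint32_t)rt);
    return (temp & 0x80000000u)
               ? (UINT64_C(0xFFFFFFFF00000000) | temp)
               : (uint64_t)temp;
}
```

The case that overflows under SUB, 0x80000000 minus 1, simply wraps to 0x7FFFFFFF here.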
SW                                  Store Word                                           SW
31         26 25            21 20          16 15                                                0
       SW            base             rt                            offset
     101011
        6              5               5                            16
      Format:
              SW rt, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The contents of general purpose register rt
              are stored in the memory location specified by the address. If either of the two
              low-order bits of the address is not zero, an address error exception occurs.
      Operation:
      32   T:      vAddr ← ((offset_15)^16 || offset_15...0) + GPR[base]
                   (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                   data ← GPR[rt]_31...0
                   StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
      Exceptions:
              TLB miss exception
              TLB invalid exception
              TLB modification exception
              Bus error exception
              Address error exception
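
The address formation and alignment check for SW can be sketched in C as follows. This is an illustration only; the function names effective_address and sw_address_error are hypothetical, and address translation and the other exception conditions are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the 16-bit offset and add it to the contents of the base register. */
static uint64_t effective_address(uint64_t base, uint16_t offset)
{
    return base + (uint64_t)(int64_t)(int16_t)offset;
}

/* SW raises an address error exception unless both low-order address bits are zero. */
static bool sw_address_error(uint64_t vaddr)
{
    return (vaddr & 0x3u) != 0;
}
```

For example, an offset of 0xFFFC is treated as -4, so a base of 0x1000 yields the word-aligned address 0x0FFC.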
          Format:
                  SWCz rt, offset(base)
          Description:
                  The 16-bit offset is sign-extended and added to the contents of general purpose
                  register base to form a virtual address. Coprocessor register rt of the CPz is stored
                  in the addressed memory. The data to be stored is defined by individual
                  coprocessor specifications. This instruction is not valid for use with CP0.
                  If either of the low-order two bits of the address is not zero, an address error
                  exception occurs.
          Operation:
          32      T: vAddr ← ((offset_15)^16 || offset_15...0) + GPR[base]
                     (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                     pAddr ← pAddr_PSIZE-1...3 || (pAddr_2...0 xor (ReverseEndian || 0^2))
                     byte ← vAddr_2...0 xor (BigEndianCPU || 0^2)
                     data ← COPzSW (byte, rt)
                     StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
          64      T: vAddr ← ((offset_15)^48 || offset_15...0) + GPR[base]
                     (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                     pAddr ← pAddr_PSIZE-1...3 || (pAddr_2...0 xor (ReverseEndian || 0^2))
                     byte ← vAddr_2...0 xor (BigEndianCPU || 0^2)
                     data ← COPzSW (byte, rt)
                     StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
  Exceptions:
         TLB miss exception
         TLB invalid exception
         TLB modification exception
         Bus error exception
         Address error exception
         Coprocessor unusable exception
       Bit # 31 30 29 28 27 26                                                       0
     SWC2 1      1   1   0     1   0
         Format:
                SWL rt, offset(base)
         Description:
                 This instruction is used in combination with the SWR instruction to store a word
                 from a register to a word in memory that is not aligned on a word boundary. The
                 SWL instruction stores the high-order portion of the data to memory, while the
                 SWR instruction stores the low-order portion.
                 The 16-bit offset is sign-extended and added to the contents of general purpose
                 register base to generate a virtual address. This address specifies the
                 most-significant byte of a word in memory. Only the high-order portion of
                 general purpose register rt that fits within the same aligned word as the target
                 address is stored to memory.
                 Depending on the address specified, the number of bytes stored varies
                 from 1 to 4.
                 In other words, the most-significant byte of general purpose register rt is
                 stored first, to the addressed byte in memory. As long as bytes remain before
                 the next word boundary, the next lower-order byte of the register is stored to
                 the next byte of memory.
                 No address error exception occurs when the specified address is not aligned
                 on a word boundary.
                         memory
                       (big-endian)                                       register
   address 4       4       5      6        7       before             A     B        C   D   $24
   address 0       0       1      2        3       storing
Operation:
                      The relationships between the contents given to the SWL instruction and the result
                      (bytes for words in the memory) are shown below:
                  SWL
                  Register    A B C D E F G H
                  Memory      I J K L M N O P

                            BigEndianCPU = 0                         BigEndianCPU = 1
                                                   offset                                   offset
   vAddr2...0    destination        type   LEM  BEM      destination        type   LEM  BEM
       0         I J K L M N O E     0      0    7       E F G H M N O P     3      4    0
       1         I J K L M N E F     1      0    6       I E F G M N O P     2      4    1
       2         I J K L M E F G     2      0    5       I J E F M N O P     1      4    2
       3         I J K L E F G H     3      0    4       I J K E M N O P     0      4    3
       4         I J K E M N O P     0      4    3       I J K L E F G H     3      0    4
       5         I J E F M N O P     1      4    2       I J K L M E F G     2      0    5
       6         I E F G M N O P     2      4    1       I J K L M N E F     1      0    6
       7         E F G H M N O P     3      4    0       I J K L M N O E     0      0    7
                      Remark Type:      access type output to memory (Refer to Figure 3-2 Byte
                                        Access within a Doubleword.)
                                Offset: pAddr2...0 output to memory
                                        LEM Little-endian memory (BigEndianMem = 0)
                                        BEM Big-endian memory (BigEndianMem = 1)
          Exceptions:
                      TLB miss exception
                      TLB invalid exception
                      TLB modification exception
                      Bus error exception
                      Address error exception
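
The BigEndianCPU = 1 column of the SWL table can be reproduced with a short C sketch. It is an illustration under stated assumptions, not a description of the hardware: memory is modeled as a plain byte array, the function name swl_big_endian is hypothetical, and address translation and exceptions are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of SWL with BigEndianCPU = 1.
 * Stores the high-order bytes of rt from addr up to the end of the aligned word. */
static void swl_big_endian(uint8_t *mem, size_t addr, uint32_t rt)
{
    size_t count = 4 - (addr & 3);   /* 1 to 4 bytes remain before the word boundary */

    for (size_t i = 0; i < count; i++)
        mem[addr + i] = (uint8_t)(rt >> (24 - 8 * i));   /* most-significant byte first */
}
```

With memory holding I J K L M N O P and the low-order word of the register holding E F G H, calling the function with addr = 1 leaves I E F G M N O P, which matches the row for vAddr2...0 = 1.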
      Format:
              SWR rt, offset(base)
      Description:
              This instruction is used in combination with the SWL instruction to store a word
              from a register to a word in memory that is not aligned on a word boundary.
              The SWL instruction stores the high-order portion of the data to memory,
              while the SWR instruction stores the low-order portion.
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to generate a virtual address. This address specifies the
              least-significant byte of a word in memory. Only the low-order portion of
              general purpose register rt that fits within the same aligned word as the target
              address is stored to memory. Depending on the address specified, the number
              of bytes stored varies from 1 to 4.
              In other words, the least-significant byte of general purpose register rt is
              stored first, to the addressed byte in memory. As long as bytes remain before
              the preceding word boundary, the next higher-order byte of the register is stored
              to the preceding byte of memory.
              No address error exception occurs when the specified address is not aligned
              on a word boundary.
                     memory
                    (big-endian)                                      register
address 4       4      5      6        7       before             A     B        C      D     $24
address 0       0      1      2        3       storing
                                                         SWR $24,4($0)
address 4       D      5      6        7       after
address 0       0      1      2        3       storing
           Operation:
      32      T: vAddr ← ((offset_15)^16 || offset_15...0) + GPR[base]
                 (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                 pAddr ← pAddr_PSIZE-1...3 || (pAddr_2...0 xor ReverseEndian^3)
                 if BigEndianMem = 0 then
                      pAddr ← pAddr_31...2 || 0^2
                 endif
                 byte ← vAddr_1...0 xor BigEndianCPU^2
                 if (vAddr_2 xor BigEndianCPU) = 0 then
                      data ← 0^32 || GPR[rt]_31-8*byte...0 || 0^(8*byte)
                 else
                      data ← GPR[rt]_31-8*byte...0 || 0^(8*byte) || 0^32
                 endif
                 StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)
      64      T: vAddr ← ((offset_15)^48 || offset_15...0) + GPR[base]
                 (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                 pAddr ← pAddr_PSIZE-1...3 || (pAddr_2...0 xor ReverseEndian^3)
                 if BigEndianMem = 0 then
                      pAddr ← pAddr_31...2 || 0^2
                 endif
                 byte ← vAddr_1...0 xor BigEndianCPU^2
                 if (vAddr_2 xor BigEndianCPU) = 0 then
                      data ← 0^32 || GPR[rt]_31-8*byte...0 || 0^(8*byte)
                 else
                      data ← GPR[rt]_31-8*byte...0 || 0^(8*byte) || 0^32
                 endif
                 StoreMemory (uncached, WORD-byte, data, pAddr, vAddr, DATA)
                  The relationships between the register contents given to the SWR instruction and
                  the result (bytes for words in the memory) are shown below:
                  SWR
                  Register    A B C D E F G H
                  Memory      I J K L M N O P

                            BigEndianCPU = 0                         BigEndianCPU = 1
                                                   offset                                   offset
   vAddr2...0    destination        type   LEM  BEM      destination        type   LEM  BEM
       0         I J K L E F G H     3      0    4       H J K L M N O P     0      7    0
       1         I J K L F G H P     2      1    4       G H K L M N O P     1      6    0
       2         I J K L G H O P     1      2    4       F G H L M N O P     2      5    0
       3         I J K L H N O P     0      3    4       E F G H M N O P     3      4    0
       4         E F G H M N O P     3      4    0       I J K L H N O P     0      3    4
       5         F G H L M N O P     2      5    0       I J K L G H O P     1      2    4
       6         G H K L M N O P     1      6    0       I J K L F G H P     2      1    4
       7         H J K L M N O P     0      7    0       I J K L E F G H     3      0    4
                  Remark Type:           access type output to memory (Refer to Figure 3-2 Byte
                                         Access within a Doubleword.)
                                 Offset: pAddr2...0 output to memory
                                         LEM Little-endian memory (BigEndianMem = 0)
                                         BEM Big-endian memory (BigEndianMem = 1)
       Exceptions:
                  TLB miss exception
                  TLB invalid exception
                  TLB modification exception
                  Bus error exception
                  Address error exception
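
The BigEndianCPU = 1 column of the SWR table can be modeled in C in the same spirit. The sketch below is an assumption-laden illustration: memory is a plain byte array, the function name swr_big_endian is hypothetical, and address translation and exceptions are not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of SWR with BigEndianCPU = 1.
 * Stores the low-order bytes of rt from addr back to the start of the aligned word. */
static void swr_big_endian(uint8_t *mem, size_t addr, uint32_t rt)
{
    size_t count = (addr & 3) + 1;   /* 1 to 4 bytes lie between the word start and addr */

    for (size_t i = 0; i < count; i++)
        mem[addr - i] = (uint8_t)(rt >> (8 * i));   /* least-significant byte first */
}
```

With memory holding I J K L M N O P and the low-order word of the register holding E F G H, addr = 5 leaves I J K L G H O P, which matches the row for vAddr2...0 = 5. In a big-endian system, an SWL at address n followed by an SWR at address n + 3 together store the complete word at an unaligned address n.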
        SPECIAL                       0                                              SYNC
       000000             0000 0000 0000 0000 0000                                  001111
           6                                     20                                   6
         Format:
                  SYNC
         Description:
                   The SYNC instruction is executed as a NOP on the VR4300. It is defined
                   to maintain software compatibility with code written for the VR4400.
Operation:
32, 64 T: SyncOperation ()
         Exceptions:
                  None
       Format:
                SYSCALL
       Description:
                A system call exception occurs after this instruction is executed, unconditionally
                transferring control to the exception handler.
                A parameter can be sent to the exception handler by using the code area. If the
                exception handler uses this parameter, the contents of the memory word including
                the instruction must be loaded as data.
Operation:
32, 64 T: SystemCallException
       Exceptions:
                System Call exception
         Format:
                   TEQ rs, rt
         Description:
                   The contents of general purpose register rt are compared with the contents of
                   general purpose register rs. If the contents of general purpose register rs are
                   equal to the contents of general purpose register rt, a trap exception occurs.
                   A parameter can be sent to the exception handler by using the code area. If the
                   exception handler uses this parameter, the contents of the memory word including
                   the instruction must be loaded as data.
         Operation:
         32, 64        T:    if GPR[rs] = GPR[rt] then
                                 TrapException
                            endif
         Exceptions:
                   Trap exception
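
Because the code area is not delivered to the exception handler directly, the handler has to read the instruction word and extract the field itself. The C sketch below shows one way to do the extraction. The field positions are assumptions taken from the standard MIPS instruction encoding (a 20-bit code in bits 25...6 for SYSCALL, a 10-bit code in bits 15...6 for the register-register trap instructions such as TEQ), and the function names are hypothetical. How the handler locates the instruction word is not shown.

```c
#include <stdint.h>

/* SYSCALL: assumed 20-bit code area in bits 25...6 of the instruction word. */
static uint32_t syscall_code(uint32_t instr)
{
    return (instr >> 6) & 0xFFFFFu;
}

/* TEQ and the other register-register traps: assumed 10-bit code area in bits 15...6. */
static uint32_t trap_code(uint32_t instr)
{
    return (instr >> 6) & 0x3FFu;
}
```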
      Format:
              TEQI rs, immediate
      Description:
              The 16-bit immediate is sign-extended and compared with the contents of general
              purpose register rs. If the contents of general purpose register rs are equal to the
              sign-extended immediate, a trap exception occurs.
      Operation:
      32    T:     if GPR[rs] = (immediate_15)^16 || immediate_15...0 then
                       TrapException
                   endif
      Exceptions:
              Trap exception
         Format:
                  TGE rs, rt
         Description:
                  The contents of general purpose register rt are compared with the contents of
                  general purpose register rs. Assuming both register contents are signed integers,
                  if the contents of general purpose register rs are greater than or equal to the
                  contents of general purpose register rt, a trap exception occurs.
                  A parameter can be sent to the exception handler by using the code area. If the
                  exception handler uses this parameter, the contents of the memory word including
                  the instruction must be loaded as data.
         Operation:
         32, 64        T:    if GPR[rs] ≥ GPR[rt] then
                                  TrapException
                             endif
         Exceptions:
                  Trap exception
      Format:
              TGEI rs, immediate
      Description:
              The 16-bit immediate is sign-extended and compared with the contents of general
              purpose register rs. Assuming both values are signed integers, if the contents of
              general purpose register rs are greater than or equal to the sign-extended
              immediate, a trap exception occurs.
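      Operation:
      32      T: if GPR[rs] ≥ (immediate_15)^16 || immediate_15...0 then
                     TrapException
                 endif
      64      T: if GPR[rs] ≥ (immediate_15)^48 || immediate_15...0 then
                     TrapException
                 endif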
      Exceptions:
              Trap exception
         Format:
                   TGEIU rs, immediate
         Description:
                   The 16-bit immediate is sign-extended and compared with the contents of general
                   purpose register rs. Assuming both values are unsigned integers, if the contents
                   of general purpose register rs are greater than or equal to the sign-extended
                   immediate, a trap exception occurs.
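         Operation:
         32     T:    if (0 || GPR[rs]) ≥ (0 || (immediate_15)^16 || immediate_15...0) then
                           TrapException
                      endif
         64     T:    if (0 || GPR[rs]) ≥ (0 || (immediate_15)^48 || immediate_15...0) then
                           TrapException
                      endif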
         Exceptions:
                   Trap exception
       Format:
                 TGEU rs, rt
       Description:
                 The contents of general purpose register rt are compared with the contents of
                 general purpose register rs. Assuming both values are unsigned integers, if the
                 contents of general purpose register rs are greater than or equal to the contents of
                 general purpose register rt, a trap exception occurs.
                 A parameter can be sent to the exception handler by using the code area. If the
                 exception handler uses this parameter, the contents of the memory word including
                 the instruction must be loaded as data.
       Operation:
       32, 64        T:    if (0 || GPR[rs]) ≥ (0 || GPR[rt]) then
                                TrapException
                           endif
       Exceptions:
                 Trap exception
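
The difference between the signed comparison of TGE and the unsigned comparison of TGEU matters whenever the most-significant bit of an operand is set. The C sketch below illustrates this with hypothetical function names that return true when the trap condition holds. It models only the comparison, not the exception itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* TGE: both register values are treated as signed integers. */
static bool tge_traps(int64_t rs, int64_t rt)
{
    return rs >= rt;
}

/* TGEU: the same bit patterns are treated as unsigned integers. */
static bool tgeu_traps(uint64_t rs, uint64_t rt)
{
    return rs >= rt;
}
```

With rs holding all ones and rt holding 1, the signed comparison sees -1 and does not trap, while the unsigned comparison sees the largest 64-bit value and traps.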
        COP0       CO                      0                                             TLBP
       010000       1           000 0000 0000 0000 0000                                 001000
          6         1                      19                                              6
         Format:
                TLBP
         Description:
                 Searches for a TLB entry that matches the contents of the EntryHi register and
                 sets the Index register to the number of that TLB entry. If no matching TLB
                 entry is found, the most significant bit of the Index register is set.
                The architecture does not specify the operation of memory references associated
                with the instruction immediately after a TLBP instruction, nor is the operation
                specified if more than one TLB entry matches.
Operation:
         Exceptions:
                Coprocessor unusable exception
      COP0       CO                     0                                             TLBR
     010000       1          000 0000 0000 0000 0000                                 000001
        6         1                     19                                              6
      Format:
              TLBR
      Description:
              The EntryHi and EntryLo registers are loaded with the contents of the TLB entry
              pointed at by the contents of the Index register. The G bit (which controls ASID
              matching) read from the TLB is written into both of the EntryLo0 and EntryLo1
              registers.
              The operation is invalid if the contents of the Index register are greater than the
              number of TLB entries in the processor.
Operation:
       32     T: PageMask ← TLB[Index_5...0]_127...96
                 EntryHi ← TLB[Index_5...0]_95...64 and not TLB[Index_5...0]_127...96
                 EntryLo1 ← TLB[Index_5...0]_63...33 || TLB[Index_5...0]_76
                 EntryLo0 ← TLB[Index_5...0]_31...1 || TLB[Index_5...0]_76
       64     T: PageMask ← TLB[Index_5...0]_255...192
                 EntryHi ← TLB[Index_5...0]_191...128 and not TLB[Index_5...0]_255...192
                 EntryLo1 ← TLB[Index_5...0]_127...65 || TLB[Index_5...0]_140
                 EntryLo0 ← TLB[Index_5...0]_63...1 || TLB[Index_5...0]_140
      Exceptions:
              Coprocessor unusable exception
        COP0        CO                    0                                             TLBWI
       010000        1         000 0000 0000 0000 0000                                 000010
          6          1                    19                                              6
         Format:
                TLBWI
         Description:
                The TLB entry pointed at by the Index register is loaded with the contents of the
                EntryHi and EntryLo registers. The G bit of the TLB is written with the logical
                AND of the G bits in the EntryLo0 and EntryLo1 registers.
                The operation is invalid if the contents of the Index register are greater than the
                number of TLB entries in the processor.
Operation:
       32, 64 T: TLB[Index_5...0] ←
                   PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0
         Exceptions:
                Coprocessor unusable exception
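
The way TLBWI combines the CP0 registers into a TLB entry can be sketched in C for the 32-bit case. This is a simplified illustration, not the hardware layout: the structure tlb_entry32 and the function tlb_compose are hypothetical, and the G bit is assumed to be bit 0 of each EntryLo register, as suggested by the TLBR operation, where the G bit is appended as the low-order bit.

```c
#include <stdint.h>

/* Hypothetical software model of one TLB entry (32-bit mode). */
typedef struct {
    uint32_t page_mask;
    uint32_t entry_hi;    /* EntryHi with the PageMask bits cleared */
    uint32_t entry_lo1;
    uint32_t entry_lo0;
    unsigned g;           /* global bit shared by both halves of the entry */
} tlb_entry32;

static tlb_entry32 tlb_compose(uint32_t page_mask, uint32_t entry_hi,
                               uint32_t entry_lo1, uint32_t entry_lo0)
{
    tlb_entry32 e;

    e.page_mask = page_mask;
    e.entry_hi  = entry_hi & ~page_mask;          /* EntryHi and not PageMask */
    e.entry_lo1 = entry_lo1;
    e.entry_lo0 = entry_lo0;
    e.g         = (entry_lo0 & entry_lo1) & 1u;   /* logical AND of the two G bits */
    return e;
}
```

An entry is therefore global only when the G bit is set in both EntryLo0 and EntryLo1 at the time of the write.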
      COP0        CO                   0                                            TLBWR
     010000        1        000 0000 0000 0000 0000                                000110
        6          1                   19                                              6
      Format:
              TLBWR
      Description:
              The TLB entry pointed at by the Random register is loaded with the contents of
              the EntryHi and EntryLo registers. The G bit of the TLB is written with the logical
              AND of the G bits in the EntryLo0 and EntryLo1 registers.
Operation:
     32, 64 T: TLB[Random_5...0] ←
                 PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0
      Exceptions:
              Coprocessor unusable exception
         Format:
                  TLT rs, rt
         Description:
                   The contents of general purpose register rt are compared with the contents of
                   general purpose register rs. Assuming both values are signed integers, if the
                   contents of general purpose register rs are less than the contents of general
                   purpose register rt, a trap exception occurs.
                  A parameter can be sent to the exception handler by using the code area. If the
                  exception handler uses this parameter, the contents of the memory word including
                  the instruction must be loaded as data.
         Operation:
         32, 64        T:    if GPR[rs] < GPR[rt] then
                                  TrapException
                             endif
         Exceptions:
                  Trap exception
      Format:
              TLTI rs, immediate
      Description:
              The 16-bit immediate is sign-extended and compared with the contents of general
              purpose register rs. Assuming both values are signed integers, if the contents of
              general purpose register rs are less than the sign-extended immediate, a trap
              exception occurs.
      Operation:
      32      T: if GPR[rs] < (immediate_15)^16 || immediate_15...0 then
                     TrapException
                 endif
      64      T: if GPR[rs] < (immediate_15)^48 || immediate_15...0 then
                     TrapException
                 endif
      Exceptions:
              Trap exception
         Format:
                TLTIU rs, immediate
         Description:
                The 16-bit immediate is sign-extended and compared with the contents of general
                purpose register rs. Assuming both values are unsigned integers, if the contents
                of general purpose register rs are less than the sign-extended immediate, a trap
                exception occurs.
         Operation:
         32     T:    if (0 || GPR[rs]) < (0 || (immediate_15)^16 || immediate_15...0) then
                           TrapException
                      endif
         64     T:    if (0 || GPR[rs]) < (0 || (immediate_15)^48 || immediate_15...0) then
                           TrapException
                      endif
         Exceptions:
                Trap exception
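
TLTIU sign-extends the immediate before the unsigned comparison, so an immediate with bit 15 set becomes a very large unsigned value rather than a value in the range 0x8000 to 0xFFFF. A minimal C sketch of the 64-bit comparison follows; the function name tltiu_traps is hypothetical, and it assumes a compiler on which converting 0x8000 through 0xFFFF to int16_t yields the corresponding negative value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the TLTIU trap condition holds (64-bit operation). */
static bool tltiu_traps(uint64_t rs, uint16_t immediate)
{
    /* Sign-extend the 16-bit immediate to 64 bits first ... */
    uint64_t imm = (uint64_t)(int64_t)(int16_t)immediate;

    /* ... then compare both values as unsigned integers. */
    return rs < imm;
}
```

For instance, an immediate of 0xFFFF is compared as 0xFFFFFFFFFFFFFFFF, so every value of rs except all ones satisfies the trap condition.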
       Format:
                TLTU rs, rt
       Description:
                 The contents of general purpose register rt are compared with the contents of
                 general purpose register rs. Assuming both values are unsigned integers, if the
                 contents of general purpose register rs are less than the contents of general
                 purpose register rt, a trap exception occurs.
                A parameter can be sent to the exception handler by using the code area. If the
                exception handler uses this parameter, the contents of the memory word including
                the instruction must be loaded as data.
       Operation:
       32, 64        T:    if (0 || GPR[rs]) < (0 || GPR[rt]) then
                                TrapException
                           endif
       Exceptions:
                Trap exception
         Format:
                  TNE rs, rt
         Description:
                   The contents of general purpose register rt are compared with the contents of
                   general purpose register rs. If the contents of general purpose register rs are
                   not equal to the contents of general purpose register rt, a trap exception occurs.
                  A parameter can be sent to the exception handler by using the code area. If the
                  exception handler uses this parameter, the contents of the memory word including
                  the instruction must be loaded as data.
         Operation:
         32, 64        T:    if GPR[rs] ≠ GPR[rt] then
                                  TrapException
                             endif
         Exceptions:
                  Trap exception
      Format:
              TNEI rs, immediate
      Description:
              The 16-bit immediate is sign-extended and compared with the contents of general
              purpose register rs. If the contents of general purpose register rs are not equal to
              the sign-extended immediate, a trap exception occurs.
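      Operation:
      32      T: if GPR[rs] ≠ (immediate_15)^16 || immediate_15...0 then
                     TrapException
                 endif
      64      T: if GPR[rs] ≠ (immediate_15)^48 || immediate_15...0 then
                     TrapException
                 endif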
      Exceptions:
              Trap exception
        SPECIAL           rs             rt              rd           0                 XOR
       000000                                                       00000             100110
           6               5              5              5            5                  6
         Format:
                  XOR rd, rs, rt
         Description:
                   The contents of general purpose register rs and the contents of general purpose
                   register rt are combined in a bit-wise logical exclusive OR operation. The result
                   is stored into general purpose register rd.
         Operation:
         32, 64        T:    GPR[rd] ← GPR[rs] xor GPR[rt]
         Exceptions:
                  None
      XORI             rs              rt                            immediate
     001110
       6                5               5                              16
      Format:
              XORI rt, rs, immediate
      Description:
              The 16-bit immediate is zero-extended and combined with the contents of general
              purpose register rs in a bit-wise logical exclusive OR operation.
              The result is stored in general purpose register rt.
      Operation:
      32      T: GPR[rt] ← GPR[rs] xor (0^16 || immediate)
      64      T: GPR[rt] ← GPR[rs] xor (0^48 || immediate)
      Exceptions:
              None
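
Unlike the trap-immediate instructions, XORI zero-extends its immediate, so only the low-order 16 bits of the register can change. A minimal C sketch of the 64-bit operation, with the hypothetical function name xori, is shown below.

```c
#include <stdint.h>

/* XORI: the 16-bit immediate is zero-extended, then exclusive ORed with rs. */
static uint64_t xori(uint64_t rs, uint16_t immediate)
{
    return rs ^ (uint64_t)immediate;   /* the high-order 48 bits of rs pass through unchanged */
}
```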
           28...26                                 Opcode
  31...29     0         1         2         3         4         5         6         7
     0     SPECIAL   REGIMM       J        JAL       BEQ       BNE      BLEZ      BGTZ
     1      ADDI      ADDIU     SLTI      SLTIU      ANDI      ORI      XORI       LUI
     2      COP0      COP1      COP2        *        BEQL      BNEL     BLEZL     BGTZL
     3     DADDI ε   DADDIU ε   LDL ε     LDR ε       *         *         *         *
     4       LB        LH        LWL       LW        LBU       LHU       LWR      LWU ε
     5       SB        SH        SWL       SW       SDL ε     SDR ε      SWR     CACHE δ
     6       LL       LWC1      LWC2        *       LLD ε     LDC1      LDC2      LD ε
     7       SC       SWC1      SWC2        *       SCD ε     SDC1      SDC2      SD ε

           18...16                                REGIMM rt
  20...19     0         1         2         3         4         5         6         7
     0      BLTZ      BGEZ      BLTZL     BGEZL       *         *         *         *
     1      TGEI      TGEIU     TLTI      TLTIU     TEQI        *       TNEI        *
     2     BLTZAL    BGEZAL    BLTZALL   BGEZALL      *         *         *         *
     3        *         *         *         *         *         *         *         *

           23...21                                 COPz rs
  25...24     0         1         2         3         4         5         6         7
     0       MF       DMF ε      CF         γ        MT       DMT ε      CT         γ
     1       BC         γ         γ         γ         γ         γ         γ         γ
     2                                        CO
     3

           18...16                                 COPz rt
  20...19     0         1         2         3         4         5         6         7
     0       BCF       BCT      BCFL      BCTL        γ         γ         γ         γ
     1        γ         γ         γ         γ         γ         γ         γ         γ
     2        γ         γ         γ         γ         γ         γ         γ         γ
     3        γ         γ         γ         γ         γ         γ         γ         γ

           2...0                                 CP0 Function
  5...3       0         1         2         3         4         5         6         7
     0        φ       TLBR      TLBWI       φ         φ         φ       TLBWR       φ
     1      TLBP        φ         φ         φ         φ         φ         φ         φ
     2        ξ         φ         φ         φ         φ         φ         φ         φ
     3     ERET χ       φ         φ         φ         φ         φ         φ         φ
     4        φ         φ         φ         φ         φ         φ         φ         φ
     5        φ         φ         φ         φ         φ         φ         φ         φ
     6        φ         φ         φ         φ         φ         φ         φ         φ
     7        φ         φ         φ         φ         φ         φ         φ         φ

               Key:
               *             Operation codes marked with an asterisk cause a reserved
                             instruction exception when executed on the current VR4300.
                             These codes are reserved for future expansion.
               γ             Operation codes marked with a gamma cause a reserved
                             instruction exception. They are reserved for future expansion.
               δ             Operation codes marked with a delta are valid only for VR4000
                             processors with CP0 enabled, and cause a reserved instruction
                             exception on other processors.
               φ             Operation codes marked with a phi are invalid but do not cause
                             reserved instruction exceptions in VR4300 operation.
               ξ             Operation codes marked with a xi cause a reserved instruction
                             exception only on VR4300 processors.
               χ             Operation codes marked with a chi are valid only on VR4000
                             series processors.
               ε             Operation codes marked with an epsilon are valid in the 64-bit
                             mode and 32-bit Kernel mode. In the 32-bit User or Supervisor
                             mode, these codes generate the reserved instruction exception.
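
The row and column headings of the Opcode and REGIMM rt tables correspond directly to bit fields of the instruction word, so a table position can be computed with shifts and masks. The C sketch below shows the idea; the function names are hypothetical and the sketch does not model any exception behavior.

```c
#include <stdint.h>

/* Opcode table: row = bits 31...29, column = bits 28...26. */
static unsigned opcode_row(uint32_t instr) { return (instr >> 29) & 7u; }
static unsigned opcode_col(uint32_t instr) { return (instr >> 26) & 7u; }

/* REGIMM rt table: row = bits 20...19, column = bits 18...16. */
static unsigned regimm_row(uint32_t instr) { return (instr >> 19) & 3u; }
static unsigned regimm_col(uint32_t instr) { return (instr >> 16) & 7u; }
```

As a check against the tables, the SW opcode 101011 gives row 5, column 3, and a REGIMM instruction with rt = 01100 gives row 1, column 4, which is the TEQI entry.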
                               Source Format
Operation        Single       Double        Word        Longword
ADD                V            V            R             R
SUB                V            V            R             R
MUL                V            V            R             R
DIV                V            V            R             R
SQRT               V            V            R             R
ABS                V            V            R             R
MOV                V            V
NEG                V            V            R             R
TRUNC.L            V            V
ROUND.L            V            V
CEIL.L             V            V
FLOOR.L            V            V
TRUNC.W            V            V
ROUND.W            V            V
CEIL.W             V            V
FLOOR.W            V            V
CVT.S                           V            V             V
CVT.D              V                         V             V
CVT.W              V            V
CVT.L              V            V
C                  V            V            R             R
                    The FPU branch instruction can be used with the logic of the condition reversed.
                    To compare all the 32 conditions, therefore, comparison need only be performed
                    16 times, as shown in Table 17-2.
Remark F:         False
       T:         True
Floating-Point Operations
       The floating-point unit instruction set includes:
           •    floating-point add
           •    floating-point subtract
           •    floating-point multiply
           •    floating-point divide
           •    floating-point square root
           •    convert between fixed-point and floating-point formats
           •    convert between floating-point formats
           •    floating-point compare
       These operations satisfy the accuracy requirements of IEEE Standard 754.
       Specifically, these operations obtain a result which is identical to an
       infinite-precision result rounded to the specified format, using the current
       rounding mode.
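
       The accuracy rule above can be illustrated on a host machine. The following
       is a minimal sketch, not VR4300 code: it assumes the host uses IEEE 754
       double precision with round-to-nearest, and the function names are
       hypothetical. It computes a sum and a product with unbounded precision and
       then rounds once, which is the behavior the rule describes.

```python
from fractions import Fraction


def exactly_rounded_sum(a: float, b: float) -> float:
    """Add a and b with unbounded precision, then round once to a double."""
    exact = Fraction(a) + Fraction(b)   # exact rational arithmetic, no rounding
    return float(exact)                 # one rounding step (nearest, ties to even)


def exactly_rounded_product(a: float, b: float) -> float:
    """Multiply a and b with unbounded precision, then round once to a double."""
    return float(Fraction(a) * Fraction(b))


# On an IEEE 754 host the hardware result matches the "infinite precision,
# then round" result bit for bit.
print(exactly_rounded_sum(0.1, 0.2) == 0.1 + 0.2)
print(exactly_rounded_product(0.1, 3.0) == 0.1 * 3.0)
```

       The sketch covers only the round-to-nearest mode; the other rounding modes
       are not modeled.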
       Instructions must specify the format of their operands. Except for conversion
       functions, mixed-format operations cannot be performed.
                  Example #1:
                      GPR[rt] ← immediate || 0^16
                  Sixteen zero bits are concatenated with a low-order immediate value (typically
                  16 bits), and the 32-bit string is assigned to General Purpose Register rt.
Example #2:
                      (immediate_15)^16 || immediate_15...0
                  Bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and
                  the result is concatenated with bits 15 through 0 of the immediate value to
                  form a 32-bit sign-extended value.
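
                  The two notation examples above can be checked with a short sketch.
                  The function names are hypothetical and values are treated as
                  unsigned 32-bit patterns.

```python
def zero_append16(immediate: int) -> int:
    """Example #1: immediate || 0^16 (immediate placed in the upper halfword)."""
    return (immediate & 0xFFFF) << 16


def sign_extend16(immediate: int) -> int:
    """Example #2: sixteen copies of bit 15, followed by bits 15...0."""
    low = immediate & 0xFFFF
    sign = (low >> 15) & 1
    upper = 0xFFFF if sign else 0x0000   # bit 15 replicated 16 times
    return (upper << 16) | low


print(hex(zero_append16(0x1234)))
print(hex(sign_extend16(0x8000)))
```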
Example #3:
                     Figure 17-1 shows the I-Type instruction format used by load and store
                     instructions.
 I-Type (Immediate)
  31      26 25     21 20     16 15                            0
 |    op    |   base  |    ft   |            offset             |
       6         5         5                   16
  op         is a 6-bit opcode
  base       is the 5-bit base register specifier
  ft         is a 5-bit source (for stores) or destination (for loads) FPU register specifier
  offset     is the 16-bit signed immediate offset
                     All coprocessor loads and stores reference data which is located at the word
                     boundary. Thus, for word loads and stores, the access type field is always WORD,
                     and the low-order two bits of the address must always be zero. For doubleword
                     loads and stores, the access type field is always DOUBLEWORD, and the low-
                     order three bits of the address must always be zero.
                     Regardless of byte-numbering order (endianness), the address specifies that byte
                     which has the smallest byte-address in the accessed field. For a big-endian
                     system, this is the leftmost byte; for a little-endian system, this is the rightmost
                     byte.
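
                     As an illustration of the I-Type layout and the alignment rules
                     above, the following sketch splits a 32-bit instruction word into
                     its fields and tests an address against an access size. The
                     function names are hypothetical, and the opcode value used in the
                     usage lines is only sample data.

```python
def decode_i_type(word: int) -> dict:
    """Split an I-Type word into op (31:26), base (25:21), ft (20:16), offset (15:0)."""
    offset = word & 0xFFFF
    if offset & 0x8000:          # offset is a signed 16-bit immediate
        offset -= 0x10000
    return {
        "op": (word >> 26) & 0x3F,
        "base": (word >> 21) & 0x1F,
        "ft": (word >> 16) & 0x1F,
        "offset": offset,
    }


def is_aligned(address: int, access_bytes: int) -> bool:
    """WORD (4 bytes): low two bits zero. DOUBLEWORD (8 bytes): low three bits zero."""
    return (address & (access_bytes - 1)) == 0


# Sample word: op = 0x31, base = 4, ft = 2, offset = -8.
sample = (0x31 << 26) | (4 << 21) | (2 << 16) | 0xFFF8
print(decode_i_type(sample))
print(is_aligned(0x1004, 4), is_aligned(0x1004, 8))
```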
R-Type (Register)
 31      26 25     21 20     16 15     11 10      6 5          0
|   COP1   |   fmt   |    ft   |    fs   |    fd   |  function  |
      6         5         5         5         5          6
       COP1       is a 6-bit opcode
       fmt        is a 5-bit format specifier
       fs         is a 5-bit source1 register
       ft         is a 5-bit source2 register
       fd         is a 5-bit destination register
       function   is a 6-bit function field
   Code (5:0)   Mnemonic        Operation
        0       ADD             Add
        1       SUB             Subtract
        2       MUL             Multiply
        3       DIV             Divide
        4       SQRT            Square root
        5       ABS             Absolute value
        6       MOV             Transfer
        7       NEG             Sign reverse
        8       ROUND.L         Convert to 64-bit fixed-point, rounded to nearest/even
        9       TRUNC.L         Convert to 64-bit fixed-point, rounded toward zero
       10       CEIL.L          Convert to 64-bit fixed-point, rounded to +∞
       11       FLOOR.L         Convert to 64-bit fixed-point, rounded to –∞
       12       ROUND.W         Convert to 32-bit fixed-point, rounded to nearest/even
       13       TRUNC.W         Convert to 32-bit fixed-point, rounded toward zero
       14       CEIL.W          Convert to 32-bit fixed-point, rounded to +∞
       15       FLOOR.W         Convert to 32-bit fixed-point, rounded to –∞
      16–31     –               Reserved
       32       CVT.S           Convert to single floating-point
       33       CVT.D           Convert to double floating-point
       34       –               Reserved
       35       –               Reserved
       36       CVT.W           Convert to 32-bit fixed-point
       37       CVT.L           Convert to 64-bit fixed-point
      38–47     –               Reserved
      48–63     C               Floating-point compare
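
The function-code table above maps directly onto a lookup. The following is a
minimal sketch with hypothetical names. The field positions used for ft (20:16)
and fs (15:11) follow the usual MIPS COP1 register layout and are an assumption
here, as is the sample instruction word.

```python
FUNCTION_NAMES = {
    0: "ADD", 1: "SUB", 2: "MUL", 3: "DIV", 4: "SQRT", 5: "ABS", 6: "MOV", 7: "NEG",
    8: "ROUND.L", 9: "TRUNC.L", 10: "CEIL.L", 11: "FLOOR.L",
    12: "ROUND.W", 13: "TRUNC.W", 14: "CEIL.W", 15: "FLOOR.W",
    32: "CVT.S", 33: "CVT.D", 36: "CVT.W", 37: "CVT.L",
}


def decode_function(code: int) -> str:
    """Map the 6-bit function field (bits 5:0) to a mnemonic from the table."""
    if code in FUNCTION_NAMES:
        return FUNCTION_NAMES[code]
    if 48 <= code <= 63:
        return "C"               # floating-point compare
    return "Reserved"            # 16-31, 34, 35, 38-47


def decode_r_type(word: int) -> dict:
    """Split an R-Type word into its six fields (assumed field positions)."""
    return {
        "op": (word >> 26) & 0x3F,
        "fmt": (word >> 21) & 0x1F,
        "ft": (word >> 16) & 0x1F,    # source2 (assumed position)
        "fs": (word >> 11) & 0x1F,    # source1 (assumed position)
        "fd": (word >> 6) & 0x1F,
        "function": decode_function(word & 0x3F),
    }


print(decode_r_type(0x46041000))
```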
In the following pages, the notation FGR means the 32 FPU General Purpose
registers FGR0 through FGR31 of the FPU, and FPR refers to the floating-point
registers of the FPU.
FGRs (described as CPRs in some places) are used by the load/store instructions
and by the instructions that transfer data to or from the CPU. FPRs are used by
the transfer, arithmetic, and conversion instructions in CP1.
    •    When the FR bit (bit 26) of the Status register is 0, only the
         even-numbered floating-point registers are valid and the 32 FGRs
         are 32 bits wide.
    •    When the FR bit (bit 26) of the Status register is 1, both odd- and
         even-numbered FPRs can be used and the 32 FGRs are 64 bits wide.
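
The effect of the FR bit on 64-bit register accesses can be sketched as follows.
This is an illustrative model with hypothetical names, not the manual's own
routine. It assumes that with FR = 0 the even-numbered register holds the low
word and the following odd-numbered register holds the high word, and it raises
an error for the odd-register case that the manual leaves undefined.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


class FpuRegisters:
    """Toy model of the 32 FGRs under the two settings of the FR bit."""

    def __init__(self, fr: int):
        self.fr = fr
        self.fgr = [0] * 32

    def _check_even(self, n: int) -> None:
        if n & 1:
            # The manual leaves this case undefined; the model rejects it.
            raise ValueError("odd register number with FR = 0")

    def read64(self, n: int) -> int:
        if self.fr == 1:
            return self.fgr[n] & MASK64
        self._check_even(n)
        # Assumed pairing: odd register = high word, even register = low word.
        return ((self.fgr[n + 1] & MASK32) << 32) | (self.fgr[n] & MASK32)

    def write64(self, n: int, value: int) -> None:
        value &= MASK64
        if self.fr == 1:
            self.fgr[n] = value
            return
        self._check_even(n)
        self.fgr[n] = value & MASK32
        self.fgr[n + 1] = value >> 32


regs = FpuRegisters(fr=0)
regs.write64(2, 0x1122334455667788)
print(hex(regs.fgr[2]), hex(regs.fgr[3]), hex(regs.read64(2)))
```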
The following routines are used in the description of the floating-point operations
to retrieve the value of an FPR or to change the value of an FGR:
32 Bit Mode
64 Bit Mode
                                     Floating-point
ABS.fmt                              Absolute Value                         ABS.fmt
 31            26 25         21 20          16 15           11 10           6 5                  0
      Format:
               ABS.fmt fd, fs
      Description:
               The absolute value of the contents of floating-point register fs is taken, and
               the result is stored in floating-point register fd. The operand is processed in
               the floating-point format fmt.
               The absolute value operation is performed arithmetically. Therefore, if the
               operand is NaN, the invalid operation exception occurs.
               This instruction is valid only in the single- and double-precision floating-point
               formats.
               If the FR bit of the Status register is 0, only an even number can be specified as a
               register number because adjacent even-numbered and odd-numbered registers are
               used in pairs as a floating-point register. If an odd number is specified, the
               operation is undefined.
               If the FR bit of the Status register is 1, both the odd and even register numbers
               are valid.
Operation:
      Exceptions:
               Coprocessor unusable exception
               Floating-point exception
      Floating-Point Exceptions:
               Unimplemented operation exception
               Invalid operation exception
         Format:
                  ADD.fmt fd, fs, ft
         Description:
                  The contents of floating-point registers fs and ft are added, and the result is
                  stored to floating-point register fd. The operand is processed in the floating-point
                  format fmt. The operation is executed as if the accuracy were infinite, and the
                  result is rounded according to the current rounding mode.
                  This instruction is valid only in the single- and double-precision floating-point
                  formats.
                  If the FR bit of the Status register is 0, only an even number can be specified as a
                  register number because adjacent even-numbered and odd-numbered registers are
                  used in pairs as a floating-point register. If an odd number is specified, the
                  operation is undefined. If the FR bit of the Status register is 1, both the odd and
                  even register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) + ValueFPR (ft, fmt) )
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
         Floating-Point Exceptions:
                  Unimplemented operation exception
                  Invalid operation exception
                  Inexact operation exception
                  Overflow exception
                  Underflow exception
      Format:
                 BC1F offset
      Description:
                 A branch target address is computed from the sum of the address of the instruction
                 in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If
                 the CPz condition signal sampled while the instruction immediately preceding is
                 being executed is false (0), the program branches to the branch target address, with
                 a delay of one instruction.
                 Because the result of comparison is sampled while the instruction immediately
                 preceding is executed, at least one instruction must be inserted in between the
                 floating-point compare instruction and this instruction.
Operation:
      Exceptions:
                 Coprocessor unusable exception
         Format:
                  BC1FL offset
         Description:
                  A branch target address is computed from the sum of the address of the instruction
                  in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If
                  the CPz condition signal sampled while the instruction immediately preceding is
                  being executed is false (0), the program branches to the branch target address, with
                  a delay of one instruction. If the branch is not taken, the instruction in the branch
                  delay slot is nullified.
                  Because the result of comparison is sampled while the instruction immediately
                  preceding is executed, at least one instruction must be inserted in between the
                  floating-point compare instruction and this instruction.
         Operation:
             32       T–1:      condition ← not COC[1]
                      T:        target ← (offset_15)^14 || offset || 0^2
                      T+1:      if condition then
                                     PC ← PC + target
                                else
                                     NullifyCurrentInstruction
                                endif
             64       T–1:      condition ← not COC[1]
                      T:        target ← (offset_15)^46 || offset || 0^2
                      T+1:      if condition then
                                     PC ← PC + target
                                else
                                     NullifyCurrentInstruction
                                endif
         Exceptions:
                  Coprocessor unusable exception
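
         The target computation and the nullification rule for BC1FL can be sketched
         as follows. This is a minimal model of the 32-bit case with hypothetical
         names; it assumes 4-byte instructions, so the delay slot is at the branch
         address plus 4.

```python
MASK32 = 0xFFFFFFFF


def branch_target(branch_pc: int, offset16: int) -> int:
    """Delay-slot address plus the sign-extended offset shifted left two bits."""
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                  # sign-extend the 16-bit offset
        offset -= 0x10000
    delay_slot_pc = branch_pc + 4
    return (delay_slot_pc + (offset << 2)) & MASK32


def bc1fl_step(branch_pc: int, offset16: int, coc1: bool):
    """Return (next PC after the delay slot, whether the delay slot executes)."""
    if not coc1:                         # condition is "not COC[1]"
        return branch_target(branch_pc, offset16), True
    # Branch not taken: the delay-slot instruction is nullified.
    return (branch_pc + 8) & MASK32, False


print(hex(branch_target(0x80001000, 0x0003)))
print(bc1fl_step(0x80001000, 0x0003, coc1=True))
```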
      Format:
              BC1T offset
      Description:
              A branch target address is computed from the sum of the address of the instruction
              in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If
              the CPz condition signal sampled while the instruction immediately preceding is
              being executed is true (1), the program branches to the branch target address, with
              a delay of one instruction.
              Because the result of comparison is sampled while the instruction immediately
              preceding is executed, at least one instruction must be inserted in between the
              floating-point compare instruction and this instruction.
      Operation:
        32      T–1:      condition ← COC[1]
                T:        target ← (offset_15)^14 || offset || 0^2
                T+1:      if condition then
                               PC ← PC + target
                          endif
      Exceptions:
              Coprocessor unusable exception
         Format:
                   BC1TL offset
         Description:
                   A branch target address is computed from the sum of the address of the instruction
                   in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If
                   the result of the last floating-point compare is true (1), the program branches to the
                   branch target address, with a delay of one instruction. If the branch is not taken,
                   the instruction in the branch delay slot is nullified.
                   Because the result of comparison is sampled while the instruction immediately
                   preceding is executed, at least one instruction must be inserted in between the
                   floating-point compare instruction and this instruction.
         Operation:
             32     T–1:      condition ← COC[1]
                    T:        target ← (offset_15)^14 || offset || 0^2
                    T+1:      if condition then
                                   PC ← PC + target
                              else
                                   NullifyCurrentInstruction
                              endif
             64     T–1:      condition ← COC[1]
                    T:        target ← (offset_15)^46 || offset || 0^2
                    T+1:      if condition then
                                   PC ← PC + target
                              else
                                   NullifyCurrentInstruction
                              endif
         Exceptions:
                   Coprocessor unusable exception
                                   Floating-point
C.cond.fmt                            Compare                      C.cond.fmt
 31         26 25          21 20            16 15         11 10         6 5   4 3          0
      Format:
             C.cond.fmt fs, ft
      Description:
             Compares the contents of floating-point register fs with those of floating-point
             register ft based on compare condition cond, and sets the result in condition
             signal COC[1]. The operand is processed in the floating-point format fmt. If one
             of the values is NaN and the most-significant bit of compare condition cond is
             set, the invalid operation exception occurs. The result of the comparison is
             tested by the FPU branch instructions. At least one instruction is necessary
             between this instruction and the FPU branch instruction.
             Comparison is exact and does not overflow or underflow. Exactly one of four
             mutually exclusive relations results: “less than”, “equal to”, “greater than”,
             or “cannot be compared”. If one or both of the operands are NaN, the
             result of the comparison is always “cannot be compared”.
             During comparison, the sign of 0 is ignored (+0 = –0).
             This instruction is valid only in the single- and double-precision floating-point
             format.
             If the FR bit of the Status register is 0, only an even number can be specified as a
             register number because adjacent even-numbered and odd-numbered registers are
             used in pairs as a floating-point register. If an odd number is specified, the
             operation is undefined. If the FR bit of the Status register is 1, both the odd and
             even register numbers are valid.
Operation:
         32, 64     T:    if NaN (ValueFPR (fs, fmt) ) or NaN (ValueFPR (ft, fmt) ) then
                                  less ← false
                                  equal ← false
                                  unordered ← true
                                  if cond_3 then
                                      signal InvalidOperationException
                                  endif
                          else
                                  less ← ValueFPR (fs, fmt) < ValueFPR (ft, fmt)
                                  equal ← ValueFPR (fs, fmt) = ValueFPR (ft, fmt)
                                  unordered ← false
                          endif
                          condition ← (cond_2 and less) or (cond_1 and equal) or
                                       (cond_0 and unordered)
                          FCR[31]_23 ← condition
                          COC[1] ← condition
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
         Floating-Point Exceptions:
                  Unimplemented operation exception
                  Invalid operation exception
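
         The compare operation can be modeled on a host as a short sketch. The
         function name is hypothetical; cond is the 4-bit compare condition, and
         the sketch follows only the operation pseudocode for this instruction
         (it does not distinguish signaling from quiet NaNs).

```python
import math


def c_cond(fs: float, ft: float, cond: int):
    """Return (condition, invalid) for the 4-bit compare condition cond."""
    if math.isnan(fs) or math.isnan(ft):
        less, equal, unordered = False, False, True
        invalid = bool(cond & 0b1000)    # cond bit 3 requests the invalid exception
    else:
        less, equal, unordered = fs < ft, fs == ft, False
        invalid = False
    condition = bool(
        ((cond & 0b0100) and less)
        or ((cond & 0b0010) and equal)
        or ((cond & 0b0001) and unordered)
    )
    return condition, invalid


print(c_cond(1.0, 2.0, 0b0100))            # "less than" predicate
print(c_cond(0.0, -0.0, 0b0010))           # the sign of zero is ignored
print(c_cond(float("nan"), 1.0, 0b1100))   # NaN operand with cond bit 3 set
```

         Because each branch instruction can test the condition in either sense,
         the 16 encodings of cond cover all 32 predicates.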
                              Floating-point
CEIL.L.fmt                   Ceiling To Long                           CEIL.L.fmt
                            Fixed-point Format
31         26 25          21 20          16 15          11 10            6 5                  0
     Format:
            CEIL.L.fmt fd, fs
     Description:
            The contents of floating-point register fs are arithmetically converted into a 64-bit
            fixed-point format, and the result is stored to floating-point register fd. The source
            operand is processed in the floating-point format fmt.
            The result of the conversion is rounded toward the +∞ direction, regardless of the
            current rounding mode.
            This instruction is valid only for conversion from the single- or double-precision
            floating-point format.
            If the FR bit of the Status register is 0, only an even number can be specified as a
            register number because adjacent even-numbered and odd-numbered registers are
            used in pairs as a floating-point register. If an odd number is specified, the
            operation is undefined. If the FR bit of the Status register is 1, both the odd and
            even register numbers are valid.
            If the source operand is infinite or NaN, or if the rounded result is outside the
            range of –2^63 to 2^63 – 1, the invalid operation exception occurs. If the invalid
            operation exception is not enabled, the exception does not occur, and 2^63 – 1 is
            returned.
            This operation is defined in the 64-bit mode and 32-bit Kernel mode. If this
            instruction is executed during 32-bit User/Supervisor mode, a reserved instruction
            exception occurs.
Operation:
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
                  Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
         Floating-Point Exceptions:
                  Invalid operation exception
                  Unimplemented operation exception
                  Inexact operation exception
                  Overflow exception
         Restrictions:
                  An unimplemented operation exception will occur in the following cases.
                         •   If an overflow occurs during conversion to integer format
                         •   If the source operand is an infinite number
                         •   If the source operand is NaN
                  Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                  format to a fixed-point format is 1, an unimplemented operation exception will
                  occur. This includes cases when there is an overflow during conversion.
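
         The range rule given in the description of CEIL.L.fmt can be sketched as
         follows. The names are hypothetical, and the sketch models only the
         description (invalid operation, or 2^63 – 1 when that exception is not
         enabled). It does not model the VR4300 restrictions listed above, under
         which an unimplemented operation exception is raised instead.

```python
import math

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)


def ceil_l(value: float, invalid_enabled: bool = False):
    """Return (result, exception name or None) for a round-toward-+infinity convert."""
    if math.isnan(value) or math.isinf(value):
        out_of_range = True
        rounded = None
    else:
        rounded = math.ceil(value)       # always rounds toward +infinity
        out_of_range = not (INT64_MIN <= rounded <= INT64_MAX)
    if out_of_range:
        if invalid_enabled:
            return None, "invalid operation"
        return INT64_MAX, None           # default result when the trap is disabled
    return rounded, None


print(ceil_l(2.1))
print(ceil_l(-2.9))
print(ceil_l(float("inf")))
```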
                              Floating-point
CEIL.W.fmt                   Ceiling To Single                    CEIL.W.fmt
                            Fixed-point Format
31         26 25          21 20          16 15          11 10           6 5                   0
     Format:
            CEIL.W.fmt fd, fs
     Description:
            The contents of floating-point register fs are arithmetically converted into a 32-bit
            fixed-point format, and the result is stored to floating-point register fd. The source
            operand is processed in the floating-point format fmt.
            The result of the conversion is rounded toward the +∞ direction, regardless of the
            current rounding mode.
            This instruction is valid only for conversion from the single- or double-precision
            floating-point format.
            If the FR bit of the Status register is 0, only an even number can be specified as a
            register number because adjacent even-numbered and odd-numbered registers are
            used in pairs as a floating-point register. If an odd number is specified, the
            operation is undefined. If the FR bit of the Status register is 1, both the odd and
            even register numbers are valid.
            If the source operand is infinite or NaN, or if the rounded result is outside the
            range of –2^31 to 2^31 – 1, the invalid operation exception occurs. If the invalid
            operation exception is not enabled, the exception does not occur, and 2^31 – 1 is
            returned.
Operation:
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
         Floating-Point Exceptions:
                  Invalid operation exception
                  Unimplemented operation exception
                  Inexact operation exception
                  Overflow exception
         Restrictions:
                  An unimplemented operation exception will occur in the following cases.
                         •   If an overflow occurs during conversion to integer format
                         •   If the source operand is an infinite number
                         •   If the source operand is NaN
                  Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                  format to a fixed-point format is 1, an unimplemented operation exception will
                  occur. This includes cases when there is an overflow during conversion.
      COP1         CF                rt              fs                 0
     010001      00010                                            000 0000 0000
        6           5                 5              5                   11
       Format:
              CFC1 rt, fs
       Description:
              The contents of the floating-point control register fs are loaded into general
              purpose register rt.
              This instruction is only defined when fs equals 0 or 31.
              The contents of general purpose register rt are undefined while the instruction
              immediately following this load instruction is being executed.
Operation:
       32     T:   temp ← FCR[fs]
              T+1: GPR[rt] ← temp
       64     T:   temp ← FCR[fs]
              T+1: GPR[rt] ← (temp_31)^32 || temp
       Exceptions:
              Coprocessor unusable exception
        COP1             CT                 rt               fs                 0
       010001           00110                                              000 0000 0000
         6                5                  5              5                    11
         Format:
                   CTC1 rt, fs
         Description:
                   The contents of general purpose register rt are loaded into floating-point
                   control register fs.
                   This instruction is defined if fs is 0 or 31.
                   If writing data to the floating-point control/status register (FCR31) sets a
                   cause bit and its corresponding enable bit, the floating-point exception
                   occurs. The data is written to the register before the exception occurs.
                   The contents of the floating-point control register fs are undefined while the
                   instruction immediately following this instruction is executed.
         Operation:
              32         T:        temp ← GPR[rt]
                         T+1:      FCR[fs] ← temp
                                   COC[1] ← FCR[31]_23
              64         T:        temp ← GPR[rt]_31...0
                         T+1:      FCR[fs] ← temp
                                   COC[1] ← FCR[31]_23
         Exceptions:
                   Coprocessor unusable exception
                   Floating-point exception
         Floating-Point Exceptions:
                   Invalid operation exception
                   Unimplemented operation exception
                   Division by zero exception
                   Inexact operation exception
                   Overflow exception
                   Underflow exception
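
         The rule that a CTC1 write to FCR31 raises the floating-point exception when
         a cause bit and its enable bit are both set can be sketched as follows. The
         bit positions of the cause and enable fields are assumptions taken from the
         usual MIPS FCR31 layout, not from this section; only bit 23 (the condition
         bit copied to COC[1]) appears in the operation above. The names are
         hypothetical.

```python
CAUSE_SHIFT = 12      # assumed: cause bits for I, U, O, Z, V at bits 12..16
ENABLE_SHIFT = 7      # assumed: enable bits for I, U, O, Z, V at bits 7..11
CONDITION_BIT = 23    # FCR[31] bit 23, copied to COC[1]


def ctc1_fcr31(gpr_value: int):
    """Return (new FCR31, COC[1], whether a floating-point exception is raised)."""
    fcr31 = gpr_value & 0xFFFFFFFF            # low 32 bits of GPR[rt] are written
    cause = (fcr31 >> CAUSE_SHIFT) & 0x1F
    enable = (fcr31 >> ENABLE_SHIFT) & 0x1F
    raises = (cause & enable) != 0            # any cause bit with its enable bit set
    coc1 = (fcr31 >> CONDITION_BIT) & 1
    # The register is updated before the exception is taken.
    return fcr31, coc1, raises


print(ctc1_fcr31(1 << 23))
print(ctc1_fcr31((1 << 16) | (1 << 11)))
```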
                               Floating-point
CVT.D.fmt                    Convert To Double                          CVT.D.fmt
                            Floating-point Format
31 26 25 21 20 16 15 11 10 6 5 0
     Format:
              CVT.D.fmt fd, fs
     Description:
              The contents of floating-point register fs are arithmetically converted into a
              double-precision floating-point format, and the result is stored to floating-point
              register fd. The source operand is processed in the floating-point format fmt.
              This instruction is valid only for conversion from the single-precision
              floating-point format, and the 32-bit or 64-bit fixed-point format.
              When the source is in the single-precision floating-point format or the 32-bit
              fixed-point format, this conversion is exact and no accuracy is lost.
              If the FR bit of the Status register is 0, only an even number can be specified as a
              register number because adjacent even-numbered and odd-numbered registers are
              used in pairs as a floating-point register. If an odd number is specified, the
              operation is undefined. If the FR bit of the Status register is 1, both the odd and
              even register numbers are valid.
     Operation:
     32, 64     T:     StoreFPR (fd, D, ConvertFmt (ValueFPR (fs, fmt) , fmt, D) )
     Exceptions:
              Coprocessor unusable exception
              Floating-point exception
     Floating-Point Exceptions:
              Invalid operation exception
              Unimplemented operation exception
              Inexact operation exception
         Restrictions:
                An unimplemented operation exception will occur in the following cases.
                    •    If an overflow occurs during conversion to integer format
                    •    If the source operand is an infinite number
                    •    If the source operand is NaN
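
         The exactness property of CVT.D.fmt can be checked on a host with a short
         sketch. The function names are hypothetical, and the host is assumed to use
         IEEE 754 doubles. A single-precision value or a 32-bit integer always
         converts without loss, whereas a 64-bit integer may need rounding.

```python
import struct


def single_to_double(bits32: int) -> float:
    """Interpret a 32-bit pattern as an IEEE single and widen it to a double."""
    return struct.unpack(">f", struct.pack(">I", bits32 & 0xFFFFFFFF))[0]


def double_to_single_bits(value: float) -> int:
    """Narrow a double to single precision and return the 32-bit pattern."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def int_to_double_is_exact(value: int) -> bool:
    """True if the integer survives conversion to a double unchanged."""
    return int(float(value)) == value


# Widening a single and narrowing it again recovers the same bit pattern.
print(hex(double_to_single_bits(single_to_double(0x3DCCCCCD))))
print(int_to_double_is_exact(2**31 - 1), int_to_double_is_exact(2**53 + 1))
```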
                              Floating-point
CVT.L.fmt                    Convert To Long                           CVT.L.fmt
                            Fixed-point Format
31         26 25          21 20          16 15          11 10            6 5                  0
     Format:
            CVT.L.fmt fd, fs
     Description:
            The contents of floating-point register fs are arithmetically converted into a 64-bit
            fixed-point format, and the result is stored to floating-point register fd. The source
            operand is processed in the floating-point format fmt.
            This instruction is valid only for conversion from the single- or double-precision
            floating-point format.
            If the FR bit of the Status register is 0, only an even number can be specified as a
            register number because adjacent even-numbered and odd-numbered registers are
            used in pairs as a floating-point register. If an odd number is specified, the
            operation is undefined. If the FR bit of the Status register is 1, both the odd and
            even register numbers are valid.
            If the source operand is infinite or NaN, or if the rounded result is outside the
            range of –2^63 to 2^63 – 1, the invalid operation exception occurs. If the invalid
            operation exception is not enabled, the exception does not occur, and 2^63 – 1 is
            returned.
            This operation is defined in the 64-bit mode and 32-bit Kernel mode. If this
            instruction is executed during 32-bit User/Supervisor mode, a reserved instruction
            exception occurs.
Operation:
         Exceptions:
                Coprocessor unusable exception
                Floating-point exception
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
         Floating-Point Exceptions:
                Invalid operation exception
                Unimplemented operation exception
                Inexact operation exception
                Overflow exception
         Restrictions:
                An unimplemented operation exception will occur in the following cases.
                       •   If an overflow occurs during conversion to integer format
                       •   If the source operand is an infinite number
                       •   If the source operand is NaN
                Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                format to a fixed-point format is 1, an unimplemented operation exception will
                occur. This includes cases when there is an overflow during conversion.
                               Floating-point
CVT.S.fmt                     Convert To Single                       CVT.S.fmt
                            Floating-point Format
31 26 25 21 20 16 15 11 10 6 5 0
      Format:
               CVT.S.fmt fd, fs
      Description:
               The contents of floating-point register fs are arithmetically converted into a
               single-precision floating-point format, and the result is stored to floating-point
               register fd. The source operand is processed in the floating-point format fmt. The
               result of the conversion is rounded according to the current rounding mode.
               This instruction is valid only for conversion from the double-precision
               floating-point format, or from the 32-bit or 64-bit fixed-point format.
               If the FR bit of the Status register is 0, only an even number can be specified as a
               register number because adjacent even-numbered and odd-numbered registers are
               used in pairs as a floating-point register. If an odd number is specified, the
               operation is undefined. If the FR bit of the Status register is 1, both the odd and
               even register numbers are valid.
      Operation:
      32, 64     T:     StoreFPR (fd, S, ConvertFmt (ValueFPR (fs, fmt) , fmt, S) )
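As an illustration of why this conversion can raise the inexact operation exception, the sketch below models the 32-bit fixed-point to single-precision case in C. The function names are hypothetical, and the cast uses the host's rounding mode (round to nearest by default) as a stand-in for the current rounding mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of CVT.S.W: convert a 32-bit integer to single precision. */
float cvt_s_w(int32_t value)
{
    /* volatile forces rounding to single precision on hosts that
       otherwise keep intermediate results in a wider format. */
    volatile float rounded = (float)value;
    return rounded;
}

/* True when the 24-bit significand cannot hold the value exactly. */
bool cvt_s_w_is_inexact(int32_t value)
{
    return (int64_t)cvt_s_w(value) != (int64_t)value;
}
```

For example, 16777216 (2^24) converts exactly, while 16777217 does not fit in a 24-bit significand and must be rounded.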
      Exceptions:
               Coprocessor unusable exception
               Floating-point exception
      Floating-Point Exceptions:
               Invalid operation exception
               Unimplemented operation exception
               Inexact operation exception
               Overflow exception
               Underflow exception
         Restrictions:
                An unimplemented operation exception will occur in the following cases.
                    •    If an overflow occurs during conversion to integer format
                    •    If the source operand is an infinite number
                    •    If the source operand is NaN
                              Floating-point
CVT.W.fmt                   Convert To Single                      CVT.W.fmt
                            Fixed-point Format
31 26 25 21 20 16 15 11 10 6 5 0
     Format:
            CVT.W.fmt fd, fs
     Description:
            The contents of floating-point register fs are arithmetically converted into a 32-bit
            fixed-point format, and the result is stored to floating-point register fd. The source
            operand is processed in the floating-point format fmt.
            This instruction is valid only for conversion from the single- or double-precision
            floating-point format.
            If the FR bit of the Status register is 0, only an even number can be specified as a
            register number because adjacent even-numbered and odd-numbered registers are
            used in pairs as a floating-point register. If an odd number is specified, the
            operation is undefined. If the FR bit of the Status register is 1, both the odd and
            even register numbers are valid.
            If the source operand is infinite or NaN, or if the rounded result is outside the
            range of 2^31–1 to –2^31, the invalid operation exception occurs. If the invalid
            operation exception is not enabled, the exception does not occur, and 2^31–1 is
            returned.
Operation:
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
         Floating-Point Exceptions:
                  Invalid operation exception
                  Unimplemented operation exception
                  Inexact operation exception
                  Overflow exception
         Restrictions:
                  An unimplemented operation exception will occur in the following cases.
                         •   If an overflow occurs during conversion to integer format
                         •   If the source operand is an infinite number
                         •   If the source operand is NaN
                  Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                  format to a fixed-point format is 1, an unimplemented operation exception will
                  occur. This includes cases when there is an overflow during conversion.
     Format:
              DIV.fmt fd, fs, ft
     Description:
              The contents of floating-point register fs are divided by those of floating-point
              register ft, and the result is stored to floating-point register fd. The operand is
              processed in the floating-point format fmt. The operation is executed as if with
              infinite precision, and the result is rounded according to the current rounding
              mode.
              This instruction is valid only for the single- and double-precision
              floating-point formats.
              If the FR bit of the Status register is 0, only an even number can be specified as a
              register number because adjacent even-numbered and odd-numbered registers are
              used in pairs as a floating-point register. If an odd number is specified, the
              operation is undefined. If the FR bit of the Status register is 1, both the odd and
              even register numbers are valid.
Operation:
     Exceptions:
              Coprocessor unusable exception
              Floating-point exception
     Floating-Point Exceptions:
              Unimplemented operation exception                Invalid operation exception
              Division-by-zero exception                       Inexact operation exception
              Overflow exception                               Underflow exception
        COP1         DMF                rt              fs                  0
       010001       00001                                             000 0000 0000
          6           5                  5               5                 11
         Format:
                DMFC1 rt, fs
         Description:
                The contents of Floating-Point General Purpose register fs are stored into CPU
                general purpose register rt.
                The contents of general purpose register rt are undefined while the instruction
                immediately following this instruction is being executed.
                The FR bit of the Status register indicates whether all the 32 registers of the
                processor can be specified. If the FR bit is 0, and the least-significant bit of fs is
                1, this instruction is undefined.
                 The operation is undefined if an odd number is specified when the FR bit of the
                status register is 0. If the FR bit is 1, both the odd-numbered and even-numbered
                registers are valid.
                This operation is defined in 64-bit mode or 32-bit Kernel mode.
Operation:
  64      T:       if SR_26 = 1 then
                       data ← FGR [fs]
                   elseif fs_0 = 0 then
                       data ← FGR [fs + 1] || FGR [fs]
                   else
                       data ← undefined^64
                   endif
          T+1:     GPR [rt] ← data
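The data selection in the operation above can be sketched in C. This is a simplified model under stated assumptions: the `fpu_model` structure and `dmfc1_read` function are hypothetical, the FR = 0 case follows the pseudocode literally by treating each FGR as supplying 32 bits, and the one-instruction delay before rt becomes valid is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register-file model: 32 FGRs, each held in a 64-bit slot. */
struct fpu_model {
    uint64_t fgr[32];
    bool fr; /* FR bit of the Status register (bit 26) */
};

/* Models the DMFC1 data selection. Returns false for the undefined case. */
bool dmfc1_read(const struct fpu_model *fpu, unsigned fs, uint64_t *data)
{
    fs &= 31u;
    if (fpu->fr) {
        *data = fpu->fgr[fs]; /* FR = 1: the full 64-bit register */
        return true;
    }
    if ((fs & 1u) == 0) {
        /* FR = 0: odd register supplies bits 63..32, even register bits 31..0. */
        uint64_t hi = fpu->fgr[fs + 1] & 0xFFFFFFFFu;
        uint64_t lo = fpu->fgr[fs] & 0xFFFFFFFFu;
        *data = (hi << 32) | lo;
        return true;
    }
    return false; /* odd fs with FR = 0: the result is undefined */
}
```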
  Exceptions:
         Coprocessor unusable exception
         Floating-point exception
         Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
  Floating-Point Exceptions:
         Unimplemented operation exception
        COP1         DMT                rt              fs                  0
       010001       00101                                             000 0000 0000
          6           5                  5               5                 11
         Format:
                DMTC1 rt, fs
         Description:
                The contents of general purpose register rt are loaded into Floating-Point General
                Purpose register fs.
                The contents of fs are undefined while the instruction immediately following this
                instruction is being executed.
                The FR bit of the Status register indicates whether all the 32 registers of the
                processor can be specified. If the FR bit is 0, and the least-significant bit of fs is
                1, this instruction is undefined.
                The operation is undefined if an odd number is specified when the FR bit of the
                status register is 0. If the FR bit is 1, both the odd-numbered and even-numbered
                registers are valid.
                This operation is defined in 64-bit mode or 32-bit Kernel mode.
  Operation:
  64       T:      data ← GPR [rt]
           T+1:    if SR_26 = 1 then
                       FGR [fs] ← data
                   elseif fs_0 = 0 then
                       FGR [fs+1] ← data_63..32
                       FGR [fs] ← data_31..0
                   else
                       undefined_result
                   endif
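A minimal C sketch of the register update above, assuming a hypothetical 32-entry array `fgr` in place of the real register file. The function name is illustrative only, and the delay during which fs is undefined is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the DMTC1 register update. Returns false for the undefined case. */
bool dmtc1_write(uint64_t fgr[32], bool fr, unsigned fs, uint64_t data)
{
    fs &= 31u;
    if (fr) {
        fgr[fs] = data; /* FR = 1: one 64-bit register receives all bits */
        return true;
    }
    if ((fs & 1u) == 0) {
        fgr[fs + 1] = data >> 32;         /* bits 63..32 go to the odd register */
        fgr[fs] = data & 0xFFFFFFFFu;     /* bits 31..0 go to the even register */
        return true;
    }
    return false; /* odd fs with FR = 0: the result is undefined */
}
```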
  Exceptions:
         Coprocessor unusable exception
         Floating-point exception
         Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
  Floating-Point Exceptions:
         Unimplemented operation exception
                                     Floating-point
FLOOR.L.fmt                          Floor To Long                FLOOR.L.fmt
                                   Fixed-point Format
31 26 25 21 20 16 15 11 10 6 5 0
         Format:
                FLOOR.L.fmt fd, fs
         Description:
                The contents of floating-point register fs are arithmetically converted into a 64-bit
                fixed-point format, and the result is stored to floating-point register fd. The source
                operand is processed in the floating-point format fmt.
                 The result of the conversion is rounded toward –∞, regardless of the
                current rounding mode.
                This instruction is valid only for conversion from the single- or double-precision
                floating-point format.
                If the FR bit of the Status register is 0, only an even number can be specified as a
                register number because adjacent even-numbered and odd-numbered registers are
                 used in pairs as a floating-point register. If an odd number is specified, the
                operation is undefined. If the FR bit of the Status register is 1, both the odd and
                even register numbers are valid.
                 If the source operand is infinite or NaN, or if the rounded result is outside the
                 range of 2^63–1 to –2^63, the invalid operation exception occurs. If the invalid
                 operation exception is not enabled, the exception does not occur, and 2^63–1 is
                 returned.
                This operation is defined in the 64-bit mode and 32-bit Kernel mode. If this
                instruction is executed during 32-bit User/Supervisor mode, a reserved instruction
                exception occurs.
Operation:
   Exceptions:
          Coprocessor unusable exception
          Floating-point exception
          Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
   Floating-Point Exceptions:
          Invalid operation exception
          Unimplemented operation exception
          Inexact operation exception
          Overflow exception
   Restrictions:
          An unimplemented operation exception will occur in the following cases.
                 •   If an overflow occurs during conversion to integer format
                 •   If the source operand is an infinite number
                 •   If the source operand is NaN
          Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
          format to a fixed-point format is 1, an unimplemented operation exception will
          occur. This includes cases when there is an overflow during conversion.
                                     Floating-point
FLOOR.W.fmt                          Floor To Single           FLOOR.W.fmt
                                   Fixed-point Format
31 26 25 21 20 16 15 11 10 6 5 0
         Format:
                FLOOR.W.fmt fd, fs
         Description:
                The contents of floating-point register fs are arithmetically converted into a 32-bit
                fixed-point format, and the result is stored to floating-point register fd. The source
                operand is processed in the floating-point format fmt.
                 The result of the conversion is rounded toward –∞, regardless of the
                current rounding mode.
                This instruction is valid only for conversion from the single- or double-precision
                floating-point format.
                If the FR bit of the Status register is 0, only an even number can be specified as a
                register number because adjacent even-numbered and odd-numbered registers are
                 used in pairs as a floating-point register. If an odd number is specified, the
                operation is undefined. If the FR bit of the Status register is 1, both the odd and
                even register numbers are valid.
                 If the source operand is infinite or NaN, or if the rounded result is outside the
                 range of 2^31–1 to –2^31, the invalid operation exception occurs. If the invalid
                 operation exception is not enabled, the exception does not occur, and 2^31–1 is
                 returned.
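Rounding toward –∞ together with the range rule can be sketched in C as follows. The function name `floor_w_d` is hypothetical, the sketch covers only a double-precision source with the invalid operation exception disabled, and the unimplemented operation cases under Restrictions are not modelled.

```c
#include <stdint.h>

/* Hypothetical model of FLOOR.W.D with the invalid operation exception disabled. */
int32_t floor_w_d(double x)
{
    /* NaN fails every comparison, so the valid range is tested positively.
       The floor of x fits in 32 bits exactly when -2^31 <= x < 2^31. */
    if (!(x >= -2147483648.0 && x < 2147483648.0)) {
        return INT32_MAX;
    }
    int32_t t = (int32_t)x; /* a C cast truncates toward zero */
    if ((double)t > x) {
        t -= 1;             /* step down for negative non-integers */
    }
    return t;
}
```

The only difference from truncation appears for negative non-integers: –2.7 becomes –3 rather than –2.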
Operation:
  Exceptions:
           Coprocessor unusable exception
           Floating-point exception
  Floating-Point Exceptions:
           Invalid operation exception
           Unimplemented operation exception
           Inexact operation exception
           Overflow exception
  Restrictions:
           An unimplemented operation exception will occur in the following cases.
                  •   If an overflow occurs during conversion to integer format
                  •   If the source operand is an infinite number
                  •   If the source operand is NaN
           Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
           format to a fixed-point format is 1, an unimplemented operation exception will
           occur. This includes cases when there is an overflow during conversion.
         Format:
                LDC1 ft, offset (base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address.
                If the FR bit of the Status register is 0, the contents of the doubleword at the
                memory location specified by the virtual address are loaded to floating-point
                registers ft and ft+1. At this time, the high-order 32 bits of the doubleword are
                stored to an odd-numbered register specified by ft+1, and the low-order 32 bits are
                stored to an even-numbered register specified by ft. The operation is undefined if
                the least significant bit in the ft field is not 0.
                If the FR bit is 1, the contents of the doubleword at the memory location specified
                by the virtual address are loaded to floating-point register ft.
                If any of the low-order three bits of the address are not zero, an address error
                exception occurs.
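The address formation and alignment check described above can be sketched in C. The function names are hypothetical, and the sketch assumes a 64-bit address computation; address translation and the resulting TLB exceptions are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the 16-bit offset and add it to the base register value. */
uint64_t ldc1_effective_address(uint64_t base, uint16_t offset)
{
    uint64_t extended = offset;
    if (offset & 0x8000u) {
        extended |= 0xFFFFFFFFFFFF0000ull; /* replicate bit 15 upward */
    }
    return base + extended;                /* wraps modulo 2^64 */
}

/* A doubleword access requires the three low-order address bits to be zero. */
bool ldc1_alignment_ok(uint64_t vaddr)
{
    return (vaddr & 0x7u) == 0;
}
```

An offset of 0xFFF8 therefore acts as –8: with a base of 0x1000 it yields 0xFF8, which is doubleword aligned.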
Operation:
     Exceptions:
            Coprocessor unusable exception
            TLB miss exception
            TLB invalid exception
            Bus error exception
            Address error exception
         Format:
                LWC1 ft, offset (base)
         Description:
                The 16-bit offset is sign-extended and added to the contents of general purpose
                register base to form a virtual address. The contents of the word at the memory
                location specified by the virtual address are loaded to floating-point register ft.
                If the FR bit of the Status register is 0 and if the least-significant bit in the ft field
                is 0, the contents of the word are stored to the low-order 32 bits of floating-point
                 register ft. If the least-significant bit in the ft field is 1, the contents of the word
                are stored to the high-order 32 bits of floating-point register ft-1.
                If the FR bit is 1, all the 64-bit floating-point registers can be accessed; therefore,
                the contents of the word are stored to floating-point register ft. The value of the
                high-order 32 bits is undefined.
                If either of the low-order two bits of the address is not zero, an address error
                exception occurs.
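The placement of the loaded word can be sketched in C, assuming a hypothetical array `fpr` of 64-bit floating-point registers. For FR = 1 the manual leaves the high-order 32 bits undefined; this sketch simply preserves them, which is only one possible outcome.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of where LWC1 places the loaded word. */
void lwc1_writeback(uint64_t fpr[32], bool fr, unsigned ft, uint32_t word)
{
    ft &= 31u;
    if (fr || (ft & 1u) == 0) {
        /* Low-order 32 bits of register ft. */
        fpr[ft] = (fpr[ft] & 0xFFFFFFFF00000000ull) | word;
    } else {
        /* FR = 0 and odd ft: high-order 32 bits of register ft-1. */
        fpr[ft - 1] = (fpr[ft - 1] & 0x00000000FFFFFFFFull)
                    | ((uint64_t)word << 32);
    }
}
```

With FR = 0, two word loads to ft = 4 and ft = 5 thus fill the low and high halves of the same 64-bit register.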
Operation:
     Exceptions:
            Coprocessor unusable exception
            TLB miss exception
            TLB invalid exception
            Bus error exception
            Address error exception
        COP1          MF                 rt                fs                 0
       010001       00000                                               000 0000 0000
         6             5                   5              5                   11
         Format:
                MFC1 rt, fs
         Description:
                 The contents of floating-point general purpose register fs are stored to CPU
                 general purpose register rt.
                 The contents of general purpose register rt are undefined while the instruction
                 immediately following this instruction is being executed.
                 If the FR bit of the Status register is 0 and the least-significant bit in the fs field
                 is 0, the low-order 32 bits of floating-point register fs are stored to general
                 purpose register rt. If the least-significant bit in the fs field is 1, the high-order 32
                 bits of floating-point register fs-1 are stored to general purpose register rt.
                 If the FR bit is 1, all the 64-bit floating-point registers can be accessed; therefore,
                 the low-order 32 bits of floating-point register fs are stored to general purpose
                 register rt.
         Operation:
         32     T:       data ← FGR [fs]_31...0
                T+1:     GPR [rt] ← data
         64     T:       data ← FGR [fs]_31...0
                T+1:     GPR [rt] ← (data_31)^32 || data
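In 64-bit mode the operation above replicates bit 31 of the word into the upper 32 bits of the destination. A minimal C sketch of that sign extension, with a hypothetical function name:

```c
#include <stdint.h>

/* 64-bit mode: replicate bit 31 of the word into bits 63..32 of the GPR. */
uint64_t mfc1_gpr_value(uint32_t word)
{
    uint64_t value = word;
    if (word & 0x80000000u) {
        value |= 0xFFFFFFFF00000000ull;
    }
    return value;
}
```

For example, the single-precision bit pattern 0xBF800000 arrives in the general purpose register as 0xFFFFFFFFBF800000.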
         Exceptions:
                Coprocessor unusable exception
     Format:
              MOV.fmt fd, fs
     Description:
              The contents of floating-point register fs are stored to floating-point register fd.
              The operand is processed in the floating-point format fmt.
              This instruction is not executed arithmetically, and no IEEE 754 exception
              occurs.
              This instruction is valid only in the single- and double-precision floating-point
              formats.
              If the FR bit of the status register is 0, only an even number can be specified as a
              register number because adjacent even-numbered and odd-numbered registers are
              used in pairs as a floating-point register. If an odd number is specified, the
              operation is undefined. If the FR bit of the Status register is 1, both the odd and even
              register numbers are valid.
Operation:
     Exceptions:
              Coprocessor unusable exception
              Floating-point exception
     Floating-Point Exceptions:
              Unimplemented operation exception
                                    Move To FPU
MTC1                               (Coprocessor 1)                                  MTC1
  31          26 25           21 20           16 15           11 10                               0
        COP1            MT              rt               fs                 0
       010001          00100                                          000 0000 0000
         6               5               5              5                   11
         Format:
                MTC1 rt, fs
         Description:
                 The contents of CPU general purpose register rt are loaded into floating-point
                 general purpose register fs.
                 The contents of floating-point register fs are undefined while the instruction
                 immediately following this instruction is being executed.
                 The FR bit of the Status register specifies the method of access to the
                 floating-point general purpose registers.
                 If the FR bit is 0, all 32 floating-point general purpose registers can be
                 accessed. When transferring double-precision data for use by a floating-point
                 operation instruction, access the odd-numbered register for the high-order 32 bits
                 and the even-numbered register for the low-order 32 bits.
                 If the FR bit is 1, all 32 floating-point general purpose registers can be
                 accessed, but only the low-order 32 bits of each register are accessed for data.
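To illustrate the FR = 0 case, the C sketch below splits a double-precision value into the two words that a pair of MTC1 instructions would transfer. The function name is hypothetical, and the host is assumed to store `double` in IEEE 754 64-bit format.

```c
#include <stdint.h>
#include <string.h>

/* Split a double into the words for the even and odd registers (FR = 0). */
void split_double_for_mtc1(double value, uint32_t *even_word, uint32_t *odd_word)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);          /* well-defined type pun */
    *even_word = (uint32_t)(bits & 0xFFFFFFFFu); /* low-order 32 bits */
    *odd_word = (uint32_t)(bits >> 32);          /* high-order 32 bits */
}
```

For the value 1.0 the odd register receives 0x3FF00000 and the even register receives 0.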
Operation:
         Exceptions:
                Coprocessor unusable exception
     Format:
              MUL.fmt fd, fs, ft
     Description:
              The contents of floating-point register fs are multiplied by those of floating-point
              register ft, and the result is stored to floating-point register fd. The operand is
              processed in the floating-point format fmt.
              This instruction is valid only for the single- and double-precision
              floating-point formats.
              If the FR bit of the Status register is 0, only an even number can be specified as a
              register number because adjacent even-numbered and odd-numbered registers are
              used in pairs as a floating-point register. If an odd number is specified, the
              operation is undefined. If the FR bit of the Status register is 1, both the odd and
              even register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) * ValueFPR (ft, fmt) )
     Exceptions:
              Coprocessor unusable exception
              Floating-point exception
     Floating-Point Exceptions:
              Unimplemented operation exception
              Invalid operation exception
              Inexact operation exception
              Overflow exception
              Underflow exception
         Format:
                  NEG.fmt fd, fs
         Description:
                  The sign of the contents of floating-point register fs is inverted and the result is
                  stored to floating-point register fd. The operand is processed in the floating-point
                  format fmt.
                  The sign is inverted arithmetically. Therefore, the instruction is invalid if NaN is
                  specified as the operand.
                  This instruction is valid only for the single- and double-precision
                  floating-point formats.
                  If the FR bit of the Status register is 0, only an even number can be specified as a
                  register number because adjacent even-numbered and odd-numbered registers are
                  used in pairs as a floating-point register. If an odd number is specified, the
                  operation is undefined. If the FR bit of the Status register is 1, both the odd and
                  even register numbers are valid.
Operation:
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
         Floating-Point Exceptions:
                  Unimplemented operation exception
                  Invalid operation exception
                                 Floating-point
ROUND.L.fmt                     Round To Long                 ROUND.L.fmt
                               Fixed-point Format
31 26 25 21 20 16 15 11 10 6 5 0
     Format:
            ROUND.L.fmt fd, fs
     Description:
            The contents of floating-point register fs are converted into the 64-bit fixed-point
            format, and the result is stored to floating-point register fd. The source operand is
            processed in the floating-point format fmt.
            The result of the conversion is rounded to the nearest value, or to the even value
            when two values are equally near, regardless of the current rounding mode.
            This instruction is valid only for conversion from the single- or double-precision
            floating-point format.
            If the FR bit of the Status register is 0, only an even number can be specified as a
            register number because adjacent even-numbered and odd-numbered registers are
            used in pairs as a floating-point register. If an odd number is specified, the
            operation is undefined. If the FR bit of the Status register is 1, both the odd and
            even register numbers are valid.
            If the source operand is infinite or NaN, or if the rounded result is outside the
            range of 2^63–1 to –2^63, the invalid operation exception occurs. If the invalid
            operation exception is not enabled, the exception does not occur, and 2^63–1 is
            returned.
            This operation is defined in the 64-bit mode and 32-bit Kernel mode. If this
            instruction is executed during 32-bit User/Supervisor mode, a reserved instruction
            exception occurs.
Operation:
         Exceptions:
                Coprocessor unusable exception
                Floating-point exception
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
         Floating-Point Exceptions:
                Invalid operation exception
                Unimplemented operation exception
                Inexact operation exception
                Overflow exception
         Restrictions:
                An unimplemented operation exception will occur in the following cases.
                       •   If an overflow occurs during conversion to integer format
                       •   If the source operand is an infinite number
                       •   If the source operand is NaN
                Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                format to a fixed-point format is 1, an unimplemented operation exception will
                occur. This includes cases when there is an overflow during conversion.
                                 Floating-point
ROUND.W.fmt                     Round To Single                 ROUND.W.fmt
                               Fixed-point Format
31 26 25 21 20 16 15 11 10 6 5 0
     Format:
            ROUND.W.fmt fd, fs
     Description:
            The contents of floating-point register fs are converted into the 32-bit fixed-point
            format, and the result is stored to floating-point register fd. The source operand is
            processed in the floating-point format fmt.
            The result of the conversion is rounded to the nearest value, or to the even value
            when two values are equally near, regardless of the current rounding mode.
            This instruction is valid only for conversion from the single- or double-precision
            floating-point format.
            If the FR bit of the Status register is 0, only an even number can be specified as a
            register number because adjacent even-numbered and odd-numbered registers are
            used in pairs as a floating-point register. If an odd number is specified, the
            operation is undefined. If the FR bit of the Status register is 1, both the odd and
            even register numbers are valid.
            If the source operand is infinite or NaN, or if the rounded result is outside the
            range of 2^31–1 to –2^31, the invalid operation exception occurs. If the invalid
            operation exception is not enabled, the exception does not occur, and 2^31–1 is
            returned.
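Rounding to the nearest value with ties going to the even value can be sketched in C as follows. The function name `round_w_d` is hypothetical, the sketch covers only a double-precision source with the invalid operation exception disabled, and the unimplemented operation cases under Restrictions are not modelled.

```c
#include <stdint.h>

/* Hypothetical model of ROUND.W.D with the invalid operation exception disabled. */
int32_t round_w_d(double x)
{
    /* NaN fails every comparison. Outside this interval the rounded
       result cannot be represented in 32 bits. */
    if (!(x >= -2147483648.5 && x < 2147483647.5)) {
        return INT32_MAX;
    }
    int64_t lo = (int64_t)x;      /* truncates toward zero */
    if ((double)lo > x) {
        lo -= 1;                  /* lo is now the floor of x */
    }
    double frac = x - (double)lo; /* distance above the floor */
    if (frac > 0.5 || (frac == 0.5 && (lo % 2) != 0)) {
        lo += 1;                  /* round up; a tie goes to the even neighbor */
    }
    return (int32_t)lo;
}
```

Under this rule 0.5 rounds to 0, 1.5 rounds to 2, and 2.5 also rounds to 2.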
Operation:
         Exceptions:
                 Coprocessor unusable exception
                 Floating-point exception
         Floating-Point Exceptions:
                 Invalid operation exception
                 Unimplemented operation exception
                 Inexact operation exception
                 Overflow exception
         Restrictions:
                 An unimplemented operation exception will occur in the following cases.
                        •   If an overflow occurs during conversion to integer format
                        •   If the source operand is an infinite number
                        •   If the source operand is NaN
                 Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                 format to a fixed-point format is 1, an unimplemented operation exception will
                 occur. This includes cases when there is an overflow during conversion.
      Format:
              SDC1 ft, offset(base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address.
              The contents of floating-point registers ft and ft+1 are stored to the memory
              position specified by the virtual address as a doubleword if the FR bit of the Status
              register is 0. At this time, the contents of the odd-numbered register specified by
              ft+1 correspond to the high-order 32 bits of the doubleword, and the contents of
              the even-numbered register specified by ft correspond to the low-order 32 bits.
              If the least significant bit in the ft field is not 0, this instruction is not defined.
              If the FR bit is 1, the contents of floating-point register ft are stored to the memory
              location specified by the virtual address as a doubleword.
              If any of the low-order three bits of the address are not zero, an address error
              exception occurs.
         Operation:
   32        T:   vAddr ← ( (offset_15)^16 || offset_15...0) + GPR [base]
                  (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                  if SR_26 = 1 then
                      data ← FGR [ft]_63...0
                  elseif ft_0 = 0 then
                      data ← FGR [ft+1]_31...0 || FGR [ft]_31...0
                  else
                      data ← undefined^64
                  endif
                  StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
   64        T:   vAddr ← ( (offset_15)^48 || offset_15...0) + GPR [base]
                  (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                  if SR_26 = 1 then
                      data ← FGR [ft]_63...0
                  elseif ft_0 = 0 then
                      data ← FGR [ft+1]_31...0 || FGR [ft]_31...0
                  else
                      data ← undefined^64
                  endif
                  StoreMemory (uncached, DOUBLEWORD, data, pAddr, vAddr, DATA)
         Exceptions:
                   Coprocessor unusable
                   TLB miss exception
                   TLB invalid exception
                   TLB modification exception
                   Bus error exception
                   Address error exception
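
The SDC1 data selection and address check described above can be sketched in Python. This is a minimal model, not the processor's implementation: the names sdc1_data and sdc1_address are hypothetical, the register file is assumed to be a list of 32 integers, and the FR = 0 case follows the Operation pseudocode (low-order 32 bits of ft+1 concatenated with low-order 32 bits of ft).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def sdc1_data(fgr, ft, fr_bit):
    """Return the doubleword SDC1 would store, or None where it is undefined."""
    if fr_bit == 1:
        # FR = 1: the whole 64-bit register ft is stored.
        return fgr[ft] & MASK64
    if ft & 1 == 0:
        # FR = 0, even ft: ft+1 supplies the high word, ft the low word.
        high = fgr[ft + 1] & MASK32
        low = fgr[ft] & MASK32
        return (high << 32) | low
    # FR = 0, odd ft: the manual leaves the instruction undefined.
    return None

def sdc1_address(base_value, offset16, addr_bits=32):
    """Sign-extend the 16-bit offset, add the base, and check alignment."""
    offset = offset16 - 0x10000 if offset16 & 0x8000 else offset16
    vaddr = (base_value + offset) & ((1 << addr_bits) - 1)
    if vaddr & 0x7:
        # Any of the low-order three bits set: address error exception.
        raise ValueError("address error exception")
    return vaddr
```

For example, with FR = 0, registers f2 = 0x11111111 and f3 = 0x22222222 would produce the doubleword 0x2222222211111111 under this model.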
                                   Floating-point
SQRT.fmt                           Square Root                            SQRT.fmt
 31            26 25         21 20          16 15          11 10           6 5                   0
      Format:
               SQRT.fmt fd, fs
      Description:
               The positive arithmetic square root of the contents of floating-point register fs is
               calculated and the result is stored to floating-point register fd. The operand is
               processed in the floating-point format fmt. The result is rounded as if calculated
               to infinite precision and then rounded according to the current rounding mode. If
               the value of the source operand is –0, the result will be –0. The result is placed in
               the floating-point register specified by fd.
               This instruction is valid only for the single- or double-precision
               floating-point format.
               If the FR bit of the Status register is 0, only an even number can be specified as a
               register number because adjacent even-numbered and odd-numbered registers are
               used in pairs as a floating-point register. If an odd number is specified, the
               operation is undefined. If the FR bit of the Status register is 1, both the odd and
               even register numbers are valid.
Operation:
      Exceptions:
               Coprocessor unusable exception
               Floating-point exception
      Floating-Point Exceptions:
               Unimplemented operation exception
               Invalid operation exception
               Inexact operation exception
         Format:
                  SUB.fmt fd, fs, ft
         Description:
                  The contents of floating-point register ft are subtracted from those of
                  floating-point register fs, and the result is stored to floating-point register fd.
                  The result is rounded as if calculated to infinite precision and then rounded
                  according to the current rounding mode.
                  This instruction is valid only for the single- or double-precision
                  floating-point format.
                  If the FR bit of the Status register is 0, only an even number can be specified as a
                  register number because adjacent even-numbered and odd-numbered registers are
                  used in pairs as a floating-point register. If an odd number is specified, the
                  operation is undefined. If the FR bit of the Status register is 1, both the odd and
                  even register numbers are valid.
Operation:
32, 64 T: StoreFPR (fd, fmt, ValueFPR (fs, fmt) – ValueFPR (ft, fmt) )
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
         Floating-Point Exceptions:
                  Unimplemented operation exception
                  Invalid operation exception
                  Inexact operation exception
                  Overflow exception
                  Underflow exception
      Format:
              SWC1 ft, offset (base)
      Description:
              The 16-bit offset is sign-extended and added to the contents of general purpose
              register base to form a virtual address. The contents of the floating-point general
              purpose register ft are stored at the memory location of the specified address.
              If the FR bit of the Status register is 0 and the least-significant bit in the ft field is
              0, the contents of the low-order 32 bits of floating-point register ft are stored. If
              the least-significant bit in the ft field is 1, the contents of the high-order 32 bits of
              floating-point register ft-1 are stored.
              If the FR bit is 1, all the 64-bit floating-point registers can be accessed; therefore,
              the contents of the low-order 32 bits of floating-point register ft are stored.
              If either of the low-order two bits of the address is not zero, an address error
              exception occurs.
         Operation:
  32         T:     vAddr ← ((offset_15)^16 || offset_15...0) + GPR[base]
                    (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                    data ← FGR[ft]_31...0
                    StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
  64         T:     vAddr ← ((offset_15)^48 || offset_15...0) + GPR[base]
                    (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
                    data ← FGR[ft]_31...0
                    StoreMemory (uncached, WORD, data, pAddr, vAddr, DATA)
         Exceptions:
                  Coprocessor unusable
                  TLB miss exception
                  TLB invalid exception
                  TLB modification exception
                  Bus error exception
                  Address error exception
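
The word selection in the SWC1 Description can be sketched as follows. This is an illustrative model that follows the Description text (not the Operation pseudocode): the function name swc1_data is hypothetical, and each element of the register list is assumed to hold a full 64-bit value, so that with FR = 0 an odd ft selects the high-order half of register ft-1.

```python
MASK32 = 0xFFFFFFFF

def swc1_data(fgr, ft, fr_bit):
    """Return the 32-bit word SWC1 would store under the Description's rules."""
    if fr_bit == 1 or ft & 1 == 0:
        # FR = 1, or FR = 0 with an even ft: low-order 32 bits of register ft.
        return fgr[ft] & MASK32
    # FR = 0 with an odd ft: high-order 32 bits of register ft-1.
    return (fgr[ft - 1] >> 32) & MASK32
```

Under this model, if f2 holds 0xAAAAAAAA55555555 and FR = 0, storing f2 writes 0x55555555 and storing f3 writes 0xAAAAAAAA.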
                                 Floating-point
TRUNC.L.fmt                    Truncate To Long               TRUNC.L.fmt
                               Fixed-point Format
31 26 25 21 20 16 15 11 10 6 5 0
     Format:
            TRUNC.L.fmt fd, fs
     Description:
            The contents of floating-point register fs are converted into the 64-bit fixed-point
            format, and the result is stored to floating-point register fd. The source operand is
            processed in the floating-point format fmt.
            The result of the conversion is rounded toward the 0 direction, regardless of the
            current rounding mode.
            This instruction is valid only for conversion from the single- or double-precision
            floating-point format.
            If the FR bit of the Status register is 0, only an even number can be specified as a
            register number because adjacent even-numbered and odd-numbered registers are
            used in pairs as a floating-point register. If an odd number is specified, the
            operation is undefined. If the FR bit of the Status register is 1, both the odd and
            even register numbers are valid.
            If the source operand is infinite or NaN, or if the rounded result is outside the
            range of 2^63 – 1 to –2^63, the invalid operation exception occurs. If the invalid
            operation exception is not enabled, the exception does not occur, and 2^63 – 1 is
            returned.
            This operation is defined in the 64-bit mode and 32-bit Kernel mode. If this
            instruction is executed during 32-bit User/Supervisor mode, a reserved instruction
            exception occurs.
Operation:
         Exceptions:
                Coprocessor unusable exception
                Floating-point exception
                Reserved instruction exception (VR4300 in 32-bit User or Supervisor mode)
         Floating-Point Exceptions:
                Invalid operation exception
                Unimplemented operation exception
                Inexact operation exception
                Overflow exception
         Restrictions:
                An unimplemented operation exception will occur in the following cases.
                       •   If an overflow occurs during conversion to integer format
                       •   If the source operand is an infinite number
                       •   If the source operand is NaN
                Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                format to a fixed-point format is 1, an unimplemented operation exception will
                occur. This includes cases when there is an overflow during conversion.
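
The restriction above can be illustrated with a small Python model. This is a sketch under stated assumptions, not a description of the hardware: the name trunc_l_model is hypothetical, and the "bits 53 to 62" test is assumed here to apply to the magnitude of the truncated result, which the text does not state explicitly.

```python
import math

def trunc_l_model(value):
    """Return (result, status) for a rough model of the TRUNC.L restriction.

    status is "ok" or "unimplemented"; result is None when not "ok".
    """
    if math.isnan(value) or math.isinf(value):
        # Infinite or NaN source operand: unimplemented operation exception.
        return None, "unimplemented"
    result = math.trunc(value)  # round toward 0, ignoring the rounding mode
    # Assumption: test the magnitude. Any bit at position 53 or above being
    # set covers both "bits 53 to 62" and an overflow of the 64-bit range.
    if abs(result) >> 53:
        return None, "unimplemented"
    return result, "ok"
```

With this model, 1.9 converts to 1 and -2.5 converts to -2, while 2^53 and larger magnitudes are reported as unimplemented.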
                                  Floating-point
TRUNC.W.fmt                     Truncate To Single             TRUNC.W.fmt
                                Fixed-point Format
31 26 25 21 20 16 15 11 10 6 5 0
      Format:
             TRUNC.W.fmt fd, fs
      Description:
             The contents of floating-point register fs are arithmetically converted into a 32-bit
             fixed-point single format, and the result is stored to floating-point register fd. The
             source operand is processed in the floating-point format fmt.
             The result of the conversion is rounded toward the 0 direction, regardless of the
             current rounding mode.
             This instruction is valid only for conversion from the single- or double-precision
             floating-point format.
             If the FR bit of the Status register is 0, only an even number can be specified as a
             register number because adjacent even-numbered and odd-numbered registers are
             used in pairs as a floating-point register. If an odd number is specified, the
             operation is undefined. If the FR bit of the Status register is 1, both the odd and
             even register numbers are valid.
             If the source operand is infinite or NaN, or if the rounded result is outside the
             range of 2^31 – 1 to –2^31, the invalid operation exception occurs. If the invalid
             operation exception is not enabled, the exception does not occur, and 2^31 – 1 is
             returned.
Operation:
         Exceptions:
                  Coprocessor unusable exception
                  Floating-point exception
         Floating-Point Exceptions:
                  Invalid operation exception
                  Unimplemented operation exception
                  Inexact operation exception
                  Overflow exception
         Restrictions:
                  An unimplemented operation exception will occur in the following cases.
                         •   If an overflow occurs during conversion to integer format
                         •   If the source operand is an infinite number
                         •   If the source operand is NaN
                  Essentially, if any of bits 53 to 62 of the result of conversion from a floating-point
                  format to a fixed-point format is 1, an unimplemented operation exception will
                  occur. This includes cases when there is an overflow during conversion.
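
The truncation and default-result behavior given in the TRUNC.W Description can be sketched in Python. This models only the Description text with the invalid operation exception disabled; it does not model the unimplemented operation exception listed under Restrictions. The name trunc_w_model is hypothetical.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

def trunc_w_model(value):
    """Model the TRUNC.W result when the invalid operation exception is disabled."""
    if math.isnan(value) or math.isinf(value):
        # Infinite or NaN source operand: 2^31 - 1 is returned.
        return INT32_MAX
    result = math.trunc(value)  # round toward 0, regardless of rounding mode
    if result < INT32_MIN or result > INT32_MAX:
        # Result outside the 32-bit fixed-point range: 2^31 - 1 is returned.
        return INT32_MAX
    return result
```

Rounding toward 0 means that 2.7 becomes 2 and -2.7 becomes -2, unlike rounding to nearest, which would give 3 and -3.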
                                   Opcode
           28...26
 31...29   0      1      2      3      4      5      6      7
    0
    1
    2             COP1
    3
    4
    5
    6             LWC1                        LDC1
    7             SWC1                        SDC1

                                    sub
           23...21
 25...24   0      1      2      3      4      5      6      7
    0      MF     DMFη   CF     γ      MT     DMTη   CT     γ
    1      BC     γ      γ      γ      γ      γ      γ      γ
    2      S      D      γ      γ      W      Lη     γ      γ
    3      γ      γ      γ      γ      γ      γ      γ      γ

                                    br
           18...16
 20...19   0      1      2      3      4      5      6      7
    0      BCF    BCT    BCFL   BCTL   *      *      *      *
    1      *      *      *      *      *      *      *      *
    2      *      *      *      *      *      *      *      *
    3      *      *      *      *      *      *      *      *

                                  function
           2...0
  5...3    0         1         2         3         4         5         6         7
    0      ADD       SUB       MUL       DIV       SQRT      ABS       MOV       NEG
    1      ROUND.Lη  TRUNC.Lη  CEIL.Lη   FLOOR.Lη  ROUND.W   TRUNC.W   CEIL.W    FLOOR.W
    2      γ         γ         γ         γ         γ         γ         γ         γ
    3      γ         γ         γ         γ         γ         γ         γ         γ
    4      CVT.S     CVT.D     γ         γ         CVT.W     CVT.Lη    γ         γ
    5      γ         γ         γ         γ         γ         γ         γ         γ
    6      C.F       C.UN      C.EQ      C.UEQ     C.OLT     C.ULT     C.OLE     C.ULE
    7      C.SF      C.NGLE    C.SEQ     C.NGL     C.LT      C.NGE     C.LE      C.NGT

                     Key:
                     *            When the operation code marked with an asterisk is executed, the
                                  reserved instruction exception occurs. This code is reserved for
                                  future expansion.
                     η            When the operation code marked with an eta is executed, the result
                                  is valid only when use of the MIPS III instruction set is enabled.
                                  If the operation code is executed when use of the instruction set is
                                  disabled (in the 32-bit User/Supervisor mode), the unimplemented
                                  operation exception occurs.
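
The opcode, sub, and function tables above can be read as a lookup on three bit fields of the instruction word. The following Python sketch illustrates that lookup for the arithmetic and conversion instructions only. The names FUNCTION_TABLE, FMT_TABLE, and decode_cop1 are hypothetical, and reserved or non-format entries are simply returned as None rather than modeled as exceptions.

```python
# Rows are bits 5...3 of the function field, columns are bits 2...0.
FUNCTION_TABLE = [
    ["ADD", "SUB", "MUL", "DIV", "SQRT", "ABS", "MOV", "NEG"],
    ["ROUND.L", "TRUNC.L", "CEIL.L", "FLOOR.L",
     "ROUND.W", "TRUNC.W", "CEIL.W", "FLOOR.W"],
    [None] * 8,
    [None] * 8,
    ["CVT.S", "CVT.D", None, None, "CVT.W", "CVT.L", None, None],
    [None] * 8,
    ["C.F", "C.UN", "C.EQ", "C.UEQ", "C.OLT", "C.ULT", "C.OLE", "C.ULE"],
    ["C.SF", "C.NGLE", "C.SEQ", "C.NGL", "C.LT", "C.NGE", "C.LE", "C.NGT"],
]

# sub field (bits 25...21): row 2 of the sub table holds the formats.
FMT_TABLE = {0b10000: "S", 0b10001: "D", 0b10100: "W", 0b10101: "L"}

COP1_OPCODE = 0b010001  # row 2, column 1 of the Opcode table

def decode_cop1(word):
    """Return a mnemonic such as 'SQRT.D', or None if not decodable here."""
    if (word >> 26) & 0x3F != COP1_OPCODE:
        return None
    fmt = FMT_TABLE.get((word >> 21) & 0x1F)
    if fmt is None:
        return None  # MF/MT/CF/CT/BC or reserved sub code: not handled
    function = word & 0x3F
    name = FUNCTION_TABLE[function >> 3][function & 0x7]
    if name is None:
        return None  # reserved function code
    return name + "." + fmt
```

For example, a word with opcode COP1, sub = D, and function bits 000100 decodes to SQRT.D under this sketch.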
18
                  Connect several passive elements externally to the VR4300 so that the processor
                  can operate normally. Connect the elements to the PLLCap0, PLLCap1, VDDP,
                  and GNDP pins.
                      Figure 18-1 shows the connections of the passive elements for PLL.
                  [Circuit diagram showing the passive elements R, L, Cp, C1, C2, and C3
                  connected between VDD, GND, and the VDDP, PLLCap1, GNDP, and PLLCap0
                  pins of the VR4300]
                  Remarks 1.        C1, C2, C3, Cp%1, Cp%2, R, and L are mounted on the board.
                               2.   Either R or L may do in a system where it has been confirmed
                                    through experiment that noise is not superimposed on VDDP and
                                    GNDP.
                               3.   The value of each element differs depending on the system. Find
                                    the appropriate values for each system through experiment.
                  Figure 18-1       Connection Example of PLL Passive Elements
Figure 18-2 shows a layout example of 120-pin plastic QFP and capacitor on
PWB.
[Layout diagram: µPD30200GD and capacitor C2 on the PWB]
Separate the wiring of the power (VDDP) and ground (GNDP) for PLL from the
normal power (VDD) and ground (GND) wiring. Here is an example of the value
of each element.
R = 5 Ω    C1 = 1 nF    C2 = 82 nF
C3 = 10 µF    Cp = 470 pF
The optimum values of the filter elements differ depending on the application
and noise environment of the system, so the above values are given for
reference only. Find the optimum values for the user's application through trial and
error. A choke element (inductor: L) may be used instead of the resistor (R) used
as a power filter.
19
The status in which CP0 hazard must be taken into consideration when each
instruction is executed is explained below.
(1) MTC0
Destination: Completion of writing to destination register (CP0) by MTC0
instruction
(2) MFC0
Source: Determination of source register (CP0) of MFC0 instruction
(3) TLBR
Source: Determination of TLB status and Index register before execution of TLBR
instruction
(4) TLBWI, TLBWR
Source: Determination of source register of TLBWI and TLBWR instructions and
register used for TLB entry specification
Destination: Completion of writing to TLB by TLBWI and TLBWR instructions
(5) TLBP
Source: Determination of PageMask register and EntryHi register before
execution of TLBP instruction
Destination: Completion of writing result of TLBP instruction execution to Index
register
(6) ERET
Source: Determination of register holding information necessary for ERET
instruction execution
Destination: Completion of processor status transition due to ERET instruction
execution
(7) CACHE Index Load Tag
Destination: Completion of writing the execution result of this instruction to each register
(8) CACHE Index Store Tag
Source: Determination of register holding information necessary for execution of
this instruction
(9) Coprocessor use test
Source: Determination of mode set by bit value of CP0 register in Source column
             Examples 1. When accessing the CP0 register in the user mode after changing
                         the content of the Status.CU0 bit or when executing an instruction
                         using the resources of CP0 (such as TLB instruction, Cache
                         instruction, or branch instruction)
                        2. When accessing the CP0 register in the operating mode used after
                           the contents of the Status.KSU, EXL, and ERL bits have been
                           changed
                        3. When using the FPU (CP1) after the content of the Status.CU1 bit
                           has been changed
             (10) Instruction fetch
             Source: Determination of operating mode and TLB necessary for instruction fetch
             Examples 1. When fetching instructions after the mode has been changed from
                         User to Kernel after the contents of the Status.KSU, EXL, and
                         ERL bits have been changed
                        2. When rewriting TLB and fetching an instruction by using its TLB
                           entry
             (11) Instruction fetch exception
             Destination: Completion of writing to each register holding information related to
             an exception when the exception has occurred as a result of instruction fetch
             (12) Interrupt
             Source: Determination of each register that identifies an exception generation
             condition when an interrupt cause occurs
             (13) Load/store
             Source: Determination of operating mode related to address generation by load/
             store instruction, determination of TLB entry, determination of cache mode set by
             the Config.K0 bit, and determination of a register that sets a watch exception
             generation condition
               Example     When executing the load/store instruction in the kernel area after
                           the mode has been changed from User to Kernel
             (14) Load/store exception
             Destination: Completion of writing to each register holding information related to
             an exception when the exception occurs as a result of a load/store operation
             (15) TLB shut down
             Destination: Completion of writing to Status.TS bit when TLB shut down occurs
               Table 19-2 shows examples of calculating the number of CP0 hazards and the
               number of instructions to be inserted.
Table 19-2 Example of Calculating Number of CP0 Hazards and Number of Instructions Inserted

  Destination         Source                                            Conflicting    Number of      Expression
                                                                        Internal       Instructions
                                                                        Resources      Inserted
  TLBWR/TLBWI         TLBP                                              TLB Entry      3              8–(4+1)
  TLBWR/TLBWI         Load/store using newly rewritten TLB              TLB Entry      3              8–(4+1)
  TLBWR/TLBWI         Instruction fetch using newly rewritten TLB       TLB Entry      5              8–(2+1)
  MTC0, Status [CU]   Coprocessor instruction requiring setting of CU   Status [CU]    4              7–(2+1)
  TLBWR               MFC0 EntryHi                                      EntryHi        3              8–(4+1)
  MTC0 EntryLo0       TLBWR/TLBWI                                       EntryLo0       1              7–(5+1)
  TLBP                MFC0 Index                                        Index          2              7–(4+1)
  MTC0 EntryHi        TLBP                                              EntryHi        1              7–(5+1)
  MTC0 EPC            ERET                                              EPC            2              7–(4+1)
  MTC0 Status         ERET                                              Status         3              7–(3+1)
  MTC0 Status [IE]*   Instruction causing interrupt                     Status [IE]    3              7–(3+1)
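
Each entry in the Expression column of Table 19-2 has the form (first value) – (second value + 1). The Python sketch below reproduces that arithmetic. It assumes that the first value is the hazard number associated with the Destination instruction and the second is the one associated with the Source instruction; the function name instructions_to_insert and the clamp to zero are illustrative additions, not taken from the table.

```python
def instructions_to_insert(destination_value, source_value):
    """Number of instructions to insert between Destination and Source.

    Follows the pattern of the Expression column:
    destination_value - (source_value + 1).
    Clamping negative results to 0 is an assumption of this sketch.
    """
    return max(0, destination_value - (source_value + 1))

# (Destination, Source, destination value, source value) from Table 19-2.
TABLE_19_2 = [
    ("TLBWR/TLBWI", "TLBP", 8, 4),
    ("TLBWR/TLBWI", "Instruction fetch using newly rewritten TLB", 8, 2),
    ("MTC0, Status [CU]", "Coprocessor instruction requiring CU", 7, 2),
    ("MTC0 EntryLo0", "TLBWR/TLBWI", 7, 5),
    ("TLBP", "MFC0 Index", 7, 4),
    ("MTC0 Status", "ERET", 7, 3),
]

for destination, source, d_value, s_value in TABLE_19_2:
    count = instructions_to_insert(d_value, s_value)
    print(f"{destination} -> {source}: insert {count}")
```

The computed counts for these rows are 3, 5, 4, 1, 2, and 3, matching the Number of Instructions Inserted column.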
         The following table describes the differences between the VR4300, VR4305, and
         VR4310.
                   *1. The 1.5 times frequency setting is allowed with the 100 MHz model only.
                       (With the 133 MHz model, this setting is reserved.)
                   *2. The 4 times frequency setting is allowed with the 133 MHz model only.
                       (With the 100 MHz model, this setting is reserved.)
                   *3. The 2.5 times frequency setting is allowed with the 167 MHz model only.
                       (With the 133 MHz model, this setting is reserved.)
                   *4. The 133 MHz model of the VR4300 is not supported.
         The VR4300 is slightly different from the VR4400 in terms of system design and
         software. This Appendix describes the differences between the VR4300 and
         VR4400.
         The major differences lie in cache handling. This is because the VR4300 does not
         support a secondary cache control function and a multi-processing function and
         because it employs a 32-bit external bus interface.
             The CH bit of the VR4300 can be written only by software. With the VR4400,
             however, this bit is set or cleared by hardware when a secondary cache instruction
             is executed.
             The CE and DE bits of the Status register of the VR4300 are used to manipulate
             the parity and do not affect the operation.
             For details, refer to 6.3.5 Status Register (12).
 Function                                  VR4300                                 VR4400
 CACHE instruction   Secondary cache       Not supported                          Supported
                     Parity                None                                   Provided
 Status register     Bit 27                Low power mode*                        0
                     Bit 24                Instruction trace support              0
                     CE and DE bits        Do not affect processor operation      Used for parity
 Config register                           Only part of bit functions supported   All supported
 Unimplemented operation exception         Cause bits other than E bit cleared    Cause bits other than E bit undefined
 Integer zero division                     Value returned to register differs
 Cache error exception                     Does not occur                         Always normal operation
         Effect of RP Bit
                With the VR4400, SClock and TClock are not affected by the RP bit. The
                VR4300, in contrast, can reduce the clock frequencies of SClock and TClock to
                1/4 of the normal level by using the RP bit*.
                If an external circuit (such as a DRAM refresh counter) is affected by changes
                in the frequency of the clock that the VR4300 supplies to external devices, the
                software must include processing that adjusts that circuit to the changed
                frequency when this function is used.
                * 100 MHz model of the VR4300 and the VR4305 only
                     Product Name
                                                 VR4300                       VR4400
 Function
 Initialization of processor           Set by external pins         Set by software
 System       Bus width                32                           64
 interface    Data check               Not performed                Parity/ECC selectable
              Multi-processing and     Not supported                Supported
              secondary cache
              Line size of cache       Instruction: 8 words         4/8 words selectable for
                                       Data: 4 words                both instruction/data cache
              Data rate                2 types                      9 types
              TClock                   1                            2
              RClock                   None                         2
              Effect of RP bit         Reduces frequencies of       Does not affect TClock and
                                       TClock and SClock to 1/4*    SClock
              Product Name
                                            VR4300                    VR4000         VR4400
 Item
 Cache        Instruction      16 KB                                8 KB           16 KB
 capacity     Data             8 KB                                 8 KB           16 KB
 Line size                     Instruction: 8 words (32 bytes)      4/8 words selectable
                               Data: 4 words (16 bytes)
 Method                        Direct map, virtual index
B.3.2 TLB
        TLB Entry
              The VR4300 has a fully associative TLB with 32 entries. Each entry is mapped to the
              even/odd page of a page frame number.
              The TLB of the VR4400 has the same structure as that of the VR4300, but has 48
              entries.
B.3.4 Pipeline
              The VR4400 uses an 8-stage super pipeline.
              The VR4300 uses a 5-stage pipeline like that of the VR3000. The pipeline of the
              VR4300 is not a super pipeline, but it does not differ from the super pipeline in
              terms of functions. However, if a program has been optimized for a particular
              pipeline, its performance may be affected.
              The VR4300 generates fewer stall cycles than the VR4400.
B.3.5 Interrupt
             Bit 15 of the Cause register of the VR4300 is dedicated to the timer interrupt,
             which occurs when the value of the Count register matches the value of the
             Compare register. Therefore, the VR4300 does not have the Int5 pin that is
             provided on the VR4400.
             Because the VR4300 does not have bit 5 in the interrupt register*, writing to that
             bit of the interrupt register via the system interface has no effect.
             With the VR4400, the user can select whether bit 15 of the Cause register is used
             for the timer interrupt or for bit 5 of the interrupt register.
             * This register cannot be directly written by the user via software.
B.3.7 JTAG
             The VR4300 conforms to IEEE 1149.1-1990. Consequently, the JTDO signal
             becomes active in the shift IR and shift DR modes.
             Because the VR4400 conforms to the previous version of IEEE 1149.1, the
             JTDO signal is not driven.
                          Product Name
                                                      VR4300                   VR4000        VR4400
 Item
 Instruction cache size                     16 KB                          8 KB             16 KB
 Data cache size                            8 KB                           8 KB             16 KB
 TLB               TLB size                 32 entries                     48 entries
                   Interaction between      TLB operation is corrected     TLB invalid exception
                   IMT and TLB                                             occurs
                   manipulations
 Floating-point    Data path                Shared with integer            Processed by dedicated
 operation                                  operation pipeline             pipeline
                   Instruction              All multi-cycle                Each multi-cycle
                   execution time           instructions are executed      instruction is executed in
                                            in 1 cycle when source         the same number of cycles
                                            exception occurs.              regardless of whether
                                                                           exception occurs.
                   CVT.[S, D].L             All bits 63 to 55 are 1 or 0   All bits 63 to 52 are 1 or 0
                   instruction
                   (checking of
                   floating-point
                   unimplemented
                   operation exception)
                   Effect of RP bit         Reduces operating              Does not affect operating
                                            frequency to 1/4*              frequency
 Pipeline                                   5 stages                       8 stages
                                            Basic pipeline                 Super pipeline
 Interrupt         Cause register           Dedicated to timer             Selectable by user
                   (bit 15)                 interrupt
                   Interrupt register       None
                   (bit 5)
 Kernel physical           Physical         32 bits                        36 bits
 address segment           address space
 configuration             supported
 (xkphys)                  Valid address    8                              5
                           space
 JTAG                                       JTDO active in shift IR        JTDO not driven in shift
                                            and shift DR modes             IR and shift DR modes
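
The operand condition listed in the table for the long-to-floating-point conversion instructions can be illustrated with a short C sketch. This is only one reading of the table entry: it assumes the condition means that the named upper bits of the 64-bit source are either all 0 or all 1, and that an operand failing the check is the one reported through the unimplemented operation exception. The function name upper_bits_uniform is hypothetical and does not appear in this manual.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when bits 63..low_bit of v are all 0 or all 1.
 * low_bit = 55 models the VR4300 column of the table,
 * low_bit = 52 models the VR4400 column (assumed interpretation).
 * low_bit must be in the range 1 to 63.
 */
static bool upper_bits_uniform(int64_t v, unsigned low_bit)
{
    uint64_t top      = (uint64_t)v >> low_bit;   /* bits 63..low_bit, right-aligned */
    uint64_t all_ones = UINT64_MAX >> low_bit;    /* same field with every bit set   */

    return top == 0 || top == all_ones;
}
```

For example, the value 2^53 passes the check with low_bit = 55 but fails it with low_bit = 52, which is the difference between the two columns.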
         The VR4300 differs slightly from the VR4200 in system design and software.
         This appendix describes the differences between the VR4300 and the
         VR4200.
         The major differences are that the VR4300 employs a new 32-bit system interface
         and omits the parity-based data check function.
Function                                  VR4300                      VR4200
Cache parity                              Not supported               Supported
Status register                           CE and DE bits do not       Used to manipulate parity
                                          function
Config register   BE bit and EP area      Set default values          Set information on
                                                                      external pins
                  Bits 18 and 19          01                          00
C.2.2 Clock
              The VR4300 does not output the MasterOut and RClock signals.
              The frequency of the pipeline clock (PClock) of the VR4400 and VR4200 is
              usually twice that of MasterClock. The VR4300 can change the
              frequency ratio by using the value of the DivMode(1:0)*1 pins. (Refer to Table 2-2
              Clock/Control Interface Signals.) The frequency ratio PClock:MasterClock
              can be selected from 2:1, 3:1, 4:1, or 3:2*2. The VR4200 usually generates SClock
              and TClock by dividing PClock by 2. The TClock of the VR4300 is usually at
              the same frequency as MasterClock.
              In the low power mode*3, the speeds of PClock, SClock, and TClock of the
              VR4300 can be reduced to 1/4 of the normal level, as in the VR4200.
              *1. In VR4300 and VR4305. In VR4310, DivMode(2:0).
              *2. In VR4300. In VR4305, the frequency ratio can be set to 1:1, 2:1, or 3:1. In
                  VR4310, it can be set to 2:1, 3:1, 4:1, 5:1, 6:1, or 5:2.
              *3. 100 MHz model of the VR4300 and the VR4305 only
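
The ratio selection described above can be sketched in C as follows. This is a minimal illustration, not a driver: the enum, the struct, and the function names are hypothetical, the ratio lists are copied from the text and note *2, and the DivMode pin encodings are deliberately left out because they are defined in Table 2-2, not here. The sketch also does not check note *3 (low power mode is limited to certain models); the caller is assumed to do so.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model tags; these identifiers are not from the manual. */
enum vr43xx_model { MODEL_VR4300, MODEL_VR4305, MODEL_VR4310 };

struct clk_ratio { unsigned num, den; };   /* PClock:MasterClock = num:den */

/* True when num:den is one of the ratios listed for the given model. */
static bool ratio_listed(enum vr43xx_model model, unsigned num, unsigned den)
{
    static const struct clk_ratio r4300[] = { {2, 1}, {3, 1}, {4, 1}, {3, 2} };
    static const struct clk_ratio r4305[] = { {1, 1}, {2, 1}, {3, 1} };
    static const struct clk_ratio r4310[] = { {2, 1}, {3, 1}, {4, 1},
                                              {5, 1}, {6, 1}, {5, 2} };
    const struct clk_ratio *tab;
    size_t n;

    switch (model) {
    case MODEL_VR4300: tab = r4300; n = sizeof r4300 / sizeof r4300[0]; break;
    case MODEL_VR4305: tab = r4305; n = sizeof r4305 / sizeof r4305[0]; break;
    case MODEL_VR4310: tab = r4310; n = sizeof r4310 / sizeof r4310[0]; break;
    default:           return false;
    }
    for (size_t i = 0; i < n; i++)
        if (tab[i].num == num && tab[i].den == den)
            return true;
    return false;
}

/*
 * PClock frequency in Hz for a given MasterClock and ratio.
 * Returns 0 when the ratio is not listed for the model.
 * low_power applies the 1/4 reduction described for low power mode.
 */
static uint64_t pclock_hz(enum vr43xx_model model, uint64_t master_hz,
                          unsigned num, unsigned den, bool low_power)
{
    if (!ratio_listed(model, num, den))
        return 0;

    uint64_t hz = master_hz * num / den;
    return low_power ? hz / 4 : hz;
}
```

With a 50 MHz MasterClock and the 3:2 ratio, the sketch yields a 75 MHz PClock for the VR4300; the same ratio is rejected for the VR4305 because it is not in that model's list.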
C.2.3 Package
             The VR4200 employs a 208-pin plastic QFP. The VR4300 is housed in a 120-pin
             plastic QFP.
Function                        VR4300                         VR4200
System          SysAD bus       No parity, 32 bits             With parity, 64 bits
interface       Instruction     Word data, 8 times             Doubleword data, 4 times
                block write
                Data block      Word data, 4 times             Doubleword data, 2 times
                write
                Data pattern    Set by Config register         Set by external pins
                                (D, Dxx)                       (DDx, Dxx)
Clock           MasterOut,      Not output                     Output
                RClock
                PClock          Frequency ratio to             Twice the normal
                                MasterClock variable           MasterClock frequency
                TClock          Same frequency as normal       PClock divided by two
                                MasterClock
Package                         120-pin plastic QFP            208-pin plastic QFP
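
The transfer counts in the system interface rows follow from dividing the block size by the SysAD bus width. The C sketch below shows that arithmetic. The block sizes of 32 bytes (instruction) and 16 bytes (data) are an assumption inferred from the table itself (8 words and 4 words of 4 bytes each), and the function name block_transfers is hypothetical.

```c
/*
 * Number of bus transfers needed to move one block of line_bytes
 * over a data bus that is bus_bits wide (rounded up).
 */
static unsigned block_transfers(unsigned line_bytes, unsigned bus_bits)
{
    unsigned bus_bytes = bus_bits / 8;

    return (line_bytes + bus_bytes - 1) / bus_bytes;
}
```

Under these assumptions, block_transfers(32, 32) and block_transfers(16, 32) give the 8 and 4 word transfers of the VR4300 column, and the same block sizes on a 64-bit bus give the 4 and 2 doubleword transfers of the VR4200 column.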
C.3.3 Reset
                    With the VR4200, the ColdReset and Reset signals are asserted simultaneously.
                    With the VR4300, these signals need not be asserted at the same time. The
                    Reset signal of the VR4300 may be active or inactive during cold reset. However,
                    do not change the value of this signal during the reset sequence. The ColdReset
                    signal of the VR4300 need not be synchronized with the MasterClock signal.
 Function                            VR4300                            VR4200
 Physical address                    32 bits                           33 bits
 Write buffer                        4-entry                           2-entry
                                     Word buffer                       Doubleword buffer
 ColdReset signal and                Need not be synchronized          Must be synchronized
 MasterClock
 Status (3:0) pins                   Not provided                      Provided
C
Cache error register ... 178
CACHE instruction ... 112, 305
Cache line ... 275, 283
Cache line replacement ... 280, 282
Cache memory ... 273
Cache operation ... 279
Cache state transition ... 283
Cache states ... 283
Cause register ... 171
Clock generator ... 35
Clock interface ... 257
Clock-to-Q delay ... 258
D
Data cache ... 36, 277, 283
Data cache addressing ... 278
Data cache busy ... 111
Data cache miss ... 111
Data cache read request ... 290
Data cycle ... 292
Data format ... 41
Data identifier ... 333, 337
Data load miss ... 281
Data store miss ... 281
DCB ... 111
DCM ... 111
J
Joint TLB ... 48
JTAG ... 341
JTLB ... 48
Jump instruction ... 77, 369
K
Kernel address space ... 169
Kernel extended addressing mode ... 255
Kernel mode ... 133
L
LDI ... 110
LLAddr register ... 154
Load delay ... 95
Load delay slot ... 61
Load instruction ... 61, 367, 553
Load interlock ... 110
Load miss ... 304
Low power mode ... 254, 264, 360
O
Opcode bit encoding ... 544, 613
Operating mode ... 49, 127, 169
Operation during no branch ... 78
Overflow exception ... 242
P
PageMask register ... 148, 149
Parity error register ... 178
PClock ... 259
Phase-locked loop (PLL) ... 263
Phase-locked system ... 265, 266
Physical address ... 123, 289
Pin configuration (Top View) ... 52
Pin function ... 51, 54
Pipeline ... 36, 89
Pipeline exception ... 114
PLL ... 263
PLL passive element ... 615
Power-ON reset ... 248, 249
Privilege mode ... 255
Processor read request ... 301, 306
Processor request ... 293, 298, 306
Processor revision identifier register ... 151
Processor write request ... 301, 309
Power mode ... 254
Power off mode ... 255, 361
Precision of exception ... 161
Priority (exception) ... 182
Priority (exception and interlock) ... 116
PRId register ... 151
R
Random register ... 147
Read command ... 327
Read request ... 334
Read response ... 303, 313, 317, 330
Re-executing command ... 325
Release latency time ... 332
Request control ... 300, 302
Request issuance ... 300, 302
Reserved instruction exception ... 194
Reverse endianness ... 256
S
Saving and returning ... 244
SClock ... 260, 263
Sequential ordering ... 339
Slave state ... 298
Soft reset ... 248, 251
Soft reset exception ... 184
Software interrupt ... 354
Special instruction ... 81
Subblock ordering ... 339
Supervisor address space ... 169
Supervisor extended addressing mode ... 255
Supervisor mode ... 129
Status register ... 165
Status on reset ... 170
Store delay slot ... 61
Store instruction ... 61, 367, 553
Store miss ... 304
Successive processing of request ... 321
SyncIn/SyncOut ... 259
System call exception ... 191
System control coprocessor (CP0) ... 44, 142
System control coprocessor (CP0) instruction ... 86, 370
System event ... 299
System interface ... 35, 289, 296
System interface address ... 339
System interface cycle time ... 332
System timing parameter ... 263
T
TagHi register ... 154
TagLo register ... 154
TAP ... 347
TAP controller ... 348
TClock ... 260
Test access port ... 347
Timer interrupt ... 354
TLB ... 48, 122
TLB entry ... 143
TLB exception ... 187
TLB invalid exception ... 188
TLB instruction ... 158
TLB miss ... 158
TLB miss exception ... 187
TLB modification exception ... 189
Translation lookaside buffer ... 48, 122
Transmission time ... 268
Trap exception ... 195
U
Uncached area ... 305
Uncompelled change to slave state ... 298
Underflow exception ... 242
Unimplemented operation exception ... 243
User address space ... 169
User extended addressing mode ... 255
User mode ... 127
V
Virtual address ... 124
Virtual address translation ... 155
W
Watch exception ... 198
WatchHi register ... 175
WatchLo register ... 175
Wired register ... 150
Write buffer ... 120
Write command ... 325
Write request ... 330, 336
X
XContext register ... 176