Module 3-1

The document covers the ARM instruction set, detailing various types of instructions including data processing, branch, load-store, and software interrupt instructions. It provides syntax and examples for move, arithmetic, logical, comparison, and multiply instructions, as well as addressing modes for load and store operations. Additionally, it explains the structure and function of branch instructions for controlling program flow.

Uploaded by

vaishnavireddy1120

Module 3

ARM Instruction Set

ARM Instruction Sets
 Data Processing Instructions
 Branch Instructions
 Load-store instructions
 Software interrupt instructions
 Program status register instructions
 Conditional Execution
Data Processing Instructions
 Manipulate data within registers
 Data processing instructions
◼ Move instructions
◼ Arithmetic instructions
◼ Logical instructions
◼ Comparison instructions
◼ Multiply instructions
Data processing
❑They are move, arithmetic, logical, comparison and
multiply instructions.
❑Most data processing instructions can process one of
their operands using the barrel shifter.

• General rules:
– All operands are 32-bit, coming
from registers or literals.
– The result, if any, is 32-bit and
placed in a register (with the
exception for long multiply which
produces a 64-bit result)
– 3-address format
Data Processing Instruction

ARM Instruction Set Format
Move Instruction
 Syntax: <instruction>{<cond>}{S} Rd, N
◼ N: a register or immediate value

 MOV : move
◼ MOV r0, r1; r0 = r1
◼ MOV r0, #5; r0 = 5
 MVN : move (negated)
◼ MVN r0, r1; r0 = NOT(r1)=~ (r1)
Preprocessed by Shifter
 Example 1
◼ PRE: r5 = 5, r7 = 8;

◼ MOV r7, r5, LSL #2; r7 = r5 << 2 = r5*4

◼ POST: r5 = 5, r7 = 20
Preprocessed by Shifter
 LSL: logical shift left
◼ x << y, the least significant bits are filled with zeroes
 LSR: logical shift right:
◼ (unsigned) x >> y, the most significant bits are filled with zeroes
 ASR: arithmetic shift right
◼ (signed) x >> y, copy the sign bit to the most significant bit
 ROR: rotate right
◼ ((unsigned) x >> y) | (x << (32-y))
 RRX: rotate right extended
◼ (C flag << 31) | ((unsigned) x >> 1)
◼ Performs 33-bit rotate, with the CPSR’s C bit being inserted above
sign bit of the word
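The five shifter operations above can be modelled on 32-bit values as follows. This is a minimal Python sketch rather than ARM code; the function names and the `MASK` constant are illustrative assumptions, and shift amounts are assumed to lie in the range 0 to 31.

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits


def lsl(x, y):
    """Logical shift left: vacated low bits are filled with zeroes."""
    return (x << y) & MASK


def lsr(x, y):
    """Logical shift right: vacated high bits are filled with zeroes."""
    return (x & MASK) >> y


def asr(x, y):
    """Arithmetic shift right: vacated high bits copy the sign bit."""
    x &= MASK
    if x & 0x80000000:
        return ((x >> y) | (MASK << (32 - y))) & MASK
    return x >> y


def ror(x, y):
    """Rotate right: bits leaving the bottom re-enter at the top."""
    x &= MASK
    y %= 32
    return ((x >> y) | (x << (32 - y))) & MASK


def rrx(x, c):
    """Rotate right extended: the old C flag enters at bit 31."""
    return ((c & 1) << 31) | ((x & MASK) >> 1)
```

With these helpers, `lsl(5, 2)` reproduces Example 1 (r7 = 20).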
Shift Register Operands
– ADD r1, r2, r3, LSL #3 ; r1 := r2 + (r3 << 3)
– A single instruction executed in a single cycle
❑ LSL: Logical Shift Left by 0 to 31 places, 0 filled at the lsb end
❑ LSR, ASL (Arithmetic Shift Left), ASR, ROR (Rotate Right), RRX (Rotate Right extended by 1 bit)
– ADD r5, r5, r3, LSL r2 ; r5 := r5 + r3 * 2^r2
– MOV r12, r4, ROR r3 ; r12 := r4 rotated right by value of r3
[Figure: shifter operations LSL #5, LSR #5, ASR #5 (positive operand), ASR #5 (negative operand), ROR #5, RRX]
Preprocessed by Shifter (Cont.)
 Example 2
◼ PRE: r0 = 0x00000000, r1 = 0x80000004

◼ MOV r0, r1, LSL #1 ; r0 = r1 << 1 (bit 31 of r1 is shifted out)

◼ POST: r0 = 0x00000008, r1 = 0x80000004

Arithmetic Instructions
 Syntax: <instruction>{<cond>}{S} Rd, Rn, N
◼ N: a register or immediate value
 ADD : add
◼ ADD r0, r1, r2; r0 = r1 + r2
 ADC : add with carry
◼ ADC r0, r1, r2; r0 = r1 + r2 + C
 SUB : subtract
◼ SUB r0, r1, r2; r0 = r1 - r2
 SBC : subtract with carry
◼ SBC r0, r1, r2; r0 = r1 - r2 + C - 1
Arithmetic Instructions (Cont.)
 RSB : reverse subtract
◼ RSB r0, r1, r2; r0 = r2 – r1
 RSC : reverse subtract with carry
◼ RSC r0, r1, r2; r0 = r2 – r1 + C -1
 MUL : multiply
◼ MUL r0, r1, r2; r0 = r1 x r2
 MLA : multiply and accumulate
◼ MLA r0, r1, r2, r3; r0 = r1 x r2 + r3
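The carry-based forms are easiest to follow with a worked model. The Python sketch below follows the formulas listed above (C is the carry flag, 0 or 1) and shows how a flag-setting add followed by ADC chains into a 64-bit addition. The function names are illustrative assumptions, not assembler mnemonics.

```python
MASK = 0xFFFFFFFF  # 32-bit register width


def adds(rn, op2):
    """ADD with flag update: returns (result, carry_out)."""
    total = (rn & MASK) + (op2 & MASK)
    return total & MASK, total >> 32


def adc(rn, op2, c):
    """ADC: rd = rn + op2 + C."""
    return (rn + op2 + c) & MASK


def sbc(rn, op2, c):
    """SBC: rd = rn - op2 + C - 1."""
    return (rn - op2 + c - 1) & MASK


def rsb(rn, op2):
    """RSB: rd = op2 - rn."""
    return (op2 - rn) & MASK


def rsc(rn, op2, c):
    """RSC: rd = op2 - rn + C - 1."""
    return (op2 - rn + c - 1) & MASK


def add64(a_hi, a_lo, b_hi, b_lo):
    """64-bit add built from a flag-setting ADD on the low words and ADC on the high words."""
    lo, carry = adds(a_lo, b_lo)
    hi = adc(a_hi, b_hi, carry)
    return hi, lo
```

For instance, `add64(1, 0xFFFFFFFF, 0, 1)` carries out of the low word and gives `(2, 0)`.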
Logical Operations
 Syntax: <instruction>{<cond>}{S} Rd, Rn, N
◼ N: a register or immediate value
 AND : Bit-wise and
 ORR : Bit-wise or
 EOR : Bit-wise exclusive-or
 BIC : bit clear
◼ BIC r0, r1, r2; r0 = r1 & Not(r2)
Logical Operations (Cont)
 Example 3:
◼ PRE: r1 = 0b1111, r2 = 0b0101

◼ BIC r0, r1, r2 ; r0 = r1 AND (NOT(r2))

◼ POST: r0=0b1010
Comparison Instructions
 Compare or test a register with a 32-bit value
◼ Do not modify the registers being compared or
tested

◼ But only set the values of the NZCV bits of the

CPSR register
 The S suffix is not needed for a comparison
instruction to update the flags in the CPSR register
Comparison Instructions (Cont.)
 Syntax: <instruction>{<cond>} Rn, N
◼ N: a register or immediate value
 CMP : compare
◼ CMP r0, r1; compute (r0 - r1) and set NZCV
 CMN : negated compare
◼ CMN r0, r1; compute (r0 + r1) and set NZCV
 TST : bit-wise AND test
◼ TST r0, r1; compute (r0 AND r1) and set NZCV
 TEQ : bit-wise exclusive-or test
◼ TEQ r0, r1; compute (r0 EOR r1) and set NZCV
Comparison Instructions (Cont.)
 Example 4
◼ PRE: CPSR = nzcvqiFt_USER, r0 = 4, r9 = 4

◼ CMP r0, r9

◼ POST: CPSR = nZCvqiFt_USER (Z is set because r0 - r9 = 0; C is set because the subtraction produces no borrow)
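A small model makes the flag computation for CMP concrete. The Python sketch below assumes the usual ARM convention that, for a subtraction, C is the inverse of borrow (C = 1 when no borrow occurs) and V marks signed overflow; `cmp_flags` is a hypothetical helper name.

```python
MASK = 0xFFFFFFFF  # 32-bit register width


def cmp_flags(rn, op2):
    """Return (N, Z, C, V) as set by CMP rn, op2, i.e. by computing rn - op2."""
    rn &= MASK
    op2 &= MASK
    result = (rn - op2) & MASK
    n = result >> 31                           # sign bit of the result
    z = int(result == 0)                       # result is zero
    c = int(rn >= op2)                         # no borrow in the unsigned subtraction
    v = ((rn ^ op2) & (rn ^ result)) >> 31     # signed overflow
    return n, z, c, v
```

Comparing equal values, as in Example 4, gives `cmp_flags(4, 4) == (0, 1, 1, 0)`: Z and C are both set under this convention.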

Multiply Instruction
 Syntax:
◼ MLA{<cond>} {S} Rd, Rm, Rs, Rn
◼ MUL{<cond>} {S} Rd, Rm, Rs
 MUL : multiply
◼ MUL r0, r1, r2; r0 = r1*r2
 MLA : multiply and accumulate
◼ MLA r0, r1, r2, r3; r0 = (r1*r2) + r3
Multiply Instruction (Cont.)
 Syntax: <instruction>{<cond>}{S} RdLo, RdHi, Rm, Rs
◼ Multiply onto a pair of registers representing a 64-bit value
 UMULL : unsigned multiply long
◼ UMULL r0, r1, r2, r3; [r1,r0] = r2*r3
 UMLAL : unsigned multiply accumulate long
◼ UMLAL r0, r1, r2, r3; [r1,r0] = [r1,r0]+(r2*r3)
 SMULL: signed multiply long
◼ SMULL r0, r1, r2, r3; [r1,r0] = r2*r3
 SMLAL : signed multiply accumulate long
◼ SMLAL r0, r1, r2, r3; [r1,r0] = [r1,r0]+(r2*r3)
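The difference between the unsigned and signed long multiplies shows up only in the high word. The following Python sketch models UMULL and SMULL by returning the (RdLo, RdHi) pair; the helper names are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF  # 32-bit register width


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def umull(rm, rs):
    """UMULL: unsigned 32 x 32 -> 64-bit product, returned as (RdLo, RdHi)."""
    product = (rm & MASK) * (rs & MASK)
    return product & MASK, (product >> 32) & MASK


def smull(rm, rs):
    """SMULL: signed 32 x 32 -> 64-bit product, returned as (RdLo, RdHi)."""
    product = (to_signed(rm) * to_signed(rs)) & 0xFFFFFFFFFFFFFFFF
    return product & MASK, product >> 32
```

Multiplying 0xFFFFFFFF by 2 gives RdHi = 1 with `umull` (the operand is 4294967295) but RdHi = 0xFFFFFFFF with `smull` (the operand is -1).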
Branch Instructions
 Branch instruction
◼ Change the flow of execution
◼ Used to call a routine
 Allow applications to
◼ Have subroutines
◼ Implement if-then-else structure
◼ Implement loop structure
Branch Instructions (Cont.)
 Syntax
◼ B{<cond>} label
◼ BL{<cond>} label
 B : branch
◼ B label; pc (program counter) = label
◼ Used to change execution flow
 BL : branch and link
◼ BL label; pc = label, lr = address of the next
instruction after the BL
◼ Similar to the B instruction but can be used for subroutine
call
 Overwrite the link register (lr) with a return address
Branch Instructions (Cont.)
 Example 5
        B forward
        ADD r1, r2, #4
        ADD r0, r6, #2
        ADD r3, r7, #4
forward
        SUB r1, r2, #4
backward
        SUB r1, r2, #4
        B backward
Branch Instructions (Cont.)
 Example 6:
BL subroutine
CMP r1, #5
MOVEQ r1, #0
…
subroutine
<subroutine code>
MOV pc, lr ; return by moving pc = lr
Load-Store Instructions
 Transfer data between memory and processor
registers

 Three types
◼ Single-register transfer
◼ Multiple-register transfer
◼ Swap
Single-Register Transfer
 Moving a single data item in and out of
register

 Data item can be

◼ A word (32-bits)
◼ Halfword (16-bits)
◼ Bytes (8-bits)
Single-Register Transfer (Cont.)
 Syntax
◼ <LDR|STR>{<cond>}{B} Rd, addressing1
◼ LDR{<cond>}SB|H|SH Rd, addressing2
◼ STR{<cond>}H Rd, addressing2
 LDR : load word into a register from memory
 LDRB : load byte
 LDRSB : load signed byte
 LDRH : load half-word
 LDRSH : load signed halfword
 STR: store word from a register to memory
 STRB : store byte
 STRH : store half-word
Single-Register Transfer (Cont.)
 Example 7
LDR r0, [r1] ;= LDR r0, [r1, #0]
;r0 = mem32[r1]
STR r0, [r1] ;= STR r0, [r1, #0]
;mem32[r1]= r0

◼ Register r1 is called the base address register

Single-Register Load-Store Addressing Mode

 Index method, also called Base-Plus-Offset

Addressing
◼ Base register
 r0 – r15
◼ Offset, add or subtract an unsigned number
 Immediate
 Register (not PC)
 Scaled register
Single-Register Load-Store Addressing
Mode (Cont.)
 Preindex:
◼ data: mem[base+offset]
◼ Base address register: not updated
◼ Ex: LDR r0,[r1,#4] ; r0:=mem32[r1+4]
 Postindex:
◼ data: mem[base]
◼ Base address register: base + offset
◼ Ex: LDR r0,[r1],#4 ; r0:=mem32[r1], then r1:=r1+4
 Preindex with writeback (also called auto-indexing)
◼ Data: mem[base+offset]
◼ Base address register: base + offset
◼ Ex: LDR r0, [r1,#4]! ; r0:=mem32[r1+4], then r1:=r1+4
Single-Register Load-Store Addressing
Mode (Cont.)

 Example 8
◼ r0 = 0x00000000, r1 = 0x00009000,
mem32[0x00009000] = 0x01010101,
mem32[0x00009004] = 0x02020202
◼ Preindexing: LDR r0, [r1, #4]
 r0 = 0x02020202, r1=0x00009000
◼ Postindexing: LDR r0, [r1], #4
 r0 = 0x01010101, r1=0x00009004
◼ Preindexing with writeback: LDR r0, [r1, #4]!
 r0 = 0x02020202, r1=0x00009004
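The three index methods differ only in which address is accessed and whether the base register is updated. The Python sketch below models a word load with memory as a dictionary keyed by address; the function name `ldr` and the mode strings are assumptions made for illustration.

```python
def ldr(mem, regs, rd, rn, offset, mode):
    """Model LDR rd with base register rn under one of three index methods."""
    base = regs[rn]
    if mode == "pre":            # LDR rd, [rn, #offset]
        address, new_base = base + offset, base
    elif mode == "pre_wb":       # LDR rd, [rn, #offset]!
        address, new_base = base + offset, base + offset
    elif mode == "post":         # LDR rd, [rn], #offset
        address, new_base = base, base + offset
    else:
        raise ValueError("unknown index method: " + mode)
    regs[rn] = new_base          # base update (if any)
    regs[rd] = mem[address]      # loaded value


def example8(mode):
    """Run Example 8 from its initial state with the given index method."""
    mem = {0x00009000: 0x01010101, 0x00009004: 0x02020202}
    regs = {0: 0x00000000, 1: 0x00009000}
    ldr(mem, regs, rd=0, rn=1, offset=4, mode=mode)
    return regs[0], regs[1]
```

Calling `example8` with each mode in turn reproduces the three results listed in Example 8.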
Single-Register Load-Store Addressing
Mode (Cont.)
Addressing mode and index method                Addressing syntax
Preindex with immediate offset                  [Rn, #+/-offset_12]
Preindex with register offset                   [Rn, +/-Rm]
Preindex with scaled register offset            [Rn, +/-Rm, shift #shift_imm]
Preindex writeback with immediate offset        [Rn, #+/-offset_12]!
Preindex writeback with register offset         [Rn, +/-Rm]!
Preindex writeback with scaled register offset  [Rn, +/-Rm, shift #shift_imm]!
Immediate postindexed                           [Rn], #+/-offset_12
Register postindexed                            [Rn], +/-Rm
Scaled register postindexed                     [Rn], +/-Rm, shift #shift_imm
Examples of LDR Using Different
Addressing Modes
Index method             Instruction                     r0 =                      r1 +=
Preindex with writeback  LDR r0, [r1, #0x4]!             mem32[r1+0x4]             0x4
                         LDR r0, [r1, r2]!               mem32[r1+r2]              r2
                         LDR r0, [r1, r2, LSR #0x4]!     mem32[r1+(r2 LSR 0x4)]    (r2 LSR 0x4)
Preindex                 LDR r0, [r1, #0x4]              mem32[r1+0x4]             not updated
                         LDR r0, [r1, r2]                mem32[r1+r2]              not updated
                         LDR r0, [r1, -r2, LSR #0x4]     mem32[r1-(r2 LSR 0x4)]    not updated
Postindex                LDR r0, [r1], #0x4              mem32[r1]                 0x4
                         LDR r0, [r1], r2                mem32[r1]                 r2
                         LDR r0, [r1], r2, LSR #0x4      mem32[r1]                 (r2 LSR 0x4)
Multiple-Register Transfer
 Transfer multiple registers between memory
and the processor in a single instruction

 More efficient than single-register transfer

◼ Moving blocks of data around memory
◼ Saving and restoring context and stack
Multiple-Register Transfer (Cont.)
 Load-store multiple instructions can increase interrupt
latency
◼ An interrupt can be taken only after the current instruction has
completed
◼ Each load multiple instruction takes 2 + N*t cycles
 N: the number of registers to load
 t: the number of cycles required for sequential access to memory
◼ Compilers provide a switch to control the maximum
number of registers transferred
 Limit the maximum interrupt latency
Multiple-Register Transfer (Cont.)
 Syntax:
◼ <LDM|STM>{<cond>}<mode> Rn{!}, <registers>{^}
◼ Address mode: See the next page
◼ ^: optional
 Cannot be used in User mode or System mode
 If the instruction is LDM and the register list contains the pc (r15),
the SPSR is also copied into the CPSR.
 Otherwise, data is transferred into or out of the User mode
registers instead of the current mode registers.
Multiple-Register Transfer (Cont.)
 Example 9
◼ PRE:
mem32[0x80018] = 0x03,
mem32[0x80014] = 0x02,
mem32[0x80010] = 0x01,
r0 = 0x00080010,
r1 = r2 = r3= 0x00000000

◼ LDMIA r0!, {r1-r3}, or LDMIA r0!, {r1, r2, r3}

 Register can be explicitly listed or use the “-” character
Pre-Condition for LDMIA Instruction
Base pointer    Memory Address   Data         Register
                0x80020          0x00000005
                0x8001c          0x00000004
                0x80018          0x00000003   R3=0x00000000
                0x80014          0x00000002   R2=0x00000000
R0 = 0x80010    0x80010          0x00000001   R1=0x00000000
                0x8000c          0x00000000
Figure 1
Post-Condition for LDMIA Instruction
Base pointer    Memory Address   Data         Register
                0x80020          0x00000005
R0 = 0x8001c    0x8001c          0x00000004
                0x80018          0x00000003   R3=0x00000003
                0x80014          0x00000002   R2=0x00000002
                0x80010          0x00000001   R1=0x00000001
                0x8000c          0x00000000
Figure 2
Multiple-Register Transfer (Cont.)
 Example 9 (Cont.)
◼ POST:
r0 = 0x0008001c,
r1 = 0x00000001,
r2 = 0x00000002,
r3 = 0x00000003
Multiple-Register Transfer (Cont.)
 Example 10
◼ PRE: as shown in Fig. 1
◼ LDMIB r0!, {r1-r3}
◼ POST:
r0 = 0x0008001c
r1 = 0x00000002
r2 = 0x00000003
r3 = 0x00000004
Post-Condition for LDMIB Instruction
Base pointer    Memory Address   Data         Register
                0x80020          0x00000005
R0 = 0x8001c    0x8001c          0x00000004   R3=0x00000004
                0x80018          0x00000003   R2=0x00000003
                0x80014          0x00000002   R1=0x00000002
                0x80010          0x00000001
                0x8000c          0x00000000
Figure 3
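The four addressing modes of the load multiple instruction can be summarised by the range of addresses each one touches. The Python sketch below assumes the standard ARM rule that the lowest-numbered register always pairs with the lowest address; `ldm` and `figure1_state` are hypothetical helper names.

```python
def ldm(mem, regs, mode, rn, reglist, writeback=True):
    """Model LDM<mode> rn{!}, {reglist} for mode in IA, IB, DA, DB."""
    n = len(reglist)
    base = regs[rn]
    if mode == "IA":      # increment after
        start, final = base, base + 4 * n
    elif mode == "IB":    # increment before
        start, final = base + 4, base + 4 * n
    elif mode == "DA":    # decrement after
        start, final = base - 4 * n + 4, base - 4 * n
    elif mode == "DB":    # decrement before
        start, final = base - 4 * n, base - 4 * n
    else:
        raise ValueError("unknown mode: " + mode)
    # Lowest-numbered register is loaded from the lowest address.
    for i, reg in enumerate(sorted(reglist)):
        regs[reg] = mem[start + 4 * i]
    if writeback:
        regs[rn] = final


def figure1_state():
    """Memory and registers as drawn in Figure 1."""
    mem = {0x8000C: 0, 0x80010: 1, 0x80014: 2, 0x80018: 3, 0x8001C: 4, 0x80020: 5}
    regs = {0: 0x80010, 1: 0, 2: 0, 3: 0}
    return mem, regs
```

Running `ldm(mem, regs, "IA", 0, [1, 2, 3])` on the Figure 1 state loads 1, 2, 3; the `"IB"` mode on the same state loads 2, 3, 4. Both leave r0 = 0x8001c.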
Multiple-Register Transfer (Cont.)
 Load-store multiple pairs when base update used (!)
◼ Useful for saving a group of registers and restoring them later

Store multiple   Load multiple
STMIA            LDMDB
STMIB            LDMDA
STMDA            LDMIB
STMDB            LDMIA
Multiple-Register Transfer (Cont.)
 Example 11
◼ PRE:
r0 = 0x00009000
r1 = 0x00000009,
r2 = 0x00000008
r3 = 0x00000007
◼ STMIB r0!, {r1-r3}
MOV r1, #1
MOV r2, #2
MOV r3, #3
Multiple-Register Transfer (Cont.)
 Example 11 (Cont.)
◼ PRE (2):
r0 = 0x0000900c
r1 = 0x00000001,
r2 = 0x00000002
r3 = 0x00000003
◼ LDMDA r0!, {r1-r3}
◼ POST:
r0 = 0x00009000
r1 = 0x00000009,
r2 = 0x00000008
r3 = 0x00000007
Multiple-Register Transfer (Cont.)
 Example 11 (Cont.)
◼ The STMIB stores the values 7, 8, 9 to memory

◼ The MOV instructions then corrupt registers r1 to r3

◼ Finally, the LDMDA

 Reloads the original values, and
 Restores the base pointer r0
Multiple-Register Transfer (Cont.)
 Example 12: the use of the load-store multiple
instructions with a block memory copy
;r9 points to start of source data
;r10 points to start of destination data
;r11 points to end of the source
loop
LDMIA r9!, {r0-r7} ;load 32 bytes from source and update r9
STMIA r10!, {r0-r7} ;store 32 bytes to desti. and update r10
CMP r9, r11 ;have we reached the end
BNE loop
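Multiple-Register Transfer (Cont.)
[Figure: block memory copy. r9 points to the start and r11 to the end of the source block (towards high memory); r10 points to the destination block (towards low memory); each loop iteration transfers 32 bytes in two instructions]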
Stack Operations
 ARM architecture uses the load-store multiple
instruction to carry out stack operations
◼ PUSH: use a store multiple instruction
◼ POP: use a load multiple instruction
 Stack
◼ Ascending (A): stack grows towards higher
memory addresses
◼ Descending (D): stack grows towards lower
memory addresses
Stack Operations (Cont.)
 Stack
◼ Full stack (F): stack pointer sp points to the last
valid item pushed onto the stack
◼ Empty stack (E): sp points after the last item on
the stack
 The free slot where the next data item will be placed
 There are a number of aliases available to
support stack operations
◼ See next page
Stack Operations (Cont.)
 ARM support all four forms of stacks
◼ Full ascending (FA): grows up; base register points to
the highest address containing a valid item
◼ Empty ascending (EA): grows up; base register points to
the first empty location
◼ Full descending (FD): grows down; base register points
to the lowest address containing a valid data
◼ Empty descending (ED): grows down; base register
points to the first empty location below the stack
Addressing Methods for Stack Operations
Addressing mode   Description        Pop     = LDM    Push    = STM
FA                Full ascending     LDMFA   LDMDA    STMFA   STMIB
FD                Full descending    LDMFD   LDMIA    STMFD   STMDB
EA                Empty ascending    LDMEA   LDMDB    STMEA   STMIA
ED                Empty descending   LDMED   LDMIB    STMED   STMDA
Stack Operations (Cont.)
 Example 13
◼ PRE:
 r1 = 0x00000002
 r4 = 0x00000003
 sp = 0x00080014
◼ STMFD sp!, {r1, r4}
◼ POST:
 r1 = 0x00000002
 r4 = 0x00000003
 sp = 0x0008000c
Stack Operations (Cont.)
 Example 13 (Cont.)
◼ STMFD – full stack push operation

        PRE                            POST
        Address   Data                 Address   Data
        0x80018   0x00000001           0x80018   0x00000001
sp ->   0x80014   0x00000002           0x80014   0x00000002
        0x80010   Empty                0x80010   0x00000003
        0x8000c   Empty        sp ->   0x8000c   0x00000002
Stack Operations (Cont.)
 Example 14
◼ PRE:
 r1 = 0x00000002
 r4 = 0x00000003
 sp = 0x00080010
◼ STMED sp!, {r1, r4}
◼ POST:
 r1 = 0x00000002
 r4 = 0x00000003
 sp = 0x00080008
Stack Operations (Cont.)
 Example 14 (Cont.)
◼ STMED – empty stack push operation
        PRE                            POST
        Address   Data                 Address   Data
        0x80018   0x00000001           0x80018   0x00000001
        0x80014   0x00000002           0x80014   0x00000002
sp ->   0x80010   Empty                0x80010   0x00000003
        0x8000c   Empty                0x8000c   0x00000002
        0x80008   Empty        sp ->   0x80008   Empty
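A push and its matching pop on a full descending stack can be modelled in a few lines. The Python sketch below mirrors STMFD sp!, {...} and LDMFD sp!, {...}, with memory as a dictionary and register 13 as sp; the helper names `push_fd` and `pop_fd` are assumptions made for illustration.

```python
SP = 13  # r13 is used as the stack pointer


def push_fd(mem, regs, reglist):
    """STMFD sp!, {reglist}: decrement sp first, then store upwards from it."""
    regs[SP] -= 4 * len(reglist)
    for i, reg in enumerate(sorted(reglist)):
        mem[regs[SP] + 4 * i] = regs[reg]   # lowest register at lowest address


def pop_fd(mem, regs, reglist):
    """LDMFD sp!, {reglist}: load upwards from sp, then increment sp."""
    for i, reg in enumerate(sorted(reglist)):
        regs[reg] = mem[regs[SP] + 4 * i]
    regs[SP] += 4 * len(reglist)
```

Starting from the PRE state of Example 13 (r1 = 2, r4 = 3, sp = 0x80014), `push_fd(mem, regs, [1, 4])` leaves sp = 0x8000c with 2 stored at 0x8000c and 3 at 0x80010; a later `pop_fd` with the same list restores both registers and sp.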
SWAP Instruction
 A special case of a load-store instruction
◼ Swap the contents of memory with the contents
of a register
◼ An atomic operation
 Cannot be interrupted by any other instruction or
any other bus access
 The system “holds the bus” until the transaction is
complete
 Useful when implementing semaphores and mutual
exclusion in an operating system
SWAP Instruction (Cont.)
 Syntax: SWP{B}{<cond>} Rd, Rm, [Rn]
◼ tmp = mem32[Rn]
◼ Mem32[Rn] = Rm
◼ Rd = tmp
 SWP: swap a word between memory and a
register
 SWPB: swap a byte between memory and a
register
SWAP Instruction (Cont.)
 Example 15
◼ PRE:
 Mem32[0x9000] = 0x12345678
 r0 = 0x00000000
 r1 = 0x11112222
 r2 = 0x00009000
◼ SWP r0, r1, [r2]
◼ POST:
 mem32[0x9000] = 0x11112222
 r0 = 0x12345678
 r1 = 0x11112222
 r2 = 0x00009000
SWAP Instruction (Cont.)
 Example 15 (Cont.)
spin
        LDR r1, =semaphore
        MOV r2, #1
        SWP r3, r2, [r1] ;hold the bus until complete
        CMP r3, #1
        BEQ spin
 The address pointed by the semaphore either contains the
value of 1 or 0
 When semaphore value == 1 , loop until semaphore becomes
0 (updated by the holding process)
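The logic of the spin loop can be traced with a small model. In the Python sketch below, `swp` performs the read-then-write that the hardware carries out as one atomic bus transaction; the Python function itself is not atomic and only illustrates the data flow. `swp` and `try_acquire` are hypothetical helper names.

```python
def swp(mem, address, rm):
    """Model SWP rd, rm, [rn]: return the old memory word and store rm in its place."""
    old = mem[address]
    mem[address] = rm
    return old


def try_acquire(mem, semaphore_address):
    """One pass of the spin loop: write 1 and report whether the old value was 0."""
    previous = swp(mem, semaphore_address, 1)
    return previous == 0   # 0 means the semaphore was free and is now ours
```

A caller that gets `False` would loop and try again, which corresponds to the `BEQ spin` branch.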
Software Interrupt Instruction
 SWI: software interrupt instruction
◼ Cause a software interrupt exception
◼ Provide a mechanism for applications to call
operating system routines
◼ Each SWI instruction has an associated SWI
number
 Used to represent a particular function call or routines
Software Interrupt Instruction (Cont.)

 Syntax: SWI{<cond>} SWI_number

◼ lr_svc = address of instruction following the SWI
◼ spsr_svc = cpsr
◼ pc = vector table + 0x8 ; jump to the swi
handling
◼ cpsr mode = SVC
◼ cpsr I = 1 (mask IRQ interrupt)
Software Interrupt Instruction (Cont.)
 Example 16
◼ PRE:
 cpsr = nzcVqift_USER
 pc = 0x00008000
 lr = r14 = 0x003fffff
◼ 0x00008000 SWI 0x123456
◼ POST:
 cpsr = nzcVqIft_SVC
 spsr = nzcVqift_USER
 pc = 0x00000008
 lr = 0x00008004
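The state changes on SWI entry can be written out step by step. The Python sketch below represents the cpsr as a dictionary of fields, which is an assumption made for readability; `swi_entry` is a hypothetical name and the vector table is assumed to start at address 0.

```python
def swi_entry(cpsr, swi_address, vector_base=0x00000000):
    """Model exception entry for an ARM-state SWI located at swi_address."""
    lr_svc = swi_address + 4                    # address of the instruction after the SWI
    spsr_svc = dict(cpsr)                       # save the old cpsr
    new_cpsr = dict(cpsr, mode="SVC", I=1)      # enter SVC mode, mask IRQ
    pc = vector_base + 0x8                      # SWI vector
    return new_cpsr, spsr_svc, lr_svc, pc
```

Applied to the PRE state of Example 16 (SWI at 0x00008000 in USER mode), this yields pc = 0x00000008 and lr = 0x00008004, with the condition flags unchanged.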
Program Status Register Instructions

 MRS
◼ Transfer the contents of either the cpsr or spsr
into a register

 MSR
◼ Transfer the contents of a register into the cpsr or
spsr
Program Status Register Instructions
(Cont.)
 Syntax
◼ MRS{<cond>} Rd, <cpsr|spsr>
◼ MSR{<cond>} <cpsr|spsr>_<fields>, Rm
◼ MSR{<cond>} <cpsr|spsr>_<fields>, #immediate
 Field: any combination of
◼ Flags: [24:31]
◼ Status: [16:23]
◼ eXtension[8:15]
◼ Control[0:7]
PSR Registers
Program Status Register Instructions
(Cont.)
 Note: You cannot access the SPSR in User or
System Mode
◼ Assembler cannot warn you because it does not
know which mode the code will execute in
Program Status Register Instructions
(Cont.)
 Example 17
◼ PRE:
 cpsr = nzcvqIFt_SVC
◼ MRS r1, cpsr
◼ BIC r1, r1, #0x80 ;0b10000000, clear bit 7
◼ MSR cpsr_c, r1 ;enable IRQ interrupts
◼ POST:
 cpsr = nzcvqiFt_SVC
◼ Note that this example must run in SVC mode
 In user mode, you can read all cpsr bits but can only update
the condition flag field f, i.e., cpsr[24:31]
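The MRS / BIC / MSR sequence is a read-modify-write on the control field. The Python sketch below treats the cpsr as a 32-bit integer, with I at bit 7, F at bit 6, T at bit 5 and the mode in bits [4:0] (SVC = 0b10011); `enable_irq` is a hypothetical helper name.

```python
MASK = 0xFFFFFFFF
I_BIT = 1 << 7          # IRQ mask bit
CONTROL_FIELD = 0xFF    # cpsr_c covers bits [7:0]


def enable_irq(cpsr):
    """Model MRS r1, cpsr / BIC r1, r1, #0x80 / MSR cpsr_c, r1."""
    r1 = cpsr                                   # MRS r1, cpsr
    r1 &= ~I_BIT & MASK                         # BIC r1, r1, #0x80
    # MSR cpsr_c, r1 writes only the control field back.
    return (cpsr & ~CONTROL_FIELD & MASK) | (r1 & CONTROL_FIELD)
```

With the PRE state nzcvqIFt_SVC encoded as 0xD3, `enable_irq(0xD3)` returns 0x53, i.e. nzcvqiFt_SVC.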
Conditional Execution
 Almost all ARM instruction can include an
optional condition code
◼ Instruction is only executed if the condition code
flags in the CPSR meet the specified condition
◼ The default is AL, or always execute
 Conditional executions depends on two
components
◼ The condition field: located in the instruction
◼ The condition flags: located in the cpsr
Conditional Execution (Cont.)
 Example 18

ADDEQ r0, r1, r2

; r0 = r1 + r2 if zero flag is set
Condition Codes
Conditional Execution (Cont.)
 Thus, before using conditional execution
◼ There must be an instruction that updates the
condition code flags according to the result
◼ If not specified, instructions will not update the
flags
 To make an instruction update the flags
◼ Include the S suffix
◼ Example: ADDS r0, r1,r2
Conditional Execution (Cont.)
 However, some instructions always update the flags
◼ Do not require the S suffix
◼ CMP, CMN, TST, TEQ
 Flags are preserved until updated
 Thus, you can execute an instruction conditionally,
based upon the flags set in another instruction, either:
◼ Immediately after the instruction which updated the flags
◼ After any number of intervening instructions that have not
updated the flags.
Conditional Execution (Cont.)
 Example 18
◼ Translate the following code into assembly
language
◼ Assume r1 = a, r2 = b
while ( a!= b )
{
if (a > b) a -= b; else b -= a;
}
Conditional Execution (Cont.)
 Example 18: Solution 1

gcd
CMP r1, r2
BEQ complete
BLT lessthan
SUB r1, r1, r2
B gcd
lessthan
SUB r2, r2, r1
B gcd
complete
Conditional Execution (Cont.)
 Example 18: Solution 2

gcd
CMP r1, r2
SUBGT r1, r1, r2
SUBLT r2, r2, r1
BNE gcd

 Solution 2 dramatically reduces the number of
instructions!
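The behaviour of the conditionally executed version can be checked with a direct model. In the Python sketch below each loop pass corresponds to the four instructions CMP, SUBGT, SUBLT, BNE; the flags are captured once per pass, as CMP would set them. Inputs are assumed to be positive integers, and `gcd_conditional` is a hypothetical name.

```python
def gcd_conditional(a, b):
    """Model the four-instruction loop: CMP / SUBGT / SUBLT / BNE."""
    while True:
        # CMP r1, r2: both conditions come from this single comparison.
        greater, less = a > b, a < b
        if greater:          # SUBGT r1, r1, r2
            a -= b
        if less:             # SUBLT r2, r2, r1
            b -= a
        if not (greater or less):
            return a         # Z set: BNE falls through
```

Because at most one of the two subtractions executes per pass, the result matches the branching version while the loop body stays at four instructions.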
References
 Andrew N. Sloss, “ARM System Developer’s
Guide: Designing and Optimizing System
Software,” Morgan Kaufmann Publishers,
2004
◼ Chapter 3: Introduction to the ARM Instruction
Set
ARM7TDMI Microprocessor

Thumb Instruction Set

Processor Operating States

ARM state
which executes 32-bit, word-aligned ARM
instructions.

THUMB state
which operates with 16-bit, halfword-aligned
THUMB instructions.
Thumb Instruction Set

•ARM architecture versions v4T and above define a 16-bit
instruction set called the Thumb instruction set. The
functionality of the Thumb instruction set is a subset of the
functionality of the 32-bit ARM instruction set.

•A processor that is executing Thumb instructions is
operating in Thumb state. A processor that is executing ARM
instructions is operating in ARM state.
Thumb Instruction Set

•A processor in ARM state cannot execute Thumb
instructions, and a processor in Thumb state cannot
execute ARM instructions.

•Each instruction set includes instructions to change
processor state.

Note: ARM processors always start executing code in
ARM state.
Thumb Instruction Set

•Thumb does not provide direct access to the CPSR or any
SPSR.

•Thumb execution is flagged by the T bit (bit[5]) in the CPSR.

T==0 32-bit instructions are fetched (ARM instructions)

T==1 16-bit instructions are fetched (Thumb instructions)
Thumb applications

In a typical embedded system:

use ARM code in 32-bit on-chip memory for small speed- critical routines
use Thumb code in 16-bit off-chip memory for large non-critical control routines

Note:
Switching between the ARM and Thumb states of execution is done using the BX
instruction.
Thumb applications

For most instructions generated by the compiler:

Conditional execution is not used.
Source and destination registers are identical.
Only low registers are used.
Constants are of limited size.
The inline barrel shifter is not used.
DATA TYPES

Byte (8-bit):
placed on any byte boundary.
Half-word (16-bit):
aligned to two-byte boundaries.
Word (32-bit):
aligned to four- byte boundaries.

Features
•Not a complete architecture
•Dynamically decompressed to ARM Instruction
•Fully supported by ARM development tools
•Both entry and exit are done using corresponding BX
Instruction
•Increases the maximum clock rate to 40 MHz
•Expanded Cache to 8 kB
•Thumb combines a new instruction set with a 16-bit
instruction format and a hardware logic unit.
•Thumb instructions are translated to regular ARM instructions
•Thumb improves ARM instruction density by about 25% to 35%
•16 bit wide memory
Thumb State Philosophy
The Thumb instruction set (16-bit) addresses the issue of code density.
It may be viewed as a compressed form of a subset of the ARM instruction set.

Thumb instructions map onto ARM instructions.

The Thumb programmer’s model maps onto the ARM programmer’s model.

Implementations of Thumb use dynamic decompression in an ARM instruction
pipeline, and the instructions then execute as standard ARM instructions within the
processor.
Thumb is not a complete architecture; it is not anticipated that a processor would
execute Thumb instructions without supporting the ARM instruction set.
Therefore the Thumb instruction set need only support common application functions.

Exceptions will not be handled in THUMB state.

Thumb-ARM Decompression

•Translation from a 16-bit Thumb instruction to a 32-bit
ARM instruction
•Condition bits changed to ‘always’
•Lookup to translate major and minor opcodes
•Zero extending 3-bit register specifiers to give 4-bit
specifiers
•Zero extending immediate values
•Implicit ‘S’ (affecting condition codes) should be
explicitly specified.
•Thumb 2-address format must be mapped to ARM
3-address format
THUMB-ARM Instruction Mapping

❖ So where performance is all important, a system should use 32 bit memory and run
ARM code
❖ Where Power consumption and cost are more important , a 16 bit memory system
and THUMB code may be a better choice
Mode Switching

•Default entry to exception mode is always ARM

•Explicit entry to Thumb is done using ARM mode BX
Instruction
•Explicit entry back to ARM mode is done using Thumb
mode BX Instruction

Thumb Programmers Model

•Registers r0 to r7 are accessible (Lo)

•A few instructions allow r8 to r15 (Hi) to be specified
•r13 is used as the stack pointer
•r14 is used as the link register
•r15 is used as the program counter
THUMB Programmer’s Model

THUMB Register Organisation
Thumb General registers and Program Counter

User / System   FIQ        Supervisor   Abort      IRQ        Undefined
r0              r0         r0           r0         r0         r0
r1              r1         r1           r1         r1         r1
r2              r2         r2           r2         r2         r2
r3              r3         r3           r3         r3         r3
r4              r4         r4           r4         r4         r4
r5              r5         r5           r5         r5         r5
r6              r6         r6           r6         r6         r6
r7              r7         r7           r7         r7         r7
SP              SP_FIQ     SP_SVC       SP_ABT     SP_IRQ     SP_UND
LR              LR_FIQ     LR_SVC       LR_ABT     LR_IRQ     LR_UND
PC              PC         PC           PC         PC         PC

Thumb Program Status Registers

CPSR            CPSR       CPSR         CPSR       CPSR       CPSR
                SPSR_FIQ   SPSR_SVC     SPSR_ABT   SPSR_IRQ   SPSR_UND
ARM-Thumb Similarities
•Load-store architecture
•Support 8-bit byte, 16-bit half-word and 32 bit word
data types with aligned boundaries
•32 bit unsegmented memory.
•However , in order to achieve a 16 bit instruction
length a number of characteristic features of the ARM
instruction set have not been supported in Thumb state

ARM-Thumb differences
•Unconditional execution of instructions, except branch instructions,
whereas all ARM instructions are executed conditionally

•2-address format for data processing

ARM data processing instructions use 3-address format
(except 64-bit multiply instructions)
•Thumb instruction formats are less regular than those of ARM, as
a result of the dense encoding
•There are NO status register access instructions (MRS/MSR) in
Thumb state
•Many addressing modes of ARM are not supported in Thumb state
•No banked registers and privileged modes in Thumb state
ARM-Thumb differences

The biggest register difference involves the SP register

The Thumb state has unique mnemonics (PUSH, POP) that
don’t exist in ARM state
These instructions assume the existence of a stack pointer,
for which R13 is used
They translate into load and store instructions internally
No SWP instructions in Thumb state
No support for coprocessor instructions in Thumb state
Barrel shifter operations are separate instructions

Thumb exception

•On an exception the processor returns to ARM state.

•On return, the previous state is restored as the SPSR is
transferred to the CPSR

•Use of the Thumb instruction set can improve code
density and power efficiency, save cost and enhance
performance all at once
Thumb Branching

•Short conditional branches

•Medium range unconditional branches

•Long range Subroutine calls

•Branch to change to ARM Mode

Branch Instruction Formats

B<cond> <label>
bits:  15-12   11-8        7-0
       1101    Condition   8-bit offset

B <label>
bits:  15-11   10-0
       11100   11-bit offset

BL <label>
bits:  15-13   12-11   10-0
       111     H       11-bit offset

BX Rm
bits:  15-7        6   5-3   2-0
       010001110   H   Rm    000
THUMB Branch Instructions

Features
•Different format for each case
•Offset is reduced to 11bit and 8 bit
•Offset is shifted left by 1-bit (to give half-word
alignment) and sign-extended to 32 bits.
•BL is more subtle to give 22-bit offset using link register
for temporary storage
•No direct mapping to ARM instructions as Thumb
require half-word aligned offsets.

BL Instruction

To allow for a reasonably large offset to the target
subroutine, the BL instruction is
automatically translated by the assembler into a
sequence of two 16-bit Thumb instructions
1. H = 10
LR := PC + (sign-extended offset shifted left 12 places);

2. H = 11
PC := LR + (offset shifted left 1 place);
LR := address of next instruction
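The way the two halves of a Thumb BL combine into one branch offset can be checked numerically. The Python sketch below splits a signed byte offset into the two 11-bit fields and recombines them as the two steps describe; `split_bl_offset` and `bl_target` are hypothetical names, and `pc` stands for the PC value seen by the first half.

```python
MASK = 0xFFFFFFFF  # 32-bit address space


def split_bl_offset(offset):
    """Split an even byte offset into (high, low) 11-bit instruction fields."""
    assert offset % 2 == 0 and -(1 << 22) <= offset < (1 << 22)
    high = (offset >> 12) & 0x7FF   # upper bits, placed in the H = 10 half
    low = (offset >> 1) & 0x7FF     # lower bits, placed in the H = 11 half
    return high, low


def bl_target(pc, high, low):
    """Recombine the two fields the way the two BL halves do."""
    signed_high = high - 0x800 if high & 0x400 else high   # sign-extend 11 bits
    lr = pc + (signed_high << 12)        # step 1 (H = 10)
    return (lr + (low << 1)) & MASK      # step 2 (H = 11)
```

Eleven bits in each half, with the low half shifted left by one place, give a signed byte offset range of roughly plus or minus 4 MB.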

Software Interrupt Instruction

1 1 0 1 1 1 1 1 8 – Bit Immediate

•Address of next instruction is saved in r14_svc

•CPSR is saved in spsr_svc
•Disables IRQ, Clears T bit, Enters Supervisor mode
•PC is forced to 0x08
•8 bit immediate is zero extended to fill the 24-bit
field in the ARM instruction.
•Limits SWIs to first 256 of 16 million ARM SWIs.
Data Processing

•Fairly complex instruction formats

•No conditional execution
•Separate shift operations provided, no shifting of
second operand
•All data processing instruction set condition code bits
(no need of ‘S’)

THUMB data processing instructions

Instruction formats

•<op> Rd, Rn, Rm

•<op> Rd, Rn, # <imm3>
•<op> Rd|Rn, Rm|Rs
•<op> Rd, Rn, #<sh 5>
•<op> Rd, #<imm 8>

Instructions

•MOV Rd, #<imm8>

•MVN Rd, Rm
•CMP Rn, #<imm8>
•CMP Rn, Rm
•CMN Rn, Rm
•TST Rn, Rm

Instruction

•ADD Rd, Rn, #<imm3>

•ADD Rd, #< imm8>
•ADD Rd, Rn, Rm
•ADC Rd, Rm
•SUB Rd, Rn, #<imm3>
•SUB Rd, #< imm8>
•SUB Rd, Rn, Rm
•SBC Rd, Rm
•NEG Rd, Rn
Instruction

•LSL Rd, Rm, #<#sh>

•LSL Rd, Rs
•LSR Rd, Rm, #<#sh>
•LSR Rd, Rs
•ASR Rd, Rm, #<#sh>
•ASR Rd, Rs
•ROR Rd, Rs

Instruction

•AND Rd, Rm
•EOR Rd, Rm
•ORR Rd, Rm
•BIC Rd, Rm
•MUL Rd, Rm

Instruction (using Hi registers)

•ADD Rd, Rm (1 or 2 Hi registers)

•CMP Rn, Rm (1 or 2 Hi registers)
•MOV Rd, Rm (1 or 2 Hi registers)
•ADD Rd, PC, #<imm8>
•ADD Rd, SP, #<imm8>
•ADD SP, SP, #<imm7>
•SUB SP, SP, #<imm7>
Apart from CMP, these instructions do not set the condition code bits
THUMB Single register data transfer

Data Transfer Instruction

•LDR|STR Rd, [Rn, #off5]

•LDR|STR Rd, [Rn, Rm]
•LDRB|STRB Rd, [Rn, #off5]
•LDRB|STRB Rd, [Rn, Rm]
•LDRH|STRH Rd, [Rn, #off5]
•LDRH|STRH Rd, [Rn, Rm]
Signed operands:
•LDR|STR {S} {H|B} Rd, [Rn, Rm]

THUMB Multiple register data transfer

Multiple register transfers
•LDMIA|STMIA Rn!, { <reg list> }

•Rn may be any register among R0 – R7

•Register set can be any subset of R0 – R7 but not
base register ‘Rn’

•Write back to base register is always selected.
Stack Mode

•POP|PUSH { <reg list> {, R}}

•R13 (sp) is used as base register

•Uses Full Descending Stack

•In addition to any subset of R0-R7, LR
(lr) may be included in a PUSH instruction and
PC (pc) may be included in a POP instruction
Properties

•Thumb code requires 70% of space of ARM code

•Thumb code uses 40% more instructions than the ARM
code
•With 32-bit memory ARM code is 40% faster
•With 16-bit memory Thumb code is 45% faster than ARM
code
•Thumb code uses 30% less external memory power than
ARM code.

Thumb Applications

Chandigarh University: University Institute of Engineering
No ratings yet
Chandigarh University: University Institute of Engineering
50 pages
Module 5
No ratings yet
Module 5
67 pages
Module-2: Microcontroller and Embedded Systems
No ratings yet
Module-2: Microcontroller and Embedded Systems
74 pages
ARM ISA Review
No ratings yet
ARM ISA Review
33 pages
Chapter2 Part 2 Machine Instructions and Programs
No ratings yet
Chapter2 Part 2 Machine Instructions and Programs
38 pages
Chapter2 ARMjjh
No ratings yet
Chapter2 ARMjjh
60 pages
DLX Instruction Set Guide
No ratings yet
DLX Instruction Set Guide
29 pages
ARM Processor ISA
No ratings yet
ARM Processor ISA
5 pages
Microcontroller Module-2 Notes
No ratings yet
Microcontroller Module-2 Notes
28 pages
Computer Architecture P2
No ratings yet
Computer Architecture P2
46 pages
ARM Processor Instruction Set - Lecture 6
No ratings yet
ARM Processor Instruction Set - Lecture 6
43 pages
4-Instruction Set
No ratings yet
4-Instruction Set
45 pages
ARM Assembly Guide for Educators
No ratings yet
ARM Assembly Guide for Educators
53 pages
l18 Arm
No ratings yet
l18 Arm
71 pages
ARM Architecture & Instruction Set
100% (2)
ARM Architecture & Instruction Set
71 pages
Unit 5
No ratings yet
Unit 5
62 pages
The x86 PC Assembly Language Design and Interfacing 5 TH Edition
100% (2)
The x86 PC Assembly Language Design and Interfacing 5 TH Edition
11 pages
DSP Architecture & Processors Overview
No ratings yet
DSP Architecture & Processors Overview
97 pages
CBM2093 Chipsbank
No ratings yet
CBM2093 Chipsbank
15 pages
Memory Hierarchy for CS Students
No ratings yet
Memory Hierarchy for CS Students
29 pages
Gold Cpu Chip Price Guide
100% (6)
Gold Cpu Chip Price Guide
9 pages
Solutions: CS152 Computer Architecture and Engineering
No ratings yet
Solutions: CS152 Computer Architecture and Engineering
17 pages
Computer Architecture - Notes2
No ratings yet
Computer Architecture - Notes2
101 pages
Low-Power VLSI Circuits and Systems
No ratings yet
Low-Power VLSI Circuits and Systems
33 pages
Internal - Assessment Paper of UIT
No ratings yet
Internal - Assessment Paper of UIT
2 pages
Cit314 2022 2
No ratings yet
Cit314 2022 2
3 pages
Electronics Engineering Guide
No ratings yet
Electronics Engineering Guide
4 pages
ICT Slides 3 Semester 3
No ratings yet
ICT Slides 3 Semester 3
17 pages
A Review of Architectures - Intel Single Core, Intel Dual Core and AMD Dual Core Processors and The Benefits
No ratings yet
A Review of Architectures - Intel Single Core, Intel Dual Core and AMD Dual Core Processors and The Benefits
10 pages
Lecture1 EELE 1232
No ratings yet
Lecture1 EELE 1232
20 pages
Price List 9-5-2023
No ratings yet
Price List 9-5-2023
5 pages
SiP Testing and Integration Guide
No ratings yet
SiP Testing and Integration Guide
82 pages
Microcontroller Interview Q&A
No ratings yet
Microcontroller Interview Q&A
6 pages
20A04606 Basic VLSI Design
No ratings yet
20A04606 Basic VLSI Design
1 page
DAY-15 Programmable Logic Devices PLD: Programmable Read Only Memory PROM
No ratings yet
DAY-15 Programmable Logic Devices PLD: Programmable Read Only Memory PROM
11 pages
The Intel Pentium Processor
No ratings yet
The Intel Pentium Processor
12 pages
x86 Assembly-Print Version - Wikibooks, Open Books For An Open World
No ratings yet
x86 Assembly-Print Version - Wikibooks, Open Books For An Open World
196 pages
E3 237 Integrated Circuits For Wireless Communication: Lecture 2: RF CMOS Technology
No ratings yet
E3 237 Integrated Circuits For Wireless Communication: Lecture 2: RF CMOS Technology
12 pages
VLSI Design: MOSFET Fundamentals
No ratings yet
VLSI Design: MOSFET Fundamentals
24 pages
Mic Project Done
No ratings yet
Mic Project Done
16 pages
8051 Microcontroller Pin Guide
No ratings yet
8051 Microcontroller Pin Guide
3 pages
Question Paper Code:: Reg. No.
No ratings yet
Question Paper Code:: Reg. No.
4 pages
dsPIC33F FRM Section 5. Flash Programming (DS70191B)
No ratings yet
dsPIC33F FRM Section 5. Flash Programming (DS70191B)
16 pages
Full Digital Electronics Microprocessor MCQs
No ratings yet
Full Digital Electronics Microprocessor MCQs
9 pages
List of Ip Vendors
No ratings yet
List of Ip Vendors
5 pages