ARM Instruction Set Architecture (I)
Lecture 5
Yeongpil Cho
Hanyang University
Topics
• Structure of ARM Assembly Code
• Assembler Directives
ARM Instruction Set Architecture
History
From C to Assembly
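.text
…
LDR r0, =x          @ r0 = address of x
LDR r1, [r0]        @ r1 = current value of x
ADD r1, r1, #1      @ r1 = x + 1
STR r1, [r0]        @ store the result back to x
…
.data
x: .word -2         @ x is a 32-bit word initialized to -2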
Two Pass Assembly
• Most assemblers read the source file twice.
• First Pass:
▪ Build a symbol table.
▪ Calculate and record values (to be used in the second pass) for
each symbol.
▪ Some symbols may not have a known value.
– They are unresolved.
– The linker will have to fix them.
• Second Pass:
▪ Generate the object file.
▪ Use the symbol table to provide values when needed.
▪ Add information to the object file to tell the linker about any
symbols that are unresolved.
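As a minimal sketch of why two passes are useful, consider a forward reference and an external symbol in the same file. The names start, done, and external_func are hypothetical and chosen only for illustration.

```asm
        .text
start:
        cmp     r0, #0
        beq     done            @ forward reference: 'done' has not been seen yet,
                                @ so its value comes from the symbol table
        bl      external_func   @ never defined in this file: it stays unresolved
                                @ and the object file asks the linker to fix it
done:
        bx      lr
```

Here the first pass records the values of start and done, and the second pass uses them to encode the branch. The reference to external_func is left for the linker.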
Example Code
Assembly Listing
Syntax of Assembly
• Basic syntax
▪ Label statements end after the “:”
▪ The other statements end at the first newline or “;”
▪ Directives start with a period “.”
• Comments
Syntax of Assembly
• Examples
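The syntax rules can be sketched with a short fragment. It assumes GNU-syntax ARM assembly, where @ starts a line comment and /* */ encloses a block comment. The label loop is a hypothetical name.

```asm
        .text                   @ directive: starts with a period
loop:                           @ label statement: ends after the ':'
        add     r0, r0, #1      @ instruction statement: ends at the newline
        subs    r1, r1, #1; bne loop    @ ';' separates two statements on one line
        /* a block comment
           may span several lines */
        bx      lr
```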
Assembly Expressions
• Expressions consist of one or more integer literals or
symbol references, combined using operators.
• Expressions can be used as instruction operands or
directive arguments.
• The assembler evaluates all expressions.
Assembly Expressions
• Constants
▪ Decimal integer
▪ Hexadecimal integer, prefixed with 0x
▪ Octal integer, prefixed with 0
▪ Binary integer, prefixed with 0b
▪ Negative numbers can be represented using the unary operator, -
• Symbol References
▪ Symbols do not need to be defined in the same assembly
language source file, to be referenced in expressions.
▪ The period symbol (.) is a special symbol that can be used to
reference the current location in the output file.
Assembly Expressions
• Operators
▪ Unary Operators: +, -, ~
▪ Binary Operators: +, -, *, /, %
▪ Binary Logical Operators: &&, ||
▪ Binary Bitwise Operators: &, |, ^, >>, <<
▪ Binary Comparison Operators: ==, !=, <, >, <=, >=
Assembly Expressions
• Examples
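A small sketch of constants, symbol references, and operators used together. All symbol names (BUF_SIZE, MASK, table, table_end, TABLE_BYTES) are hypothetical.

```asm
        .equ    BUF_SIZE, 4 * 16            @ arithmetic on constants: 64
        .equ    MASK, (1 << 5) | 0x3        @ bitwise operators: 0x23
        .data
table:
        .word   0x10, 0b1010, 017, -2       @ hex, binary, octal, negative decimal
table_end:
        .equ    TABLE_BYTES, table_end - table  @ difference of two symbols: 16
        .word   . - table                   @ '.' is the current location: 16
        .text
        mov     r0, #BUF_SIZE               @ expression as an instruction operand
        mov     r1, #(MASK & 0xf0)          @ evaluates to 0x20 at assembly time
```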
Assembly Directives
• String definition
• Data definition
• Alignment
• Space-filling
• Org
• Conditional
• Macro
• Section
• Type
• Symbol Binding
• Instruction Set Selection Directives
String definition directives
• Allocates one or more bytes of memory in the current
section, and defines the initial contents of the memory
from a string literal.
• .ascii "string"
▪ .ascii does not append a null byte to the end of the string.
• .asciz "string"
• .string "string"
▪ .asciz and .string append a null byte to the end of the string.
String definition directives
• Examples
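A sketch of the three string directives side by side. The labels are hypothetical, and the byte counts follow from whether a null byte is appended.

```asm
        .data
no_nul:
        .ascii  "abc"           @ 3 bytes: 'a' 'b' 'c'
with_nul:
        .asciz  "abc"           @ 4 bytes: 'a' 'b' 'c' 0
also_nul:
        .string "abc"           @ same result as .asciz
two_parts:
        .ascii  "Hello, "       @ no terminator, so the next string continues it
        .asciz  "world\n"       @ escape sequences such as \n are accepted
```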
Data definition directives
• These directives allocate memory in the current section, and
define the initial contents of that memory.
• .byte expr[, expr]…
• .hword expr[, expr]…
• .word expr[, expr]…
• .quad expr[, expr]…
• .octa expr[, expr]…
▪ If multiple arguments are specified, multiple memory locations of
the specified size are allocated and initialized to the provided values
in order.
Data definition directives
• Examples
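A sketch of the data definition directives with one or more arguments each. The labels are hypothetical.

```asm
        .data
bytes:  .byte   1, 2, 0xff              @ three 1-byte values
halfs:  .hword  0x1234, 0xabcd          @ two 2-byte values
words:  .word   -2, 0xdeadbeef          @ two 4-byte values
big:    .quad   0x1122334455667788      @ one 8-byte value
ptr:    .word   words                   @ the address of a symbol as a 4-byte value
```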
Alignment Directives
• The alignment directives align the current location in the file to a
specified boundary.
• .balign num_bytes [, fill_value]
• .p2align exponent [, fill_value]
• .align exponent [, fill_value]
▪ num_bytes
– This parameter specifies the alignment boundary, in bytes.
– The number must be a power of 2.
▪ exponent
– This parameter specifies the alignment boundary as an exponent.
– The actual alignment boundary is 2^exponent.
▪ fill_value
– The value to fill any inserted padding bytes with. This value is optional.
Alignment Directives
• Examples
Ensuring that the entry points to
functions are on 16-byte boundaries,
to better utilize caches
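A sketch of the case described above, plus one data alignment with an explicit fill value. The names func_a, func_b, flag, and value are hypothetical.

```asm
        .text
        .balign 16              @ align to a 16-byte boundary
func_a:
        bx      lr

        .p2align 4              @ same boundary, given as an exponent: 2^4 = 16
func_b:
        bx      lr

        .data
flag:   .byte   1
        .balign 4, 0xff         @ pad with 0xff up to the next 4-byte boundary
value:  .word   42
```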
Space-filling Directives
• .space count [, value]
▪ The .space directive emits count bytes of data, each of which has
value value.
• .fill count [, size [, value]]
▪ The .fill directive emits count data values, each with length size
bytes and value value.
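A sketch of both directives. The labels are hypothetical. When value is omitted, the emitted bytes are zero.

```asm
        .data
buf:    .space  16              @ 16 bytes, each 0
pad:    .space  4, 0xff         @ 4 bytes, each 0xff
tbl:    .fill   8, 4, 0x1       @ 8 values of 4 bytes each, every one equal to 1
```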
Org Directives
• The .org directive advances the location counter in the current
section to new_location.
• .org new_location [, fill_value]
▪ new_location
– must be one of:
– An absolute integer expression, in which case it is treated as the
number of bytes from the start of the section.
– An expression which evaluates to a location in the current section.
This could use a symbol in the current section, or the current
location ('.').
▪ fill_value
– This is an optional 1-byte value.
Org Directives
• Operation
▪ The .org directive can only move the location counter forward,
not backward.
▪ By default, the .org directive inserts zero bytes in any locations
that it skips over.
▪ This can be overridden using the optional fill_value argument,
which sets the 1-byte value that will be repeated in each skipped
location.
Org Directives
• Examples
b: backward
f: forward
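A sketch of .org used with a numeric local label, where 1b means the nearest label 1 looking backward and 1f would mean the nearest label 1 looking forward. It pads each table entry to a fixed size of 32 bytes. The names entry0, entry1, and handler are hypothetical.

```asm
        .text
entry0:
1:
        mov     r0, #0
        b       handler
        .org    1b + 0x20       @ advance to 32 bytes past the label; gap is zeros
entry1:
1:
        mov     r0, #1
        b       handler
        .org    1b + 0x20, 0xff @ same, but skipped bytes are filled with 0xff
handler:
        bx      lr
```

Because .org can only move forward, an entry longer than 0x20 bytes would be reported as an error instead of being silently truncated.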
Conditional Assembly Directives
• These directives allow you to conditionally assemble
sequences of instructions and directives.
• Syntax
▪ All directives are evaluated by the assembler, at assembly time.
– Conditions are not checked at run time.
▪ Modifiers decide how to check conditions
– Examples
Conditional Assembly Directives
• Examples
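A sketch using .if/.else/.endif and the .ifdef modifier form. DEBUG, VERBOSE, and get_level are hypothetical names. Only the instructions on the selected branches end up in the object file.

```asm
        .equ    DEBUG, 1                @ hypothetical assembly-time switch
        .text
get_level:
        .if     DEBUG
        mov     r0, #2                  @ assembled only when DEBUG is non-zero
        .else
        mov     r0, #0                  @ assembled only when DEBUG is zero
        .endif
        .ifdef  VERBOSE                 @ true only if VERBOSE has been defined
        add     r0, r0, #1              @ skipped here, VERBOSE is never defined
        .endif
        bx      lr
```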
Macro Directives
• Syntax
▪ macro_name
– The name of the macro.
▪ parameter_name
– Inside the body of a macro, the parameters can be referred to by
their name, prefixed with \. When the macro is instantiated,
parameter references will be expanded to the value of the
argument.
– Parameters can be qualified in these ways:
Macro Directives
• Operation
▪ The .macro directive defines a new macro with name
macro_name. Once a macro is defined, it can be instantiated by
using it like an instruction mnemonic:
• Examples
pascal-style strings are
prefixed by a length byte, and
have no null terminator
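A sketch of a macro that emits Pascal-style strings as described above. The macro name pascal_string and the labels hello and goodbye are assumptions for illustration. The length byte is computed from two numeric local labels placed around the string.

```asm
        @ One parameter, referred to as \str inside the body.
        .macro  pascal_string, str
        .byte   2f - 1f                 @ length byte: distance between the labels
1:
        .ascii  "\str"                  @ the characters, with no null terminator
2:
        .endm

        .data
hello:
        pascal_string "Hello"           @ emits 5, 'H', 'e', 'l', 'l', 'o'
goodbye:
        pascal_string "Goodbye"         @ emits 7 followed by the seven characters
```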
Section Directives
• The section directives instruct the assembler to change the
ELF section that code and data are emitted into.
• .section name [, "flags" [, %type [, entry_size] [, group_name [,
linkage]] [, link_order_symbol] [, unique, unique_id] ]]
• .text
• .data
• .rodata
• .bss
▪ .section directive switches the current target section to the one
described by its arguments.
▪ The rest of the directives (.text, .data, .rodata, .bss) switch to one of
the built-in sections.
Section Directives
• Examples
▪ Splitting code and data into the built-in .text and .data sections
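A sketch of that split, with an uninitialized buffer in .bss and one user-named section added for comparison. The names increment, counter, scratch, .mydata, and tuning are hypothetical.

```asm
        .text                           @ code goes into the built-in .text section
increment:
        ldr     r0, =counter
        ldr     r1, [r0]
        add     r1, r1, #1
        str     r1, [r0]
        bx      lr

        .data                           @ initialized data goes into .data
counter:
        .word   0

        .bss                            @ zero-initialized data goes into .bss
scratch:
        .space  64

        .section .mydata, "aw", %progbits   @ a custom allocatable, writable section
tuning:
        .word   7
```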
Type Directive
• The default type of a symbol in an object file is the
assembly-time type of the symbol.
▪ Symbolic constants and undefined symbols → %notype
▪ Labels and common symbols → %object
▪ Function names → %function
• The .type directive explicitly sets the type of a symbol.
• .type symbol, %type
▪ %type
– The following types are accepted:
– %function
• a function name
– %object
• a data object
– %tls_object
• a thread-local data object.
Type Directive
• Examples
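A sketch that sets the type of one function and one data object explicitly. The names get_answer and answer are hypothetical.

```asm
        .text
        .type   get_answer, %function   @ mark the label as a function
get_answer:
        ldr     r0, =answer
        ldr     r0, [r0]
        bx      lr

        .data
        .type   answer, %object         @ mark the label as a data object
answer:
        .word   42
```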
Symbol Binding Directives
• These directives modify the ELF binding of one or more
symbols.
• .global symbol[, symbol]…
▪ These symbols are visible to all object files being linked, so a
definition in one object file can satisfy a reference in another.
• .local symbol[, symbol]…
▪ These symbols are not visible outside the object file they are defined
or referenced in, so multiple object files can use the same symbol
names without interfering with each other.
• .weak symbol[, symbol]…
▪ These symbols behave similarly to global symbols, with these
differences:
– If a reference to a symbol with weak binding is not satisfied (no
definition of the symbol is found), this is not an error.
– If multiple definitions of a weak symbol are present, this is not an error.
If a definition of the symbol with strong binding is present, that
definition satisfies all references to the symbol; otherwise one of the
weak definitions is chosen.
Symbol Binding Directives
• Operation
▪ The symbol binding directive can be at any point in the assembly
file, before or after any references or definitions of the symbol.
▪ If the binding of a symbol is not specified using one of these
directives, the default binding is:
– If a symbol is not defined in the assembly file, it has global visibility
by default.
– If a symbol is defined in the assembly file, it has local visibility by
default.
Symbol Binding Directives
• Examples
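A sketch showing the three bindings and the default for an undefined symbol. All names (main, helper, default_handler, external_init) are hypothetical.

```asm
        .global main                    @ visible to every object file being linked
        .weak   default_handler         @ a strong definition elsewhere replaces it
        .local  helper                  @ private to this object file

        .text
main:
        push    {r4, lr}
        bl      helper
        bl      external_init           @ not defined here, so global by default
        pop     {r4, pc}
helper:
        bx      lr
default_handler:
        bx      lr
```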
Instruction Set Selection Directives
• .arm
▪ The .arm directive instructs the assembler to interpret
subsequent instructions as A32 instructions.
• .thumb
▪ The .thumb directive instructs the assembler to interpret
subsequent instructions as T32 instructions, using the UAL
syntax.
• .thumb_func
▪ This directive specifies that the following symbol is the name of a
Thumb encoded function.
• .syntax [unified | divided]
▪ This directive sets the Instruction Set Syntax.
▪ divided (the default, for compatibility with legacy code)
– ARM and Thumb instructions are used separately
▪ unified
– Enables UAL (Unified Assembly Language) syntax
– Necessary for Thumb2 instructions
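A sketch that combines these directives in one file, assuming a target that supports both the A32 and T32 instruction sets. The names arm_add and thumb_add are hypothetical.

```asm
        .syntax unified                 @ use UAL for everything that follows
        .text

        .arm                            @ following instructions are A32
arm_add:
        add     r0, r0, r1
        bx      lr

        .thumb                          @ following instructions are T32
        .thumb_func                     @ the next symbol names a Thumb function
thumb_add:
        adds    r0, r0, r1
        bx      lr
```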