Microprocessor Systems
Overview
We program in C for convenience
No processor executes C directly, only machine code
So we compile C code into assembly code, a human-readable representation of machine code
We need to know what the assembly code implementing the C code looks like
    to use the processor efficiently
    to analyze the code with precision
    to find performance and other problems
An overview of what C gets compiled into
    C start-up module, subroutine calls, stacks, data classes and layout, pointers, control flow, etc.
Programmer’s World: The Land of Chocolate!
As many functions and variables as you want!
All the memory you could ask for!
So many data types!
    Integers, floating point, char, …
So many data structures!
    Arrays, lists, trees, sets, dictionaries
So many control structures!
    Subroutines, if/then/else, loops, etc.
Iterators! Polymorphism!
Processor’s World
Data types
    Integers
    More if you’re lucky!
Instructions
    Math: +, -, *
    Logic: and, or
    Shift, rotate
    Move, swap
    Compare
    Jump, branch
[Figure: grid of raw memory contents (numbers and characters such as "hello") as the processor sees them]
Software Development Tools
Program build toolchain
Programmer
Debugger
Program Translation Stages
Compiler (armcc)
    Parser
        reads in C code,
        checks for syntax errors,
        forms intermediate code (tree representation)
    High-Level Optimizer
        Modifies intermediate code (processor-independent)
    Code Generator
        Creates assembly code from the intermediate code
        Allocates variable uses to registers
    Low-Level Optimizer
        Modifies assembly code (parts are processor-specific)
Assembler (armasm)
    Creates object code (machine code)
Linker/Loader (armlink)
    Creates executable image from one or more object files
Examining Assembly Code before Debugger
Compiler can generate assembly code listing for
reference
Select in project options
Examining Disassembled Program in Debugger
View->Disassembly Window
A Warning About Code Optimizations
Compiler and rest of tool-chain try to optimize code:
Simplifying operations
Removing “dead” code
Using registers
These optimizations often get in the way of understanding what the code does
    Fundamental trade-off: fast or comprehensible code?
    Compilers typically offer a range of optimization levels (e.g. Level 0 to Level 3)
Code examples here may use the “volatile” type qualifier to reduce compiler optimizations and improve readability
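As a minimal sketch of the point above, the two functions below compute the same sum. The names sum_plain and sum_volatile are hypothetical and not part of the original examples. With optimization enabled, a compiler is free to keep the plain accumulator in a register or simplify the loop, while the volatile accumulator must be read and written in memory on every access, so each C statement tends to stay visible in the listing.

```c
#include <assert.h>

/* Hypothetical example: both functions compute 0 + 1 + ... + (n-1). */

/* The compiler may keep 'total' in a register, or simplify the whole
   loop at higher optimization levels. */
unsigned int sum_plain(unsigned int n) {
    unsigned int total = 0;
    for (unsigned int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* 'volatile' obliges the compiler to perform every read and write of
   'total', so each statement maps to explicit loads and stores. */
unsigned int sum_volatile(unsigned int n) {
    volatile unsigned int total = 0;
    for (unsigned int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}
```

Both functions return the same value; only the generated code differs, and how much it differs depends on the compiler and optimization level.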
Application Binary Interface (ABI)
Defines rules which allow separately developed
functions to work together
ARM Architecture Procedure Call Standard (AAPCS)
Which registers must be saved and restored
How to call procedures
How to return from procedures
etc.
C Library ABI (CLIBABI)
C Library functions
Run-Time ABI (RTABI)
Run-time helper functions: 32/32 integer division,
memory copying, floating-point operations, data type
conversions, etc.
USING REGISTERS
AAPCS Register Use Conventions
Make it easier to create modular, isolated and
integrated code
Argument/Scratch registers are not expected to be
preserved upon returning from a called subroutine
r0-r3
Preserved (“variable”) registers are expected to have
their original values upon returning from a called
subroutine
r4-r8, r10-r11
AAPCS Core Register Use
Preserved registers (r4-r8, r10-r11)
    Must be saved and restored by the callee procedure if it will modify them.
    The calling subroutine expects these to retain their values.
Argument/scratch registers (r0-r3)
    Don’t need to be saved. May be used for arguments, results, or temporary values.
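A small sketch of how these conventions look from the C side. The function names add4 and add5 are hypothetical. The register comments reflect the AAPCS rule that the first four word-sized arguments are passed in r0-r3 and the result is returned in r0; the code a particular compiler actually emits may differ with optimization settings.

```c
#include <assert.h>

/* Hypothetical function: under the AAPCS the first four word-sized
   arguments arrive in r0-r3 and the result is returned in r0. */
int add4(int a /* r0 */, int b /* r1 */, int c /* r2 */, int d /* r3 */) {
    return a + b + c + d;   /* result goes back in r0 */
}

/* A fifth word-sized argument does not fit in r0-r3, so the caller
   passes it on the stack. */
int add5(int a, int b, int c, int d, int e /* on the stack */) {
    /* r0-r3 are scratch registers: add4 may overwrite them freely. */
    return add4(a, b, c, d) + e;
}
```

Because add5 calls another function, a compiler would typically save lr and any preserved registers it uses before the call and restore them afterwards.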
MEMORY REQUIREMENTS
What Memory Does a Program Need?
Five possible types
    Code
    Read-only static data
    Writable static data
        Initialized
        Zero-initialized
        Uninitialized
    Heap
    Stack
Example program
    int a, b;
    const char c=123;
    int d=31;
    void main(void) {
        int e;
        char f[32];
        e = d + 7;
        a = e + 29999;
        strcpy(f,"Hello!");
    }
What goes where?
    Code is obvious
    And the others?
What Memory Does a Program Need?
Can the information change?
    If No: put it in read-only, nonvolatile memory
        Instructions
        Constant strings
        Constant operands
        Initialization values
    If Yes: put it in read/write memory
        Variables
        Intermediate computations
        Return address
        Other housekeeping data
Example program
    int a, b;
    const char c=123;
    int d=31;
    void main(void) {
        int e;
        char f[32];
        e = d + 7;
        a = e + 29999;
        strcpy(f,"Hello!");
    }
What Memory Does a Program Need?
How long does the data need to exist? Reuse memory if possible.
    Statically allocated
        Exists from program start to end
        Each variable has its own fixed location
        Space is not reused
    Automatically allocated
        Exists from function start to end
        Space can be reused
    Dynamically allocated
        Exists from explicit allocation to explicit deallocation
        Space can be reused
Example program
    int a, b;
    const char c=123;
    int d=31;
    void main(void) {
        int e;
        char f[32];
        e = d + 7;
        a = e + 29999;
        strcpy(f,"Hello!");
    }
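The three allocation classes can be seen side by side in a short sketch. The names call_count and sum_squares are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

int call_count;   /* statically allocated: exists from program start to end */

/* Hypothetical helper: returns 0*0 + 1*1 + ... + (n-1)*(n-1),
   or -1 if the heap allocation fails. */
int sum_squares(int n) {
    int total = 0;   /* automatically allocated: exists only during this call */
    int *squares;    /* automatic pointer to a dynamically allocated buffer */

    assert(n > 0);
    call_count++;

    squares = malloc((size_t)n * sizeof *squares);   /* explicit allocation */
    if (squares == NULL) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        squares[i] = i * i;
        total += squares[i];
    }
    free(squares);   /* explicit deallocation: the heap space can be reused */
    return total;
}
```

The stack space used by total and squares is released when the function returns, the heap buffer is released by free, and call_count keeps its fixed location for the whole run.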
Program Memory Use
RAM
    Zero-Initialized Data
    Initialized Data
    Stack
    Heap Data
Flash ROM
    Constant Data
    Initialization Data
    Startup and Runtime Library Code
    Program .text
Example program
    int a, b;
    const char c=123;
    int d=31;
    void main(void) {
        int e;
        char f[32];
        e = d + 7;
        a = e + 29999;
        strcpy(f,"Hello!");
    }
Activation Record
Activation records are located on the stack
    Calling a function creates an activation record
    Returning from a function deletes the activation record
Automatic variables and housekeeping information are stored in a function’s activation record
Not all fields (Local storage, Return address, Arguments) may be present for each activation record
Stack layout, from lower to higher address
    (Free stack space)
    Activation record for current function: Local storage (← Stack ptr), Return address, Arguments
    Activation record for caller function: Local storage, Return address, Arguments
    Activation record for caller’s caller function: Local storage, Return address, Arguments
    Activation record for caller’s caller’s caller function: Local storage, Return address, Arguments
Type and Class Qualifiers
Used to modify a variable’s declaration so the compiler treats it slightly differently
const
    Never written by program, can be put in ROM to save RAM
volatile
    Can be changed outside of normal program flow: Interrupt Service Routine (ISR), hardware register
    Compiler must be careful with optimizations
static
    Declared within function, retains value between function invocations
    Scope is limited to function
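A brief sketch of the three keywords in use. All names here (limit, event_flag, event_pending, bounded_count) are hypothetical; in a real system event_flag would be written by an ISR or stand for a hardware register.

```c
#include <assert.h>

const unsigned char limit = 3;      /* never written: may be placed in ROM */

/* Hypothetical flag that something outside normal program flow could set. */
volatile unsigned char event_flag;

/* Every call re-reads event_flag from memory because it is volatile. */
int event_pending(void) {
    return event_flag != 0;
}

/* Returns how many times it has been called, saturating at 'limit'. */
unsigned char bounded_count(void) {
    static unsigned char count = 0; /* retains its value between calls */
    if (count < limit) {
        count++;
    }
    return count;
}
```

Here count is only visible inside bounded_count, yet it lives in static memory rather than on the stack, which is why its value survives from one call to the next.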
Linker Map File
Contains extensive information on functions and
variables
Value, type, size, object
Cross references between sections
Memory map of image
Sizes of image components
Summary of memory requirements
C Run-Time Start-Up Module
After reset, processor must…
    Initialize hardware
        Peripherals, etc.
        Set up stack pointer
    Initialize C or C++ run-time environment
        Set up heap memory
        Initialize variables
RAM
    Zero-Initialized Data: a, b (filled with zeros)
    Initialized Data: d (copied from Initialization Data)
    Stack: e, f
    Heap Data
Flash ROM
    Initialization Data: 31
    Constant Data: c: 123, Hello!
    Startup and Runtime Library Code
    Code
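The "initialize variables" step can be sketched as a copy plus a fill. This is a host-side simulation under stated assumptions, not the actual start-up module: the arrays below are hypothetical stand-ins for the memory regions that a real linker would define, and startup_init_variables is an invented name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for linker-defined regions. */
const int init_values_in_flash[1] = { 31 };   /* initialization data in Flash */
int data_in_ram[1] = { -1 };                  /* initialized data (d), junk before start-up */
int bss_in_ram[2] = { -1, -1 };               /* zero-initialized data (a, b), junk before start-up */

/* Sketch of the variable initialization done before main() runs. */
void startup_init_variables(void) {
    /* Copy initial values from Flash into the initialized-data region. */
    memcpy(data_in_ram, init_values_in_flash, sizeof data_in_ram);
    /* Fill the zero-initialized region with zeros. */
    memset(bss_in_ram, 0, sizeof bss_in_ram);
}
```

After the call, the simulated d holds 31 and the simulated a and b hold 0, matching what the C program expects when main starts.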
ACCESSING DATA IN MEMORY
Accessing Data
What does it take to get at a variable in memory?
    Depends on location, which depends on storage type (static, automatic, dynamic)
Example code
    int siA;
    void static_auto_local() {
        int aiB;
        static int siC=3;
        int * apD;
        int aiE=4, aiF=5, aiG=6;
        siA = 2;
        aiB = siC + siA;
        apD = & aiB;
        (*apD)++;
        apD = &siC;
        (*apD) += 9;
        apD = &siA;
        apD = &aiE;
        apD = &aiF;
        apD = &aiG;
        (*apD)++;
        aiE+=7;
        *apD = aiE + aiF;
    }
Static Variables
A static variable can be located anywhere in the 32-bit memory space, so to access it, you need a 32-bit pointer
Can’t fit a 32-bit pointer into a 16-bit instruction (or a 32-bit instruction), so save the pointer separately from the instruction, but nearby so we can access it with a short PC-relative offset
Load the pointer into a register (r0)
Can now load variable’s value into a register (r1) from memory using that pointer in r0
Similarly can store a new value to the variable in memory
Memory layout
    Code: Load r0 with pointer to variable, Load r1 from [r0], Use value of variable
    Label: 32-bit pointer to Variable (stored near the code)
    Variable (anywhere in memory)
Static Variables
Key
    variable’s value
    variable’s address
    address of copy of variable’s address
Code
    Loads r2 with address of siA (from |L1.240|)
    Loads r2 with contents of siA (via pointer r2, with offset 0)
    Same for siC, with address at |L1.244|
Addresses of siA and siC are stored as literals to be loaded into pointers
Variables siC and siA are located in .data section with initial values
Listing
        AREA ||.text||, CODE, READONLY, ALIGN=2
    ;;;20 siA = 2;
    00000e 2102 MOVS r1,#2
    000010 4a37 LDR r2,|L1.240|
    000012 6011 STR r1,[r2,#0] ; siA
    ;;;21 aiB = siC + siA;
    000014 4937 LDR r1,|L1.244|
    000016 6809 LDR r1,[r1,#0] ; siC
    000018 6812 LDR r2,[r2,#0] ; siA
    00001a 1889 ADDS r1,r1,r2
    ...
    |L1.240|
        DCD ||siA||
    |L1.244|
        DCD ||siC||
        AREA ||.data||, DATA, ALIGN=2
    ||siA||
        DCD 0x00000000
    ||siC||
        DCD 0x00000003
Automatic Variables Stored on Stack
Automatic variables are stored in a function’s activation record (unless optimized and promoted to register)
Activation records are located on the stack
Calling a function creates an activation record, allocating space on stack
Returning from a function deletes the activation record, freeing up space on stack
Example
    int main(void) {
        auto vars
        a();
    }
    void a(void) {
        auto vars
        b();
    }
    void b(void) {
        auto vars
        c();
    }
    void c(void) {
        auto vars
        …
    }
Automatic Variables
Example
    int main(void) {
        auto vars
        a();
    }
    void a(void) {
        auto vars
        b();
    }
    void b(void) {
        auto vars
        c();
    }
    void c(void) {
        auto vars
        …
    }
Stack layout, from lower to higher address
    (Free stack space)
    Activation record for current function c: Local storage (← Stack pointer while executing c), Saved regs, Arguments (optional)
    Activation record for caller function b: Local storage (← Stack pointer while executing b), Saved regs, Arguments (optional)
    Activation record for caller’s caller function a: Local storage (← Stack pointer while executing a), Saved regs, Arguments (optional)
    Activation record for caller’s caller’s caller function main: Local storage (← Stack pointer while executing main), Saved regs, Arguments (optional)
Addressing Automatic Variables
Program must allocate space on stack for variables
Stack addressing uses an offset from the stack pointer: [sp, #offset]
    One byte is used for the offset, which is multiplied by four
    Possible offsets: 0, 4, 8, …, 1020 bytes
    Maximum range addressable this way is 1024 bytes
Stack slots
    Address     Contents
    SP
    SP+4
    SP+8
    SP+0xC
    SP+0x10
    SP+0x14
    SP+0x18
    SP+0x1C
    SP+0x20
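The offset rule above can be checked with a few lines of C. This is an illustrative sketch; sp_offset_to_imm8 is a hypothetical helper, not a tool or library function.

```c
#include <assert.h>

/* Hypothetical helper: returns the 8-bit immediate that would encode a
   byte offset for [sp, #offset], or -1 if the offset cannot be encoded
   in this form. */
int sp_offset_to_imm8(unsigned int offset_bytes) {
    if (offset_bytes % 4 != 0) {
        return -1;              /* offsets must be word multiples */
    }
    if (offset_bytes > 1020) {
        return -1;              /* 255 * 4 is the largest encodable offset */
    }
    return (int)(offset_bytes / 4);
}
```

For example, offset 8 maps to immediate 2, which agrees with the low byte of the encoding 9102 shown for STR r1,[sp,#8] in the listings of this deck.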
Example Code
int siA;
void static_auto_local() {
int aiB;
static int siC=3;
int * apD;
int aiE=4, aiF=5, aiG=6;
siA = 2;
aiB = siC + siA;
apD = & aiB;
(*apD)++;
apD = &siC;
(*apD) += 9;
apD = &siA;
apD = &aiE;
apD = &aiF;
apD = &aiG;
(*apD)++;
aiE+=7;
*apD = aiE + aiF;
}
Automatic Variables
Stack contents
    Address     Contents
    SP          aiG
    SP+4        aiF
    SP+8        aiE
    SP+0xC      aiB
    SP+0x10     r0
    SP+0x14     r1
    SP+0x18     r2
    SP+0x1C     r3
    SP+0x20     lr
Listing
    ;;;14 void static_auto_local( void ) {
    000000 b50f PUSH {r0-r3,lr}
    ;;;15 int aiB;
    ;;;16 static int siC=3;
    ;;;17 int * apD;
    ;;;18 int aiE=4, aiF=5, aiG=6;
    000002 2104 MOVS r1,#4
    000004 9102 STR r1,[sp,#8]       ; Initialize aiE
    000006 2105 MOVS r1,#5
    000008 9101 STR r1,[sp,#4]       ; Initialize aiF
    00000a 2106 MOVS r1,#6
    00000c 9100 STR r1,[sp,#0]       ; Initialize aiG
    …
    ;;;21 aiB = siC + siA;
    …
    00001c 9103 STR r1,[sp,#0xc]     ; Store value for aiB
USING POINTERS
Using Pointers to Automatic Variables
C Pointer: a variable which holds the data’s address
aiB is on stack at SP+0xc
    Compute r0 with variable’s address from stack pointer and offset (0xc)
    Load r1 with variable’s value from memory
    Operate on r1, save back to variable’s address
Listing
    ;;;22 apD = & aiB;
    00001e a803 ADD r0,sp,#0xc
    ;;;23 (*apD)++;
    000020 6801 LDR r1,[r0,#0]
    000022 1c49 ADDS r1,r1,#1
    000024 6001 STR r1,[r0,#0]
Using Pointers to Static Variables
Load r0 with variable’s address from address of copy of variable’s address
Load r1 with variable’s value from memory
Operate on r1, save back to variable’s address
Listing
    ;;;24 apD = &siC;
    000026 4833 LDR r0,|L1.244|
    ;;;25 (*apD) += 9;
    000028 6801 LDR r1,[r0,#0]
    00002a 3109 ADDS r1,r1,#9
    00002c 6001 STR r1,[r0,#0]
    |L1.244|
        DCD ||siC||
        AREA ||.data||, DATA, ALIGN=2
    ||siC||
        DCD 0x00000003
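To follow the pointer operations at the C level, here is a variant of the example function that can be run on a host. The struct auto_snapshot and the changed return type are hypothetical additions so that a caller can observe the automatic variables; the statements themselves follow the original example.

```c
#include <assert.h>

int siA;

/* Hypothetical addition: lets a caller observe the automatic variables. */
struct auto_snapshot { int aiB, aiE, aiF, aiG; };

struct auto_snapshot static_auto_local(void) {
    int aiB;
    static int siC = 3;
    int *apD;
    int aiE = 4, aiF = 5, aiG = 6;
    struct auto_snapshot snap;

    siA = 2;
    aiB = siC + siA;
    apD = &aiB;         /* automatic variable: address is sp plus an offset */
    (*apD)++;           /* load, add 1, store back through the pointer */
    apD = &siC;         /* static variable: address comes from a stored literal */
    (*apD) += 9;
    apD = &siA;
    apD = &aiE;
    apD = &aiF;
    apD = &aiG;
    (*apD)++;
    aiE += 7;
    *apD = aiE + aiF;

    snap.aiB = aiB;
    snap.aiE = aiE;
    snap.aiF = aiF;
    snap.aiG = aiG;
    return snap;
}
```

On the first call the snapshot holds aiB = 6, aiE = 11, aiF = 5 and aiG = 16. Because siC is static, it has grown to 12 by the second call, so aiB becomes 15 there.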
ARRAY ACCESS
Array Access
• What does it take to get at an array element in memory?
– Depends on how many dimensions
– Depends on element size and row width
– Depends on location, which depends on storage type (static, automatic, dynamic)
Example code
    unsigned char buff2[3];
    unsigned short int buff3[5][7];
    unsigned int arrays(unsigned char n, unsigned char j) {
        volatile unsigned int i;
        i = buff2[0] + buff2[n];
        i += buff3[n][j];
        return i;
    }
Accessing 1-D Array Elements
Need to calculate element address, that is sum of:
    array start address
    offset: index * element size
buff2 is array of unsigned characters
    Address      Contents
    buff2        buff2[0]
    buff2 + 1    buff2[1]
    buff2 + 2    buff2[2]
Listing
    ;;;74 unsigned int arrays(unsigned char n, unsigned char j) {
    00009e 4602 MOV r2,r0            ; Move n (argument) from r0 into r2
    ;;;75 volatile unsigned int i;
    ;;;76 i = buff2[0] + buff2[n];
    0000a0 4b1b LDR r3,|L1.272|      ; Load r3 with pointer to buff2
    0000a2 781b LDRB r3,[r3,#0]      ; Load (byte) r3 with first element of buff2
    0000a4 4c1a LDR r4,|L1.272|      ; Load r4 with pointer to buff2
    0000a6 5ca4 LDRB r4,[r4,r2]      ; Load (byte) r4 with element at address buff2+r2 (r2 holds argument n)
    0000a8 1918 ADDS r0,r3,r4        ; Add r3 and r4 to form sum
    |L1.272|
        DCD buff2
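The address calculation above can be written out explicitly in C. This is a sketch for illustration: element_address and sum_first_and_nth are hypothetical names, and the array contents are sample values not found in the original.

```c
#include <assert.h>
#include <stddef.h>

unsigned char buff2[3] = { 10, 20, 30 };   /* sample contents are hypothetical */

/* Hypothetical helper: element address = start address + index * element size. */
unsigned char *element_address(unsigned char *start, size_t index) {
    return (unsigned char *)((char *)start + index * sizeof(unsigned char));
}

/* Mirrors the statement i = buff2[0] + buff2[n] from the example. */
unsigned int sum_first_and_nth(unsigned char n) {
    assert(n < 3);   /* stay inside the array */
    return *element_address(buff2, 0) + *element_address(buff2, n);
}
```

Since the element size is one byte, the offset equals the index, which is why the listing can use n directly as the register offset in LDRB.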
Accessing 2-D Array Elements
short int buff3[5][7]
    var[rows][columns]
Sizes
    Element size: 2 bytes
    Row size: 7*2 bytes = 14 bytes (0xe)
Offset based on row index and column index
    row offset = row index * row size
    column offset = column index * element size
Memory layout
    Address               Contents
    buff3, buff3+1        buff3[0][0]    (Row 1)
    buff3+2, buff3+3      buff3[0][1]
    ...
    buff3+10, buff3+11    buff3[0][5]
    buff3+12, buff3+13    buff3[0][6]
    buff3+14, buff3+15    buff3[1][0]    (Row 2)
    buff3+16, buff3+17    buff3[1][1]
    buff3+18, buff3+19    buff3[1][2]
    ...
    buff3+68, buff3+69    buff3[4][6]
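The row and column offsets can be computed directly, as a check on the table above. This is a minimal sketch: buff3_byte_offset is a hypothetical helper, and it assumes a 2-byte unsigned short, as on the ARM targets discussed here.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 5
#define COLS 7

unsigned short int buff3[ROWS][COLS];

/* Hypothetical helper: byte offset of buff3[row][col] from the array start. */
size_t buff3_byte_offset(size_t row, size_t col) {
    size_t element_size = sizeof(unsigned short int);   /* assumed to be 2 bytes */
    size_t row_size = COLS * element_size;              /* 7 * 2 = 14 bytes */

    assert(row < ROWS && col < COLS);
    return row * row_size + col * element_size;         /* row offset + column offset */
}
```

With these sizes, buff3[1][0] sits 14 bytes from the start and buff3[4][6] sits 68 bytes from the start, matching the address table.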
Code to Access 2-D Array
Listing
    ;;;77 i += buff3[n][j];
    0000aa 230e MOVS r3,#0xe         ; Load r3 with row size
    0000ac 4353 MULS r3,r2,r3        ; Multiply by row number (n, in r2) to put row offset in r3
    0000ae 4c19 LDR r4,|L1.276|      ; Load r4 with address of buff3
    0000b0 191b ADDS r3,r3,r4        ; Add buff3 address to row offset in r3
    0000b2 004c LSLS r4,r1,#1        ; Shift column number (j is mapped to r1) left by one, which is multiplying by 2 (bytes/element)
    0000b4 5b1b LDRH r3,[r3,r4]      ; Load (halfword) r3 with element at address r3+r4 (buff3 + row offset + col. offset)
    0000b6 1818 ADDS r0,r3,r0        ; Add r3 into variable i (variable i is mapped to r0)
    |L1.276|
        DCD buff3