The x86 Microprocessor
Chapter 1d
1.4 INTRODUCTION TO PROGRAM SEGMENTS
• A typical Assembly language program consists of
  at least three segments:
  – A code segment - which contains the Assembly
    language instructions that perform the tasks that the
    program was designed to accomplish.
  – A data segment - used to store information (data) to
    be processed by the instructions in the code segment.
  – A stack segment - used by the CPU to store information
    temporarily.
origin and definition of the segment
• A segment is an area of memory that includes up
  to 64K bytes, and begins on an address evenly
  divisible by 16 (such an address ends in 0H)
  – The 8085 addressed a maximum of 64K bytes of physical memory,
    since it had only 16 pins for address lines (2^16 = 64K).
     • This limitation was carried into the 8088/86 design for compatibility.
• In the 8085 there were 64K bytes of memory for all code,
  data, and stack information.
  – In the 8088/86 there can be up to 64K bytes in each category:
    the code segment, the data segment, and the stack segment.
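The two numeric rules above (16 address lines give 64K bytes, and a segment starts on an address divisible by 16) can be checked with a short sketch. The function names are hypothetical and chosen only for illustration.

```python
def addressable_bytes(address_lines):
    """Number of distinct byte addresses reachable with the given address lines."""
    return 2 ** address_lines


def is_segment_start(address):
    """A segment must begin on an address evenly divisible by 16 (last hex digit 0)."""
    return address % 16 == 0


print(addressable_bytes(16) // 1024, "K bytes with 16 address lines")
print(addressable_bytes(20) // 1024, "K bytes with 20 address lines")
print(is_segment_start(0x25000), is_segment_start(0x25001))
```

An address that passes `is_segment_start` always ends in 0 when written in hex, which is why a 16-bit segment value plus one implied trailing hex digit is enough to name it.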
• The 80286 and above operate in either real or
  protected mode.
• Real mode operation allows addressing of only the
  first 1M byte of memory space, even on a Pentium 4
  or Core 2 microprocessor.
  – The first 1M byte of memory is called the real memory,
    conventional memory, or DOS memory system,
    organized as:
     • Individual bytes of data stored at consecutive addresses
       over the range 00000H to FFFFFH.
     • So the memory is organized as 8-bit bytes, not as 16-bit
       words.
logical address and physical address
• In literature concerning 8086, there are three types
  of addresses mentioned frequently:
  – The physical address - the 20-bit address actually on
    the address pins of the 8086 processor, decoded by the
    memory interfacing circuitry.
     • This address can range from 00000H to FFFFFH.
     • It is an actual physical location in RAM or ROM within the
       1 MB memory range.
  – The offset address - a location in a 64K-byte segment
    range, which can range from 0000H to FFFFH.
  – The logical address - which consists of a segment value
    and an offset address.
code segment
• To execute a program, 8086 fetches the instructions
  (opcodes and operands) from the code segment.
  – The logical address of an instruction always consists of a
    CS (code segment) and an IP (instruction pointer), shown
    in CS:IP format.
  – The physical address for the location of the instruction
    is generated by shifting the CS value left one hex digit
    (4 bits, which multiplies it by 16), then adding the IP.
     • IP contains the offset address.
• The resulting 20-bit address is called the physical
  address since it is put on the external physical
  address bus pins.
• Assume CS = 2500H and IP = 95F3H:
  – The offset address, contained in IP, is 95F3H.
  – The logical address is CS:IP, or 2500:95F3H.
  – The physical address is 25000H + 95F3H = 2E5F3H.
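The shift-and-add rule can be tried out with a minimal sketch. The function name `physical_address` is hypothetical, and the 20-bit mask is an assumption that models the 20 address pins of the 8086.

```python
def physical_address(segment, offset):
    """Return the 20-bit real-mode physical address for segment:offset."""
    # Shifting left one hex digit is a 4-bit shift (multiply by 16).
    # The mask keeps the result within 20 address bits (assumed here
    # to model the 20 address pins).
    return ((segment << 4) + offset) & 0xFFFFF


cs, ip = 0x2500, 0x95F3
print(f"{cs:04X}:{ip:04X} -> {physical_address(cs, ip):05X}H")
```

With the values from this example the sketch prints `2500:95F3 -> 2E5F3H`.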
• Calculate the physical address of an instruction:
  – The microprocessor will retrieve the instruction from
    memory locations starting at 2E5F3H.
  – Since IP can have a minimum value of 0000H and a
    maximum of FFFFH, the logical address range in this
    example is 2500:0000 to 2500:FFFF.
  – This means that the lowest memory location of the code
    segment above will be 25000H (25000 + 0000) and the
    highest memory location will be 34FFFH (25000 + FFFF).
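The lowest and highest locations of a segment follow directly from the segment value. The sketch below uses a hypothetical helper, `segment_range`, and does not model what happens when the upper bound would pass FFFFFH.

```python
def segment_range(segment):
    """Return (lowest, highest) physical addresses covered by a 64K segment."""
    base = segment << 4          # segment value shifted left one hex digit
    return base, base + 0xFFFF   # offsets run from 0000H to FFFFH


low, high = segment_range(0x2500)
print(f"CS = 2500H covers {low:05X}H to {high:05X}H")
```

For CS = 2500H this gives 25000H to 34FFFH, matching the figures above.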
• Consider a memory segment beginning at 10000H and ending at
  location 1FFFFH.
  – A real mode segment of memory is 64K bytes in length.
  – An offset address, called a displacement, of F000H
    selects location 1F000H in the memory.
  – The segment and offset address is sometimes written as
    1000:F000.
• The lowest address byte in a segment has an offset of 0000H and
  the highest address byte in a segment has an offset of FFFFH.
• There are many possible combinations of segment base address
  and offset that yield the same physical address (how?)
   – Example: 002B:0013 = 002C3H
              002C:0003 = 002C3H
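One way to see how this happens: segments start every 16 bytes but are 64K bytes long, so they overlap heavily. The sketch below lists every segment:offset pair that reaches a given physical address. The function name `aliases` is hypothetical, and address wraparound above FFFFFH is ignored as a simplifying assumption.

```python
def aliases(physical):
    """List all (segment, offset) pairs whose shift-and-add result is `physical`."""
    pairs = []
    for segment in range(0x10000):
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:    # the offset must fit in 16 bits
            pairs.append((segment, offset))
    return pairs


pairs = aliases(0x002C3)
print(len(pairs), "logical addresses map to 002C3H")
for segment, offset in pairs[-2:]:
    print(f"{segment:04X}:{offset:04X}")
```

The last two pairs printed are 002B:0013 and 002C:0003, the two shown in the example above.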
• What happens if the desired instructions are located
  beyond the lowest and highest locations of the code segment?
  – The value of CS must be changed to access those
    instructions.
• The 8086 can access a 2-byte word in one load instruction.
• The less significant byte of data is assumed to be stored in the lower
  address byte.
• The more significant byte of data is assumed to be stored in the higher
  address byte.
• Consider the following example:

    Address    Memory (binary)    Memory (hexadecimal)
    00725H     0101 0101          55
    00724H     0000 0010          02

  – The word stored at memory location 00724H is 5502H.
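Reading this word can be sketched by modeling memory as a dictionary of byte values. The dictionary and the helper `read_word` are illustrative assumptions, not part of any 8086 tool.

```python
# Byte contents taken from the example: address -> byte value.
memory = {0x00724: 0x02, 0x00725: 0x55}


def read_word(mem, address):
    """Combine two consecutive bytes into a 16-bit word, low byte first."""
    low = mem[address]           # less significant byte at the lower address
    high = mem[address + 1]      # more significant byte at the higher address
    return (high << 8) | low


print(f"Word at 00724H = {read_word(memory, 0x00724):04X}H")
```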
• To permit efficient use of memory, words can be stored at:
  – An even boundary (aligned word), meaning that the less significant
    byte is stored at an even address.
  – An odd boundary (misaligned word), meaning that the less significant
    byte is stored at an odd address.
• What is the data word formed by the byte 0AH at address 0072AH and
  the byte 7DH at address 0072BH? Express the result in hexadecimal
  form. Is it stored at an even- or odd-address word boundary? Is it
  an aligned or misaligned word of data?
• Solution:
  – The most significant byte, at 0072BH, is 7DH.
  – The least significant byte, at 0072AH, is 0AH.
  – Together the two bytes give the word 7D0AH.
  – The least significant byte is stored at address 0072AH, which in
    binary is 0000 0000 0111 0010 1010. The last bit is 0, so the word
    is stored at an even boundary.
  – Therefore it is an aligned word.
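The alignment test in the solution amounts to checking the lowest bit of the address of the less significant byte. A minimal sketch, with the hypothetical helper name `is_aligned_word`:

```python
def is_aligned_word(address):
    """A word is aligned when its less significant byte sits at an even address."""
    return address & 1 == 0      # lowest address bit is 0 for even addresses


low_byte_address = 0x0072A
word = (0x7D << 8) | 0x0A        # high byte 7DH, low byte 0AH

print(f"Word = {word:04X}H")
print(f"Address in binary: {low_byte_address:020b}")
print("aligned" if is_aligned_word(low_byte_address) else "misaligned")
```

The same word stored starting at 0072BH would be reported as misaligned.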
code segment logical/physical address
• In the next code segment, CS and IP hold the
  logical address of the instructions to be executed.
  – The following Assembly language instructions have been
    assembled (translated into machine code) and stored in
    memory.
  – The three columns show the logical address of CS:IP,
    the machine code stored at that address, and the
    corresponding Assembly language code.
  – The physical address is put on the address bus by the
    CPU to be decoded by the memory circuitry.
Instruction "MOV AL,57" has a machine code of B057.
B0 is the opcode and 57 is the operand.
The byte at address 1132:0100 contains B0, the opcode for moving
a value into register AL.
Address 1132:0101 contains the operand to be moved to AL.
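The opcode/operand split can be illustrated with a tiny decoder for this one instruction form. It relies on the x86 encoding in which opcodes B0H through B7H mean "move an immediate byte into an 8-bit register"; the function name and the output format are assumptions made for this sketch.

```python
# 8-bit registers in the order selected by opcodes B0H..B7H.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def decode_mov_imm8(code):
    """Decode a 2-byte 'MOV r8, imm8' instruction into DEBUG-style text."""
    opcode, operand = code[0], code[1]
    if not 0xB0 <= opcode <= 0xB7:
        raise ValueError("not a MOV r8, imm8 opcode")
    return f"MOV {REG8[opcode - 0xB0]},{operand:02X}"


machine_code = bytes.fromhex("B057")   # bytes at 1132:0100 and 1132:0101
print(decode_mov_imm8(machine_code))

# Physical addresses of the two bytes, using the shift-and-add rule.
for ip in (0x0100, 0x0101):
    print(f"1132:{ip:04X} -> {(0x1132 << 4) + ip:05X}H")
```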
data segment
• Assume a program to add 5 bytes of data, such as
  25H, 12H, 15H, 1FH, and 2BH.
  – One way to add them is as follows:
  – In the program above, the data & code are mixed together
    in the instructions.
     • If the data changes, the code must be searched for every
       place it is included, and the data retyped
     • From this arose the idea of an area of memory strictly for data
• In x86 microprocessors, the area of memory set
  aside for data is called the data segment.
   – The data segment uses register DS and an offset value.
   – DEBUG assumes that all numbers are in hex.
      • No "H" suffix is required.
   – MASM assumes that they are in decimal.
      • The "H" must be included for hex data.
• The next program demonstrates how data can
  be stored in the data segment and the program
  rewritten so that it can be used for any set of data.
• Assume data segment offset begins at 200H.
  – The data is placed in memory locations:
  – The program can be rewritten as follows:
• The offset address is enclosed in brackets, which
  indicate that the operand represents the address
  of the data and not the data itself.
  – If the brackets were not included, as in
    "MOV AL,0200", the CPU would attempt to move the value 200
    into AL instead of the contents of offset address 200.
• This program will run with any set of data.
• Changing the data has no effect on the code.
• If the data had to be stored at a different offset
  address the program would have to be rewritten
   – A way to solve this problem is to use a register to hold
     the offset address, and before each ADD, increment the
     register to access the next byte.
• 8088/86 allows only the use of registers BX, SI,
  and DI as offset registers for the data segment
   – The term pointer is often used for a register holding
     an offset address.
• In the following example, BX is used as a pointer:
• The INC instruction adds 1 to (increments) its
  operand.
  – "INC BX" achieves the same result as "ADD BX,1".
  – If the offset address where data is located is changed,
    only one instruction will need to be modified.
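The pointer idea can be simulated in a few lines. This is a sketch of the behavior, not the original assembly listing: the data segment is modeled as a byte array, the five data bytes and the offset 200H come from the earlier slides, and the variable names `al` and `bx` merely mirror the registers.

```python
DATA_OFFSET = 0x0200                     # only this value changes if the data moves
data_segment = bytearray(0x10000)        # one 64K-byte segment
data_segment[DATA_OFFSET:DATA_OFFSET + 5] = bytes([0x25, 0x12, 0x15, 0x1F, 0x2B])


def sum_bytes(segment, start, count):
    """Add `count` bytes starting at offset `start`, as an 8-bit register would."""
    al = 0                               # accumulator, like AL cleared to 0
    bx = start                           # pointer register holding the offset
    for _ in range(count):
        al = (al + segment[bx]) & 0xFF   # like ADD AL,[BX], kept to 8 bits
        bx = (bx + 1) & 0xFFFF           # like INC BX
    return al


print(f"AL = {sum_bytes(data_segment, DATA_OFFSET, 5):02X}H")
```

For the five bytes above the result is 96H. Moving the data to another offset only requires changing `DATA_OFFSET`, which mirrors the point that only one instruction needs to be modified.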
data segment logical/physical address
• The physical address for data is calculated using
  the same rules as for the code segment.
  – The physical address of data is calculated by shifting DS
    left one hex digit and adding the offset value, as shown
    in Examples 1-2, 1-3, and 1-4.
little endian convention
• Previous examples used 8-bit or 1-byte data.
  – What happens when 16-bit data is used?
• The low byte goes to the low memory location and
  the high byte goes to the high memory address.
  – For example, when the 16-bit value 35F3H is stored at DS:1500:
     • Memory location DS:1500 contains F3H.
     • Memory location DS:1501 contains 35H.
  – This convention is called little endian, as opposed to big endian.
     • The names come from a Gulliver’s Travels story about how an egg
       should be opened: from the little end, or from the big end.
• In the big endian method, the high byte goes to the
  low address.
  – In the little endian method, the high byte goes to the
    high address and the low byte to the low address.
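The two byte orders can be compared using Python's built-in `int.to_bytes`. The value 35F3H is inferred from the bytes F3H and 35H given for DS:1500 and DS:1501; treat it as an illustrative assumption.

```python
value = 0x35F3

little = value.to_bytes(2, "little")   # low byte first, as on the 8086
big = value.to_bytes(2, "big")         # high byte first

# Index 0 is the lower memory address, index 1 the higher one.
print("little endian:", little.hex(" ").upper())
print("big endian:   ", big.hex(" ").upper())

# Reading the bytes back with the matching order recovers the value.
print(f"{int.from_bytes(little, 'little'):04X}H")
```

In little endian order the bytes are F3 35, so F3H lands at offset 1500 and 35H at offset 1501.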
• All Intel microprocessors and many microcontrollers
  use the little endian convention.
  – Freescale (formerly Motorola) microprocessors, along
    with some other microcontrollers, use big endian.
extra segment (ES)
• ES is a segment register used as an extra data
  segment.
  – In many normal programs this segment is not used.
  – Use is essential for string operations.
memory map of the IBM PC
• The 20-bit address of the 8088/86 allows 1 MB (1024K bytes) of
  memory space with the address range 00000H–FFFFFH.
  – During the design phase of the first IBM PC, engineers had to
    decide on the allocation of the 1-megabyte memory space to
    various sections of the PC.
     • This memory allocation is called a memory map.
Figure 1-3 Memory Allocation in the PC
• Of this 1 megabyte, 640K bytes from addresses
  00000H–9FFFFH were set aside for RAM.
• 128K bytes from A0000H–BFFFFH were allocated
  for video memory.
• The remaining 256K bytes from C0000H–FFFFFH
  were set aside for ROM.
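The three ranges can be written as a small lookup table to confirm the sizes and to classify an address. The table layout and the names `MEMORY_MAP`, `size_kb`, and `region_of` are assumptions made for this sketch.

```python
# (region name, first address, last address), taken from the ranges above.
MEMORY_MAP = [
    ("RAM", 0x00000, 0x9FFFF),
    ("Video memory", 0xA0000, 0xBFFFF),
    ("ROM", 0xC0000, 0xFFFFF),
]


def size_kb(start, end):
    """Size of an inclusive address range in K bytes."""
    return (end - start + 1) // 1024


def region_of(address):
    """Return the name of the region containing a 20-bit physical address."""
    for name, start, end in MEMORY_MAP:
        if start <= address <= end:
            return name
    raise ValueError("address is outside the 1 MB range")


for name, start, end in MEMORY_MAP:
    print(f"{name}: {start:05X}H to {end:05X}H, {size_kb(start, end)}K bytes")
print(region_of(0xB8000))
```

The three sizes come out as 640K, 128K, and 256K bytes, which together make up the full 1024K bytes.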
more about RAM
• In the early 80s, most PCs came with 64K to 256K
  bytes of RAM, more than adequate at the time
  – Users had to buy memory to expand up to 640K.
• Managing RAM is left to Windows because...
  – The amount of memory used by Windows varies.
  – Different computers have different amounts of RAM.
  – Memory needs of application packages vary.
• For this reason, we do not assign any values for the
  CS, DS, and SS registers.
  – Such an assignment means specifying an exact physical
    address in the range 00000–9FFFFH, and this is beyond
    the knowledge of the user.
video RAM
• From A0000H to BFFFFH is set aside for video
  – The amount used and the location vary depending
    on the video board installed on the PC
more about ROM
• C0000H to FFFFFH is set aside for ROM.
  – Not all the memory in this range is used by the PC's ROM.
• 64K bytes from location F0000H–FFFFFH are
  used by BIOS (basic input/output system) ROM.
  – Some of the remaining space is used by various adapter
    cards (such as the network card), and the rest is free.
• The 640K bytes from 00000 to 9FFFFH is referred
  to as conventional memory.
  – The 384K bytes from A0000H to FFFFFH are called
    the UMB (upper memory block).
function of BIOS ROM
• There must be some permanent (nonvolatile)
  memory to hold the programs telling the CPU
  what to do when the power is turned on
  – This collection of programs is referred to as BIOS.
• BIOS stands for basic input-output system.
  – It contains programs to test RAM and other
    components connected to the CPU.
  – It also contains programs that allow Windows to
    communicate with peripheral devices.
  – The BIOS tests devices connected to the PC when
    the computer is turned on and reports any errors.