Input/Output Organization
Outline
• Introduction
• Accessing I/O devices
• An example I/O device
    Keyboard
• I/O data transfer
    Programmed I/O
    DMA
• Error detection and correction
    Parity encoding
    Error correction
    CRC
• External interface
    Serial transmission
    Parallel interface
• USB
    Motivation
    USB architecture
    USB transactions
• IEEE 1394
    Advantages
    Transactions
    Bus arbitration
    Configuration
                      Introduction
• I/O devices serve two main purposes
    To communicate with outside world
    To store data
• I/O controller acts as an interface between the
  system bus and I/O device
    Relieves the processor of low-level details
    Takes care of electrical interface
• I/O controllers have three types of registers
    Data
    Command
    Status
               Introduction (cont’d)
• To communicate with an I/O device, we need
   Access to various registers (data, status,…)
     » This access depends on I/O mapping
         – Two basic ways
              Memory-mapped I/O
              Isolated I/O
   A protocol to communicate (to send data, …)
     » Three types
         – Programmed I/O
         – Direct memory access (DMA)
         – Interrupt-driven I/O
              Accessing I/O Devices
• I/O address mapping
   Memory-mapped I/O
     » Reading and writing are similar to memory read/write
     » Uses same memory read and write signals
     » Most processors use this I/O mapping
   Isolated I/O
     » Separate I/O address space
     » Separate I/O read and write signals are needed
     » Pentium supports isolated I/O
         – 64 KB address space
             Can be any combination of 8-, 16- and 32-bit I/O
                ports
         – Also supports memory-mapped I/O
         Accessing I/O Devices (cont’d)
• Accessing I/O ports in Pentium
    Register I/O instructions
      in   accumulator, port8 ; direct format
         – Useful to access first 256 ports
      in   accumulator,DX                  ; indirect format
         – DX gives the port address
    Block I/O instructions
      » ins and outs
         – Both take no operands, as in string instructions
      » ins: port address in DX, memory address in ES:(E)DI
      » outs: port address in DX, memory address in DS:(E)SI
      » We can use rep prefix for block transfer of data
             An Example I/O Device
• Keyboard
   Keyboard controller scans and reports
         – Key depressions and releases
     » Supplies key identity as a scan code
         – Scan code is like a sequence number of the key
             Key’s scan code depends on its position on the
              keyboard
             No relation to the ASCII value of the key
   Interfaced through an 8-bit parallel I/O port
     » Originally supported by 8255 programmable peripheral
       interface chip (PPI)
       An Example I/O Device (cont’d)
• 8255 PPI has three 8-bit registers
      » Port A (PA)
      » Port B (PB)
      » Port C (PC)
    These ports are mapped as follows
         8255 register        Port address
         PA (input port)            60H
         PB (output port)           61H
         PC (input port)            62H
         Command register           63H
An Example I/O Device (cont’d)
     Mapping of 8255 I/O ports
        An Example I/O Device (cont’d)
• Mapping I/O ports is similar to mapping memory
    Partial mapping
    Full mapping
• Keyboard scan code and status can be read from
  port 60H
    7-bit scan code is available from
      » PA0 – PA6
    Key status is available from PA7
      » PA7 = 0 – key depressed
      » PA7 = 1 – key released
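A minimal sketch of how a byte read from port 60H splits into scan code and key status, following the bit layout on this slide. The function name is hypothetical, and the byte is passed in as a plain integer rather than read from hardware.

```python
def decode_keyboard_byte(value):
    """Split a byte read from port 60H into (scan_code, released)."""
    scan_code = value & 0x7F        # PA0-PA6: 7-bit scan code
    released = bool(value & 0x80)   # PA7: 0 = key depressed, 1 = key released
    return scan_code, released

# The same key produces the same scan code on press and on release;
# only PA7 differs.
print(decode_keyboard_byte(0x1E))   # (30, False)
print(decode_keyboard_byte(0x9E))   # (30, True)
```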
                    I/O Data Transfer
• Data transfer involves two phases
    A data transfer phase
      » It can be done either by
           – Programmed I/O
           – DMA
    An end-notification phase
      » Programmed I/O
      » Interrupt
• Three basic techniques
    Programmed I/O
    DMA
    Interrupt-driven I/O
            I/O Data Transfer (cont’d)
• Programmed I/O
   Done by busy-waiting
     » This process is called polling
• Example
   Reading a key from the keyboard involves
     » Waiting for PA7 bit to go low
         – Indicates that a key is pressed
     » Reading the key scan code
     » Translating it to the ASCII value
     » Waiting until the key is released
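The four polling steps above can be sketched as a simulation. `FakeKeyboardPort` is a hypothetical stand-in for port 60H that replays a list of values, and `SCAN_TO_ASCII` is an illustrative two-entry table, not a full translation table. It assumes PA7 = 1 means no key is held down.

```python
class FakeKeyboardPort:
    """Hypothetical stand-in for port 60H: replays a list of byte values."""
    def __init__(self, values):
        self._values = list(values)
        self._index = 0

    def read(self):
        # Return the next value; repeat the last one once the list runs out.
        value = self._values[min(self._index, len(self._values) - 1)]
        self._index += 1
        return value

# Illustrative translation table (scan code -> character), assumed for this sketch.
SCAN_TO_ASCII = {0x1E: "a", 0x30: "b"}

def read_key(port):
    # Step 1: busy-wait until PA7 goes low (a key is pressed).
    value = port.read()
    while value & 0x80:
        value = port.read()
    # Steps 2 and 3: take the scan code from PA0-PA6 and translate it.
    char = SCAN_TO_ASCII.get(value & 0x7F, "?")
    # Step 4: busy-wait until PA7 goes high again (the key is released).
    while not (port.read() & 0x80):
        pass
    return char

port = FakeKeyboardPort([0x9E, 0x9E, 0x1E, 0x1E, 0x9E])
print(read_key(port))  # a
```

Both `while` loops are where the processor burns cycles, which is the cost that DMA and interrupts are meant to avoid.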
            I/O Data Transfer (cont’d)
• Direct memory access (DMA)
   Problems with programmed I/O
     » Processor wastes time polling
         – In our example
             Waiting for a key to be pressed,
             Waiting for it to be released
     » May not satisfy timing constraints associated with some
       devices
         – Disk read or write
   DMA
     » Frees the processor of the data transfer responsibility
            I/O Data Transfer (cont’d)
• DMA is implemented using a DMA controller
   DMA controller
     » Acts as slave to processor
     » Receives instructions from processor
     » Example: Reading from an I/O device
         – Processor gives details to the DMA controller
             I/O device number
             Main memory buffer address
             Number of bytes to transfer
             Direction of transfer (memory → I/O device, or vice
              versa)
            I/O Data Transfer (cont’d)
• Steps in a DMA operation
   Processor initializes the DMA controller
     » Gives device number, memory buffer pointer, …
         – Called channel initialization
     » Once initialized, it is ready for data transfer
   When ready, I/O device informs the DMA controller
     » DMA controller starts the data transfer process
        – Obtains bus by going through bus arbitration
        – Places memory address and appropriate control signals
        – Completes transfer and releases the bus
        – Updates memory address and count value
        – If more to read, loops back to repeat the process
   Notify the processor when done
     » Typically uses an interrupt
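The steps above can be sketched as a toy single-channel model. The class and method names are hypothetical, bus arbitration is reduced to a comment, and the completion interrupt is represented by a `done` flag.

```python
class DMAController:
    """Toy model of one DMA channel moving bytes from a device into memory."""
    def __init__(self, memory):
        self.memory = memory
        self.address = 0
        self.count = 0
        self.done = False

    def initialize_channel(self, buffer_address, byte_count):
        # Channel initialization: the processor supplies pointer and count.
        self.address = buffer_address
        self.count = byte_count
        self.done = False

    def device_ready(self, device_data):
        # The device has data; move one byte per pass through the loop.
        for byte in device_data:
            if self.count == 0:
                break
            # (bus arbitration would take place here)
            self.memory[self.address] = byte  # place address, complete transfer
            self.address += 1                 # update memory address
            self.count -= 1                   # update count value
        if self.count == 0:
            self.done = True                  # stands in for the interrupt

memory = bytearray(16)
dma = DMAController(memory)
dma.initialize_channel(buffer_address=4, byte_count=3)
dma.device_ready(b"\x11\x22\x33")
print(memory[4:7].hex(), dma.done)  # 112233 True
```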
            I/O Data Transfer (cont’d)
DMA controller details
I/O Data Transfer (cont’d)
                  DMA transfer timing
            I/O Data Transfer (cont’d)
8237 DMA controller
            I/O Data Transfer (cont’d)
• 8237 supports four DMA channels
• It has the following internal registers
    Current address register
      » One 16-bit register for each channel
      » Holds address for the current DMA transfer
    Current word register
      » Keeps the byte count
      » Generates terminal count (TC) signal when the count goes
        from zero to FFFFH
    Command register
      » Used to program the 8237 (type of priority, …)
          I/O Data Transfer (cont’d)
 Mode register
   » Each channel can be programmed to
       – Read or write
       – Autoincrement or autodecrement the address
       – Autoinitialize the channel
 Request register
   » For software-initiated DMA
 Mask register
   » Used to disable a specific channel
 Status register
 Temporary register
   » Used for memory-to-memory transfers
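A small sketch of the terminal-count rule for the current word register: TC is generated when the 16-bit count rolls over from zero to FFFFH, so a programmed count of n gives n + 1 transfers. The function name is hypothetical and only the counter is modeled.

```python
def transfers_until_tc(programmed_count):
    """Count transfers until a 16-bit word count rolls over from 0 to FFFFH."""
    count = programmed_count & 0xFFFF
    transfers = 0
    while True:
        transfers += 1                   # one transfer per DMA cycle
        previous = count
        count = (count - 1) & 0xFFFF     # 16-bit decrement with wrap-around
        if previous == 0 and count == 0xFFFF:
            return transfers             # terminal count (TC) generated here

print(transfers_until_tc(0))    # 1
print(transfers_until_tc(511))  # 512
```

To move a 512-byte block under this model, the count register would be loaded with 511.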
             I/O Data Transfer (cont’d)
• 8237 supports four types of data transfer
    Single cycle transfer
      » Only single transfer takes place
      » Useful for slow devices
    Block transfer mode
      » Transfers data until TC is generated or external EOP signal is
        received
    Demand transfer mode
      » Similar to the block transfer mode
      » In addition to TC and EOP, transfer can be terminated by
        deactivating DREQ signal
    Cascade mode
      » Useful to expand the number of channels beyond four
                   External Interface
• Two ways of interfacing I/O devices
    Serial
      » Cheaper
      » Slower
    Parallel
      » Faster
      » Data skew
      » Limited to small distances
 External Interface (cont’d)
Two basic modes of data transmission
            External Interface (cont’d)
• Serial transmission
    Asynchronous
      » Each byte is encoded for transmission
          – Start and stop bits
      » No need for sender and receiver synchronization
    Synchronous
      » Sender and receiver must synchronize
          – Done in hardware using phase-locked loops (PLLs)
      » Block of data can be sent
      » More efficient
          – Less overhead than asynchronous transmission
      » Expensive
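A sketch of asynchronous framing and its overhead. The format assumed here (one start bit, eight data bits sent least significant bit first, no parity, one stop bit) is a common choice but is an assumption of this example, not something the slide fixes.

```python
def frame_byte(value):
    """Return the bits sent for one byte: start bit, 8 data bits, stop bit."""
    data_bits = [(value >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1]                      # start bit = 0, stop bit = 1

frame = frame_byte(0x41)
print(frame)      # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]

# Two of every ten transmitted bits carry no data.
overhead = (len(frame) - 8) / len(frame)
print(overhead)   # 0.2
```

Synchronous transmission avoids this per-byte cost by sending a whole block between synchronization points.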
External Interface (cont’d)
 Asynchronous transmission
            External Interface (cont’d)
• EIA-232 serial interface
    Low-speed serial transmission
    Adopted by the Electronic
     Industries Association (EIA)
      » Popularly known by its
        predecessor RS-232
    It uses a 9-pin connector DB-9
      » Uses 8 signals
    Typically used to connect a
     modem to a computer
            External Interface (cont’d)
• Transmission protocol uses three phases
    Connection setup
      » Computer A asserts DTE Ready (pin 4)
         – Transmits phone# via Transmit Data line (pin 3)
      » Modem B alerts its computer via Ring Indicator (pin 9)
         – Computer B asserts DTE Ready (pin 4)
         – Modem B generates carrier and turns on its DCE Ready
      » Modem A detects the carrier signal from modem B
         – Modem A alerts its computer via Carrier Detect (pin 1)
         – Turns on its DCE Ready
    Data transmission
      » Done by handshaking using
         – request-to-send (RTS) and clear-to-send (CTS) signals
    Connection termination
      » Done by deactivating RTS
             External Interface (cont’d)
• Parallel printer interface
    A simple parallel interface
    Uses 25-pin DB-25
      » 8 data signals
           – Latched by strobe (pin 1)
      » Data transfer uses simple handshaking
           – Uses acknowledge (ACK) signal
               After each byte, computer waits for ACK
      » 5 lines for printer status
           – Busy, out-of-paper, online/offline, autofeed, and fault
      » Can be initialized with INIT
           – Clears the printer buffer and resets the printer
            External Interface (cont’d)
• SCSI
   Pronounced “scuzzy”
   Small Computer System Interface
     » Supports both internal and external connection
   Comes in two bus widths
     » 8 bits
         – Known as narrow SCSI
         – Uses a 50-pin connector
         – Device id can range from 0 to 7
     » 16 bits
         – Known as wide SCSI
         – Uses a 68-pin connector
         – Device id can range from 0 to 15
              External Interface (cont’d)
• SCSI uses client-server model
    Uses terms initiator and target for client and server
        » Initiator issues commands to targets to perform a task
            – Initiators are typically SCSI host adaptors
        » Targets receive the command and perform the task
            – Targets are SCSI devices like disk drives
• SCSI transfer proceeds in phases
      Command
      Message in
      Message out
      Data in
      Data out
      Status
    IN and OUT are from the initiator's point of view
         External Interface (cont’d)
 SCSI uses asynchronous mode for all bus negotiations
   » Uses handshaking using REQ and ACK signals for each byte
     of data
 On a synchronous SCSI
   » Data are transferred synchronously
   » REQ-ACK signals are not used for each byte
   » A number of bytes (e.g., 8) can be sent without waiting for
     ACK
       – Improves throughput
       – Minimizes adverse impact of cable propagation delay
                               USB
• Universal Serial Bus
    Originally developed in 1995 by a consortium including
      » Compaq, HP, Intel, Lucent, Microsoft, and Philips
    USB 1.1 supports
      » Low-speed devices (1.5 Mbps)
      » Full-speed devices (12 Mbps)
    USB 2.0 supports
      » High-speed devices
          – Up to 480 Mbps (a factor of 40 over USB 1.1)
      » Uses the same connectors
          – Transmission speed is negotiated on a device-by-device basis
                      USB (cont’d)
• Motivation for USB
   Avoid device-specific interfaces
     » Eliminates multitude of interfaces
         – PS/2, serial, parallel, monitor, microphone, keyboard,…
   Avoid non-shareable interfaces
     » Standard interfaces support only one device
   Avoid I/O address space and IRQ problems
     » USB does not require memory or I/O address space
   Avoid installation and configuration problems
     » Don’t have to open the box to install and configure jumpers
   Allow hot attachment of devices
                       USB (cont’d)
• Additional advantages of USB
   Power distribution
     » Simple devices can be bus-powered
         – Examples: mice, keyboards, floppy disk drives, wireless
           LANs, …
   Control peripherals
     » Possible because USB allows data to flow in both directions
   Expandable through hubs
   Power conservation
     » Enters suspend state if there is no activity for 3 ms
   Error detection and recovery
     » Uses CRC
USB (cont’d)
 USB cables
                     USB (cont’d)
• USB encoding
   Uses NRZI encoding
     » Non-Return to Zero-Inverted
                       USB (cont’d)
• NRZI encoding
   A signal transition occurs if the next bit is zero
     » It is called differential encoding
   Two desirable properties
     » Signal transitions, not levels, need to be detected
     » Long string of zeros causes signal changes
   Still a problem
     » Long strings of 1s do not cause signal changes
   To solve this problem
     » Uses bit stuffing
        – A zero is inserted after every six consecutive 1s
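A minimal sketch of NRZI as described above: the line level toggles for a 0 bit and holds for a 1 bit. The function names and the idle starting level of 1 are assumptions of this example.

```python
def nrzi_encode(bits, initial_level=1):
    """Toggle the line level for each 0 bit; hold it for each 1 bit."""
    level = initial_level
    levels = []
    for bit in bits:
        if bit == 0:
            level ^= 1          # a 0 bit causes a signal transition
        levels.append(level)
    return levels

def nrzi_decode(levels, initial_level=1):
    """Recover bits by comparing each level with the previous one."""
    previous = initial_level
    bits = []
    for level in levels:
        bits.append(1 if level == previous else 0)
        previous = level
    return bits

# Three 0s give three transitions; the run of 1s leaves the line flat.
print(nrzi_encode([0, 0, 0, 1, 1, 1, 0]))  # [0, 1, 0, 0, 0, 0, 1]
```

The flat stretch during the run of 1s is the remaining problem that bit stuffing addresses.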
               USB (cont’d)
Bit stuffing
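A short sketch of bit stuffing applied before NRZI encoding: a 0 is inserted after every six consecutive 1s, and the receiver drops it again. The function names are hypothetical.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out, run = [], 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == 1 else 0
        if run == 6:
            out.append(0)       # stuffed bit forces a transition under NRZI
            run = 0
    return out

def bit_unstuff(bits):
    """Remove the bit that follows every run of six consecutive 1s."""
    out, run, skip = [], 0, False
    for bit in bits:
        if skip:                # this is the stuffed 0: drop it
            skip = False
            run = 0
            continue
        out.append(bit)
        run = run + 1 if bit == 1 else 0
        if run == 6:
            skip = True
    return out

print(bit_stuff([1] * 8))  # [1, 1, 1, 1, 1, 1, 0, 1, 1]
```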
                      USB (cont’d)
• Transfer types
     » Four types of transfer
   Interrupt transfer
     » Uses polling
        – Polling interval can range from 1 ms to 255 ms
   Isochronous transfer
     » Used in real-time applications that require constant data
       transfer rate
          – Example: Reading audio from CD-ROM
     » These transfers are scheduled regularly
     » Do not use error detection and recovery
                     USB (cont’d)
 Control transfer
   » Used to configure and set up USB devices
   » Three phases
       – Setup stage
            Conveys type of request made to target device
       – Data stage
            Optional stage
            Control transfers that require data use this stage
       – Status stage
            Checks the status of the operation
   » Allocates a guaranteed bandwidth of 10%
   » Error detection and recovery are used
       – Recovery is by means of retries
                  USB (cont’d)
 Bulk transfer
   » For devices with no specific data transfer rate
     requirements
       – Example: sending data to a printer
   » Lowest priority bandwidth allocation
   » If the other three types of transfers take 100% of the
     bandwidth
       – Bulk transfers are deferred until load decreases
   » Error detection and recovery are used
       – Recovery is by means of retries
                      USB (cont’d)
• USB architecture
   USB host controller
     » Initiates transactions over USB
   Root hub
     » Provides connection points
   Two types of host controllers
     » Open host controller (OHC)
         – Specified by National Semiconductor, Microsoft, Compaq
     » Universal host controller (UHC)
         – Defined by Intel
     » Difference between the two
         – How they schedule the four types of transfers
                      USB (cont’d)
• UHC scheduling
   Schedules periodic transfers first
     » Periodic transfers: isochronous and interrupts
     » Can take up to 90% of bandwidth
   These transfers are followed by control and bulk
    transfers
     » Control transfers are guaranteed 10% of bandwidth
   Bulk transfers are scheduled only if there is bandwidth
    available
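A rough sketch of the UHC ordering for a single frame. The 1500-byte budget assumes a 12 Mbps full-speed bus with 1 ms frames and ignores all protocol overhead; the function name and the byte-count interface are hypothetical.

```python
FRAME_BYTES = 1500  # 12 Mbps x 1 ms / 8, ignoring protocol overhead (assumption)

def allocate_frame(periodic, control, bulk, frame_bytes=FRAME_BYTES):
    """Grant bytes per transfer class for one frame, in UHC order."""
    periodic_cap = frame_bytes * 90 // 100           # periodic: at most 90%
    granted_periodic = min(periodic, periodic_cap)
    remaining = frame_bytes - granted_periodic       # at least 10% is left
    granted_control = min(control, remaining)        # control comes next
    remaining -= granted_control
    granted_bulk = min(bulk, remaining)              # bulk gets the leftovers
    return {"periodic": granted_periodic,
            "control": granted_control,
            "bulk": granted_bulk}

print(allocate_frame(periodic=2000, control=100, bulk=500))
# {'periodic': 1350, 'control': 100, 'bulk': 50}
```

With a heavy periodic load, bulk traffic is squeezed down to whatever control transfers leave unused.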
                      USB (cont’d)
• OHC scheduling
   Different from UHC scheduling
   Reserves space for non-periodic transfers first
     » Non-periodic transfers: control and bulk
     » 10% bandwidth reserved
   Next periodic transfers are scheduled
     » Guarantees 90% bandwidth
   Left over bandwidth is allocated to non-periodic
    transfers
                     USB (cont’d)
• Bus powered devices
   Low-powered
     » Less than 100 mA
     » Can be bus-powered
   High-powered
     » Between 100 mA and 500 mA
         – Full-powered ports can power these devices
     » Can be designed to have their own power
     » Operate in three modes
         – Configured (500 mA)
         – Unconfigured (100 mA)
         – Suspended (about 2.5 mA)
                     USB (cont’d)
• USB hubs
   Bus-powered
     » No extra power supply required
     » Must be connected to an upstream port that can supply 500 mA
     » Downstream ports can only supply 100 mA
        – Number of ports is limited to four
        – Support only low-powered devices
   Self-powered
     » Support 4 high-powered devices
     » Support 4 bus-powered USB hubs
   Most 4-port hubs are dual-powered
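A back-of-the-envelope check of why a bus-powered hub is limited to four downstream ports. The 100 mA drawn by the hub's own circuitry is an assumption of this sketch; the 500 mA and 100 mA figures come from the slide.

```python
UPSTREAM_SUPPLY_MA = 500   # what the upstream port can supply
PER_PORT_MA = 100          # what each downstream port may supply

def max_downstream_ports(hub_own_draw_ma=100):
    """Ports that fit in the power budget after the hub's own draw (assumed)."""
    return (UPSTREAM_SUPPLY_MA - hub_own_draw_ma) // PER_PORT_MA

print(max_downstream_ports())  # 4
```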
                     USB (cont’d)
Hubs can be used to expand
  Upstream port
  Downstream ports
                      USB (cont’d)
• USB transactions
   Transfers are done in one or more transactions
     » Each transaction consists of several packets
   Transactions may have between 1 and 3 phases
     » Token packet phase
         – Specifies transaction type and target device address
     » Data packet phase (optional)
         – A maximum of 1023 bytes is transferred
     » Handshake packet phase
         – Except for isochronous transfers, others use error detection
           for guaranteed delivery
         – Provides feedback on whether data has been received
           without error
                USB (cont’d)
USB IRP frame
                         USB (cont’d)
Packet format (figure annotations)
  » Type field: specifies token, data, or handshake packet
  » Complement of type field
  » Token packets use CRC-5
  » Hardware-encoded special pattern
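A simplified sketch of a 5-bit CRC computed by modulo-2 division with the generator x^5 + x^2 + 1. It is not a bit-exact model of USB: the register preset, the final complement, and the bit ordering used on the wire are left out. The 7-bit address and 4-bit endpoint values are arbitrary examples.

```python
CRC5_POLY = 0b100101  # x^5 + x^2 + 1

def poly_mod(bits, poly=CRC5_POLY, width=5):
    """Remainder of the bit string, read as a polynomial, modulo poly."""
    reg = 0
    for bit in bits:
        reg = (reg << 1) | bit
        if reg >> width:        # degree reached `width`: subtract (XOR) poly
            reg ^= poly
    return reg

def to_bits(value, width):
    """Most-significant-bit-first list of `width` bits."""
    return [(value >> i) & 1 for i in reversed(range(width))]

def append_crc(bits, width=5):
    """Append the remainder of bits * x^width so the frame divides evenly."""
    crc = poly_mod(bits + [0] * width)
    return bits + to_bits(crc, width)

token = to_bits(0x15, 7) + to_bits(0xE, 4)   # example address and endpoint
frame = append_crc(token)
print(poly_mod(frame) == 0)      # True: the receiver sees a zero remainder

corrupted = frame[:]
corrupted[3] ^= 1                # flip one bit in transit
print(poly_mod(corrupted) == 0)  # False: the error is detected
```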
                       USB (cont’d)
USB 1.1 transactions
                        USB (cont’d)
• USB 2.0
   USB 1.1 uses 1 ms frames
   USB 2.0 uses 125 μs frames
     » 1/8 of USB 1.1
   Supports 40X data rates
     » Up to 480 Mbps
   Competitive with
     » SCSI
     » IEEE 1394 (FireWire)
   Widely available now
                          IEEE 1394
• Apple originally developed this standard for high-
  speed peripherals
    Known by a variety of names
      » Apple: FireWire
      » Sony: i.LINK
    IEEE standardized it as IEEE 1394
      » First released in 1995 as IEEE 1394-1995
      » A slightly revised version as 1394a
      » Next version 1394b
    Shares many of the features of USB
                 IEEE 1394 (cont’d)
• Advantages
   High speed
     » Supports three speeds
         – 100, 200, 400 Mbps
             Competes with USB 2.0
         – Plans to boost it to 3.2 Gbps
   Hot attachment
     » Like USB
     » No need to shut down power to attach devices
   Peer-to-peer support
     » USB is processor-centric
     » Supports peer-to-peer communication without involving the
       processor
               IEEE 1394 (cont’d)
 Expandable bus
   » Devices can be connected in daisy-chain fashion
   » Hubs can be used to expand
 Power distribution
   » Like the USB, cables distribute power
       – Much higher power than USB
            Voltage between 8 and 33 V
            Current can be up to 1.5 A
 Error detection and recovery
   » As in USB, uses CRC
   » Uses retransmission in case of error
 Long cables
   » Like the USB
    IEEE 1394 (cont’d)
IEEE 1394 6-pin and 4-pin connectors
                         4-pin connector does
                         not distribute power
                 IEEE 1394 (cont’d)
• Encoding
   Uses a simple NRZ encoding
   Strobe signal is encoded
     » Strobe changes whenever successive data bits are the same
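A sketch of the strobe rule: the strobe line toggles whenever the data line does not, so exactly one of the two lines changes in every bit period and data XOR strobe behaves like a clock. The function name and the initial line levels (both 0) are assumptions of this example.

```python
def strobe_for(data_bits, initial_data=0, initial_strobe=0):
    """Toggle the strobe whenever the data bit repeats the previous one."""
    strobe = initial_strobe
    previous = initial_data
    out = []
    for bit in data_bits:
        if bit == previous:
            strobe ^= 1      # data did not change, so the strobe must
        out.append(strobe)
        previous = bit
    return out

data = [1, 1, 0, 0, 0, 1]
strobe = strobe_for(data)
clock = [d ^ s for d, s in zip(data, strobe)]
print(strobe)  # [0, 1, 1, 0, 1, 1]
print(clock)   # [1, 0, 1, 0, 1, 0]
```

The recovered clock alternates every bit period even though the data contains repeated bits.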
                  IEEE 1394 (cont’d)
• Transfer types
    Asynchronous
      » For applications that require correct delivery of data
          – Example: writing a file to a disk drive
      » Uses an acknowledgement to confirm delivery
      » Guaranteed bandwidth of 20%
    Isochronous
      » For real-time applications
      » No acknowledgement
      » Up to 80% of bandwidth allocated
    Bandwidth allocation on a cycle-by-cycle basis
      » Cycle time: 125 μs
                 IEEE 1394 (cont’d)
• Transactions
   Follow request and reply format
   Each packet is encapsulated between Data_prefix
    and Data_end
                IEEE 1394 (cont’d)
• Isochronous transactions
    Similar to asynchronous transactions
    Main difference:
      » No acknowledgement packets
                  IEEE 1394 (cont’d)
• Bus arbitration
    Needed because of peer-to-peer communication
    Arbitration must respect
      » Bandwidth allocation to isochronous channels
      » Fairness-based allocation for asynchronous channels
    Uses fairness interval
      » During each interval
         – All nodes with pending asynchronous transactions are
            allowed bus ownership once
    Nodes with pending isochronous transactions go
     through arbitration during each cycle
    Isochronous resource manager (IRM) is used for isochronous bandwidth allocations
                 IEEE 1394 (cont’d)
• Configuration
   Does not require the host system
   Consists of two main phases
     » Tree identification
         – Used to find the network topology
         – Uses two special signals
              Parent_notify and Child_notify
     » Self-identification
         – Done after the tree identification
         – Assigns unique ids to nodes
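A simplified, sequential sketch of the two configuration phases. It assumes a tree-shaped network given as a hypothetical `links` dictionary (node name to neighbours in port order). Concurrent signaling and root contention are reduced to processing nodes in sorted order, and the depth-first numbering with children before their parent is an assumption of this sketch.

```python
def tree_identify(links):
    """Return (root, parent): each non-root node maps to its parent."""
    unresolved = {node: set(neighbours) for node, neighbours in links.items()}
    parent = {}
    while len(parent) < len(links) - 1:
        for node in sorted(links):
            if node in parent or len(unresolved[node]) != 1:
                continue
            (target,) = unresolved[node]      # the only port not yet identified
            parent[node] = target             # node sends Parent_notify
            unresolved[node].clear()
            unresolved[target].discard(node)  # target answers with Child_notify
    root = next(node for node in links if node not in parent)
    return root, parent

def self_identify(links, root, parent):
    """Assign ids depth-first; a node takes its id after all of its children."""
    ids = {}
    def visit(node):
        for neighbour in links[node]:           # ports in listed order
            if parent.get(neighbour) == node:   # neighbour is a child of node
                visit(neighbour)
        ids[node] = len(ids)
    visit(root)
    return ids

links = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"],
         "D": ["B", "E"], "E": ["D"]}
root, parent = tree_identify(links)
print(root)                               # D
print(self_identify(links, root, parent))
# {'A': 0, 'C': 1, 'B': 2, 'E': 3, 'D': 4}
```

In this example the leaf reached first takes id 0 and the root takes the highest id.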
IEEE 1394 (cont’d)
   Tree identification
                IEEE 1394 (cont’d)
                      Tree identification
All leaf nodes have been identified
                 IEEE 1394 (cont’d)
Tree identification
  Final topology after
  tree identification
  process
                IEEE 1394 (cont’d)
Self-identification
                      Initial network with
                      count values set to 0
                 IEEE 1394 (cont’d)
Self-identification
Node A received grant message
Assigns itself ID zero
                IEEE 1394 (cont’d)
Self-identification
    Final assignment of node ids
                       Bus Wars
• SCSI is dominant in disk and storage device
  interfaces
    Parallel interface
    Its bandwidth could go up to 640 MB/s
• IEEE 1394
    Serial interface
    Supports peer-to-peer applications
    Dominant in video applications
• USB
    Useful in low-cost, host-to-peripheral applications
    USB 2.0 provides high-speed support