COSC65 – Organization and Architecture
            Chapter 2
Computer Evolution and Performance
History of Computers
         First Generation: Vacuum Tubes
◼   ENIAC
    ◼   Electronic Numerical Integrator And Computer
◼   Designed and constructed at the University of Pennsylvania
    ◼   Started in 1943 – completed in 1946
    ◼   By John Mauchly and John Eckert
◼   World’s first general purpose electronic digital computer
    ◼   Army’s Ballistics Research Laboratory (BRL) needed a way to supply trajectory
        tables for new weapons accurately and within a reasonable time frame
    ◼   Was not finished in time to be used in the war effort
◼   Its first task was to perform a series of calculations that were used to help
    determine the feasibility of the hydrogen bomb
◼   Continued to operate under BRL management until 1955 when it was
    disassembled
John von Neumann
            EDVAC (Electronic Discrete Variable Computer)
◼ First publication of the idea was in 1945
◼ Stored program concept
 ◼ Attributed to ENIAC designers, most notably the
   mathematician John von Neumann
 ◼ Program represented in a form suitable for storing in
   memory alongside the data
◼ IAS computer
 ◼ Princeton Institute for Advanced Studies
 ◼ Prototype of all subsequent general-purpose computers
 ◼ Completed in 1952
Structure of von Neumann Machine
Structure of IAS Computer
                                 Registers
◼   Memory buffer register (MBR)
    ◼   Contains a word to be stored in memory or sent to the I/O unit
    ◼   Or is used to receive a word from memory or from the I/O unit
◼   Memory address register (MAR)
    ◼   Specifies the address in memory of the word to be written from
        or read into the MBR
◼   Instruction register (IR)
    ◼   Contains the 8-bit opcode of the instruction being executed
◼   Instruction buffer register (IBR)
    ◼   Employed to temporarily hold the right-hand instruction from a
        word in memory
◼   Program counter (PC)
    ◼   Contains the address of the next instruction pair to be fetched
        from memory
◼   Accumulator (AC) and multiplier quotient (MQ)
    ◼   Employed to temporarily hold operands and results of ALU
        operations
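The register roles above can be illustrated with a minimal Python sketch of one instruction-pair fetch. It assumes the commonly described IAS word layout (a 40-bit word holding two 20-bit instructions, each an 8-bit opcode followed by a 12-bit address), which the slide itself does not state. The function names and the sample values in the usage note are hypothetical, and the PC update is deliberately left out.

```python
from __future__ import annotations

# Assumed field widths (not given on the slide): 40-bit word,
# two 20-bit instructions, each with an 8-bit opcode and a 12-bit address.
WORD_BITS = 40
INSTR_BITS = 20
ADDR_BITS = 12


def split_word(word: int) -> tuple[int, int]:
    """Return (left_instruction, right_instruction) from a 40-bit word."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 40 bits")
    left = word >> INSTR_BITS
    right = word & ((1 << INSTR_BITS) - 1)
    return left, right


def decode(instr: int) -> tuple[int, int]:
    """Return (opcode, address) from a 20-bit instruction."""
    opcode = instr >> ADDR_BITS
    address = instr & ((1 << ADDR_BITS) - 1)
    return opcode, address


def fetch_pair(memory: dict[int, int], pc: int) -> dict[str, int]:
    """Simplified fetch of an instruction pair; register names follow the slide."""
    mar = pc                      # MAR <- PC
    mbr = memory[mar]             # MBR <- M[MAR]
    left, right = split_word(mbr)
    ir, mar = decode(left)        # left opcode -> IR, left address -> MAR
    ibr = right                   # right-hand instruction is held in the IBR
    return {"IR": ir, "MAR": mar, "IBR": ibr}


# Usage with arbitrary illustrative values.
left_instr = (0x01 << ADDR_BITS) | 0x0FA
right_instr = (0x05 << ADDR_BITS) | 0x0FB
sample_memory = {0: (left_instr << INSTR_BITS) | right_instr}
registers = fetch_pair(sample_memory, pc=0)
```

After the left-hand instruction executes, the right-hand one can be decoded straight from the IBR with `decode(registers["IBR"])`, without another memory access.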
Commercial Computers
                        UNIVAC
◼   1947 – Eckert and Mauchly formed the Eckert-Mauchly Computer Corporation
    to manufacture computers commercially
◼   UNIVAC I (Universal Automatic Computer)
    ◼   First successful commercial computer
    ◼   Was intended for both scientific and commercial applications
    ◼   Commissioned by the US Bureau of Census for 1950 calculations
◼   The Eckert-Mauchly Computer Corporation became part of the UNIVAC
    division of the Sperry-Rand Corporation
◼   UNIVAC II – delivered in the late 1950’s
    ◼   Had greater memory capacity and higher performance
◼   Backward compatible
                        IBM
◼   Was the major manufacturer of punched-card processing equipment
◼   Delivered its first electronic stored-program computer (701) in 1953
    ◼   Intended primarily for scientific applications
◼   Introduced the 702 product in 1955
    ◼   Hardware features made it suitable for business applications
◼   Series of 700/7000 computers established IBM as the overwhelmingly
    dominant computer manufacturer
History of Computers
         Second Generation: Transistors
◼ Smaller
◼ Cheaper
◼ Dissipates less heat than a vacuum tube
◼ Is a solid state device made from silicon
◼ Was invented at Bell Labs in 1947
◼ It was not until the late 1950’s that fully
  transistorized computers were commercially
  available
Computer Generations
                   Second Generation Computers
◼   Introduced:
    ◼   More complex arithmetic and logic units and control units
    ◼   The use of high-level programming languages
    ◼   Provision of system software which provided the ability to:
        ◼   load programs
        ◼   move data to peripherals and libraries
        ◼   perform common computations
◼   Appearance of the Digital Equipment Corporation (DEC) in 1957
◼   PDP-1 was DEC’s first computer
◼   This began the mini-computer phenomenon that would become so
    prominent in the third generation
IBM 7094 Configuration
                   History of Computers
                     Third Generation: Integrated Circuits
◼   1958 – the invention of the integrated circuit
◼   Discrete component
    ◼   Single, self-contained transistor
    ◼   Manufactured separately, packaged in their own containers, and soldered or wired
        together onto masonite-like circuit boards
    ◼   Manufacturing process was expensive and cumbersome
◼   The two most important members of the third generation were the IBM
    System/360 and the DEC PDP-8
Microelectronics
◼   A computer consists of gates, memory cells, and interconnections among
    these elements
◼   The gates and memory cells are constructed of simple digital electronic
    components
◼   Data storage – provided by memory cells
◼   Data processing – provided by gates
◼   Data movement – the paths among components are used to move data from
    memory to memory and from memory through gates to memory
◼   Control – the paths among components can carry control signals
                  Integrated Circuits
◼   Exploits the fact that such components as transistors, resistors, and
    conductors can be fabricated from a semiconductor such as silicon
◼   Many transistors can be produced at the same time on a single wafer
    of silicon
◼   Transistors can be connected with a process of metallization to
    form circuits
Wafer, Chip, and Gate Relationship
Chip Growth
Moore’s Law
◼   1965; Gordon Moore – co-founder of Intel
◼   Observed number of transistors that could be put on a single chip was
    doubling every year
◼   The pace slowed to a doubling every 18 months in the 1970’s but has
    sustained that rate ever since
◼   Consequences of Moore’s law:
    ◼   The cost of computer logic and memory circuitry has fallen at a
        dramatic rate
    ◼   The electrical path length is shortened, increasing operating speed
    ◼   Computer becomes smaller and is more convenient to use in a variety
        of environments
    ◼   Reduction in power and cooling requirements
    ◼   Fewer interchip connections
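The doubling rule can be turned into simple arithmetic. The sketch below is only an idealized model: it assumes perfectly regular doubling, and the function name and sample periods are chosen for illustration.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Factor by which a count grows after `years` if it doubles every period."""
    if doubling_period_years <= 0:
        raise ValueError("doubling period must be positive")
    return 2.0 ** (years / doubling_period_years)


# Doubling every year (the original 1965 observation) over 10 years.
yearly = growth_factor(10, 1.0)

# Doubling every 18 months (the slower pace) over the same 10 years.
every_18_months = growth_factor(10, 1.5)

# At the 18-month pace, 15 years are needed to match 10 years of yearly doubling.
matched = growth_factor(15, 1.5)
```

Under these assumptions, ten years of yearly doubling gives a factor of 1024, while the 18-month pace gives roughly a hundredfold increase over the same decade.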
Later Generations
◼   LSI – Large Scale Integration
◼   VLSI – Very Large Scale Integration
◼   ULSI – Ultra Large Scale Integration
◼   Semiconductor Memory
◼   Microprocessors
      Semiconductor Memory
◼   In 1970 Fairchild produced the first relatively capacious semiconductor
    memory
    ◼   Chip was about the size of a single core
    ◼   Could hold 256 bits of memory
    ◼   Non-destructive
    ◼   Much faster than core
◼   In 1974 the price per bit of semiconductor memory dropped below the
    price per bit of core memory
    ◼   There has been a continuing and rapid decline in memory cost
        accompanied by a corresponding increase in physical memory density
    ◼   Developments in memory and processor technologies changed the
        nature of computers in less than a decade
◼   Since 1970 semiconductor memory has been through 13 generations
    ◼   Each generation has provided four times the storage density of the
        previous generation, accompanied by declining cost per bit and
        declining access time
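The "four times per generation" rule compounds quickly. The sketch below assumes a first-generation chip of 1 Kbit (1024 bits); that starting point is an assumption made for illustration, since the slide gives only the factor of four and the count of 13 generations.

```python
def generation_density_bits(generation: int, first_generation_bits: int = 1024) -> int:
    """Bits per chip for a 1-based generation number, quadrupling each generation.

    `first_generation_bits` is an assumed starting density, not a figure from the slide.
    """
    if generation < 1:
        raise ValueError("generation numbers start at 1")
    return first_generation_bits * 4 ** (generation - 1)


# Density of each of the 13 generations under the assumed 1 Kbit start.
densities = [generation_density_bits(g) for g in range(1, 14)]

# Overall growth from the first to the thirteenth generation: 4**12.
overall_factor = densities[-1] // densities[0]
```

With the assumed starting point, the thirteenth generation works out to 16 Gbit, a factor of 4^12 (about 16.8 million) over the first.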
                 Microprocessors
◼   The density of elements on processor chips continued to rise
    ◼   More and more elements were placed on each chip so that fewer
        and fewer chips were needed to construct a single computer
        processor
◼   1971 Intel developed 4004
    ◼   First chip to contain all of the components of a CPU on a single
        chip
    ◼   Birth of microprocessor
◼   1972 Intel developed 8008
    ◼   First 8-bit microprocessor
◼   1974 Intel developed 8080
    ◼   First general purpose microprocessor
    ◼   Faster, has a richer instruction set, has a large addressing
        capability
Microprocessor Speed
    Techniques built into contemporary processors include:
◼   Pipelining
    ◼   Processor moves data or instructions into a conceptual pipe with
        all stages of the pipe processing simultaneously
◼   Branch prediction
    ◼   Processor looks ahead in the instruction code fetched from memory
        and predicts which branches, or groups of instructions, are likely
        to be processed next
◼   Data flow analysis
    ◼   Processor analyzes which instructions are dependent on each other’s
        results, or data, to create an optimized schedule of instructions
◼   Speculative execution
    ◼   Using branch prediction and data flow analysis, some processors
        speculatively execute instructions ahead of their actual appearance
        in the program execution, holding the results in temporary
        locations, keeping execution engines as busy as possible
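The benefit of pipelining can be estimated with a back-of-the-envelope cycle count. This is a sketch of an idealized pipeline only: it assumes every stage takes one cycle and that there are no stalls, hazards, or mispredicted branches, none of which holds exactly on a real processor. The function names are illustrative.

```python
def unpipelined_cycles(stages: int, instructions: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return stages * instructions


def pipelined_cycles(stages: int, instructions: int) -> int:
    """Ideal pipeline: fill it once, then complete one instruction per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


def ideal_speedup(stages: int, instructions: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts under the ideal model."""
    return unpipelined_cycles(stages, instructions) / pipelined_cycles(stages, instructions)


# Example: a 5-stage pipe running 100 instructions.
serial = unpipelined_cycles(5, 100)     # 500 cycles
overlapped = pipelined_cycles(5, 100)   # 104 cycles
speedup = ideal_speedup(5, 100)
```

As the instruction count grows, the ideal speedup approaches the number of stages. Branch prediction and speculative execution exist largely to keep a real pipeline close to this ideal.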
Performance Balance
◼   Adjust the organization and architecture to compensate for the mismatch
    among the capabilities of the various components
◼   Architectural examples include:
    ◼   Increase the number of bits that are retrieved at one time by
        making DRAMs “wider” rather than “deeper” and by using wide bus
        data paths
    ◼   Reduce the frequency of memory access by incorporating increasingly
        complex and efficient cache structures between the processor and
        main memory
    ◼   Change the DRAM interface to make it more efficient by including a
        cache or other buffering scheme on the DRAM chip
    ◼   Increase the interconnect bandwidth between processors and memory
        by using higher speed buses and a hierarchy of buses to buffer and
        structure data flow
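The effect of placing a cache between the processor and main memory can be sketched with the standard average-access-time formula (hit time plus miss rate times miss penalty). The latencies and miss rates below are made-up illustrative values, not measurements from any particular system, and the single-level model is a simplifying assumption.

```python
def average_access_time(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time for a single-level cache model.

    All times are in the same unit (for example, nanoseconds).
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Illustrative numbers only: 1 ns cache hit, 100 ns penalty on a miss.
with_good_cache = average_access_time(hit_time=1.0, miss_rate=0.05, miss_penalty=100.0)
with_poor_cache = average_access_time(hit_time=1.0, miss_rate=0.50, miss_penalty=100.0)
```

Even with these rough numbers, lowering the miss rate from 50% to 5% cuts the average access time by nearly a factor of ten, which is why increasingly efficient cache structures appear among the balancing techniques listed above.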
Typical I/O Device Data Rates
Improvements in Chip Organization and Architecture
◼   Increase hardware speed of processor
    ◼   Fundamentally due to shrinking logic gate size
        ◼   More gates, packed more tightly, increasing clock rate
        ◼   Propagation time for signals reduced
◼   Increase size and speed of caches
    ◼   Dedicating part of processor chip
        ◼   Cache access times drop significantly
◼   Change processor organization and architecture
    ◼   Increase effective speed of instruction execution
    ◼   Parallelism
Problems with Clock Speed and Logic Density
◼ Power
 ◼   Power density increases with density of logic and clock speed
 ◼   Dissipating heat
◼ RC delay
 ◼   Speed at which electrons flow limited by resistance and
     capacitance of metal wires connecting them
 ◼   Delay increases as RC product increases
 ◼   Wire interconnects thinner, increasing resistance
 ◼   Wires closer together, increasing capacitance
◼ Memory latency
 ◼   Memory speeds lag processor speeds
Processor Trends
            Multicore
◼   The use of multiple processors on the same chip provides the potential
    to increase performance without increasing the clock rate
◼   Strategy is to use two simpler processors on the chip rather than one
    more complex processor
◼   With two processors larger caches are justified
◼   As caches became larger it made performance sense to create two and
    then three levels of cache on a chip
              Many Integrated Core (MIC)
◼   Leap in performance as well as the challenges in developing software to
    exploit such a large number of cores
◼   The multicore and MIC strategy involves a homogeneous collection of
    general purpose processors on a single chip
              Graphics Processing Unit (GPU)
◼   Core designed to perform parallel operations on graphics data
◼   Traditionally found on a plug-in graphics card, it is used to encode
    and render 2D and 3D graphics as well as process video
◼   Used as vector processors for a variety of applications that require
    repetitive computations
        Overview
             x86 Architecture (Intel, CISC) and ARM (RISC)
◼   The x86 architecture is the result of decades of design effort on
    complex instruction set computers (CISCs)
◼   Excellent example of CISC design
◼   Incorporates the sophisticated design principles once found only on
    mainframes and supercomputers
◼   An alternative approach to processor design is the reduced instruction
    set computer (RISC)
◼   The ARM architecture is used in a wide variety of embedded systems and
    is one of the most powerful and best designed RISC based systems on
    the market
◼   In terms of market share Intel is ranked as the number one maker of
    microprocessors for non-embedded systems
Embedded Systems
                   Requirements and Constraints
◼   Small to large systems, implying different cost constraints and
    different needs for optimization and reuse
◼   Relaxed to very strict requirements and combinations of different
    quality requirements with respect to safety, reliability, real-time
    and flexibility
◼   Short to long life times
◼   Different environmental conditions in terms of radiation, vibrations,
    and humidity
◼   Different application characteristics resulting in static versus
    dynamic loads, slow to fast speed, compute versus interface intensive
    tasks, and/or combinations thereof
◼   Different models of computation ranging from discrete event systems to
    hybrid systems
Possible Organization of an Embedded System
    Programming
        IDE
    Thank you!