Academic Session 2020/2021
Semester 1
     EMT 475/3
Computer Organization
   & Architecture
Chapter 1 : Introduction to Computer &
              Architecture
           Faculty of Electronic Engineering Technology
                     Universiti Malaysia Perlis
Outline
• Organization & Architecture
• Structure & Function
• Computer Evolution & Performance
• Designing for Performance
Organization & Architecture
• What is a computer?
   ―Data processing machine
   ―Operated automatically under the control of a list of
    instructions (called a program) stored in its main
    memory
Organization & Architecture
• What is a computer system?
   ― Consists of computer & its
     peripherals.
   ― Computer peripherals
     • Input devices
        ―allow you to enter
          information
        ―keyboard, mouse, scanner, etc.
     • Output devices
        ―monitor, speakers, printer, etc.
      Organization & Architecture
                • Secondary memories
                   ― storage devices
                   ― hard drives & solid-state
                      drives
                   ― also refers to
                     removable storage media
                     such as USB flash drives,
                     CDs & DVDs
USB – universal serial bus
CD – compact disc
DVD – digital versatile disc
Organization & Architecture (add)
• By definition (textbook, pg. 26)
   ― Computer Architecture
      • Those attributes of a system visible to the
        programmer, i.e. those attributes that have a direct impact
        on the logical execution of a program.
      • Includes the instruction set, the number of bits used to
        represent various data types, I/O mechanisms &
        techniques for addressing memory
     • Issue: whether a computer will have a multiply
       instruction
Organization & Architecture (add)
 ―Instruction set architecture
   • Instruction format, instruction opcodes, registers,
     instruction & data memory
     ―The effect of an executed instruction on the
      registers & memory
     ―Algorithm for controlling instruction execution
Organization & Architecture (add)
 ―Computer Organization
   • The operational units & their interconnections that
     realize the architectural specification
   • Includes hardware details such as
       ―Control signals, interfaces between the computer
        & peripherals, memory technology used
   • Issue: whether the multiply instruction will be implemented
     by a special multiply unit or by a mechanism that makes
     repeated use of the add unit in the system
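
The architecture/organization split can be sketched in code. This is an illustrative Python sketch only: the two function names are hypothetical, and Python's `*` operator merely stands in for a dedicated hardware multiplier. Both functions present the same programmer-visible behaviour (the architecture); they differ only in how the result is produced (the organization).

```python
def multiply_with_dedicated_unit(a: int, b: int) -> int:
    """Organization A (assumed): a special multiply unit yields the product directly."""
    return a * b


def multiply_with_add_unit(a: int, b: int) -> int:
    """Organization B (assumed): reuse the add unit, adding |a| to a total |b| times."""
    negative = (a < 0) != (b < 0)   # result is negative if exactly one operand is
    a, b = abs(a), abs(b)
    total = 0
    for _ in range(b):
        total = total + a           # the only arithmetic operation used is addition
    return -total if negative else total


if __name__ == "__main__":
    # Same visible result, different internal mechanism.
    print(multiply_with_dedicated_unit(6, 7), multiply_with_add_unit(6, 7))
```

A program that only issues "multiply" cannot tell which of the two organizations is underneath, apart from how long the operation takes.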
Organization & Architecture
• Architecture
   ―Intel x86 family: members share the same basic architecture
   ―IBM System/370 family: members share the same basic
     architecture
   ―A shared architecture gives code compatibility
     • At least backwards
   ―Organization differs between different versions
What is Computer? (add)
• Computer
   ―Complex system
   ―How to describe? Recognize its hierarchical nature
• Hierarchical system
   ―A set of interrelated subsystems
   ―The hierarchy is essential to both their design & their description
   ―Allows the designer to deal with one particular level of the
    system at a time
   ―At each level, the designer is concerned with structure &
    function
      Structure & Function
      • Structure → the way in which components relate to
        each other
         ―e.g. connection between ALU & control unit,
           connection between Instruction register & instruction
           decoder.
      • Function → the operation of individual components as
        part of the structure
         ―e.g. How the ALU, Instruction register & instruction
          decoder work.
ALU – arithmetic logic unit
Structure & Function
• Top level
   [Figure: top-level structure. The computer is connected to peripherals & communication lines; inside the computer are the central processing unit (CPU), main memory, input/output & the system interconnection.]
Structure & Function
• Computer structure:
  (Simple
   Single-Processor)
  ―Central processing unit (CPU)
    • controls the operation of the computer & performs
      its data processing functions; often simply called the processor.
Structure & Function
• Computer structure:
 (Simple
   Single-Processor)
  ―Main memory
    • stores data
Structure & Function
• Computer structure:
 (Simple
   Single-Processor)
  ―I/O
    • moves data between the computer & its external
      environment
Structure & Function
• Computer structure:
 (Simple
   Single-Processor)
  ―System interconnection
    • some mechanism that provides for communication
      among CPU, main memory & I/O.
Structure & Function
• Traditionally:
   ― a computer consists of a
     single CPU (single-core)
• Recent years:
   ― multiple processors are
     available in a single
     computer (multi-core)
Structure & Function
• Basic functions that a computer can perform:
   ―Data processing
   ―Data storage
   ―Data movement
   ―Control mechanism
   [Figure: functional view of a computer, showing the data movement apparatus, control mechanism, data storage facility & data processing facility]
      Structure & Function
      • Process data
         ―data can take a variety of forms, & the range of processing
           requirements is broad.
         ―There are only a few fundamental methods of data
           processing (see the ALU).
      • Store data
         ―computer must temporarily store at least those pieces of
           data that are being worked on at any given moment.
            • short-term data storage function (temporary registers)
            • long-term data storage function (stored files).
Structure & Function
• Move data
   1. Computer must be able to move data between itself &
       the outside world.
   2. If the device is directly connected to the computer
      • the process is input-output (I/O) & the device is a peripheral (printer, keyboard, etc.)
   3. If data are moved over longer distances
      • the process is data communication (transmitter & receiver)
• Control
   ― control of the THREE functions above, ultimately exercised by
      the individuals who provide the computer with instructions.
Structure & Function
• (a) Operation of data movement
   ― From one communications line/peripheral to another.
   [Figure: functional view with Movement, Control, Storage & Processing blocks]
Structure & Function
• (b) Operation of storage
   ― Data transferred from the external environment to computer
      storage (READ) & vice versa (WRITE)
   [Figure: functional view with Movement, Control, Storage & Processing blocks; READ & WRITE paths marked]
Structure & Function
• (c) Operation of processing data from storage to storage
   [Figure: functional view with Movement, Control, Storage & Processing blocks]
Structure & Function
• (d) Operation of processing data from storage to I/O or vice versa
   [Figure: functional view with Movement, Control, Storage & Processing blocks]
Structure & Function
• CPU level
   [Figure: the computer (I/O, memory, CPU & system bus) expanded to show the CPU: registers, arithmetic logic unit, control unit & internal CPU interconnection.]
      Structure & Function
      • CPU Level:
      • Registers
         ―provide storage internal to the CPU
CPU – central processing unit
      Structure & Function
      • CPU Level:
      • ALU
         ―performs data processing functions
      Structure & Function
      • CPU Level:
      • CPU interconnection
         ―mechanism that provides communication among the
          control unit, ALU & registers.
      Structure & Function
      • CPU Level:
      • Control unit
         ―controls the operation of the CPU & hence the
          computer.
Structure & Function
• Control unit level
   [Figure: the CPU (ALU, registers, control unit & internal bus) expanded to show the control unit: sequencing logic, control unit registers and decoders & control memory.]
Structure & Function
• Central Processing Unit (CPU)
Computer Evolution & Performance
Generation   Year        Technology                             Typical speed (operations per second)
    1        1946-1957   Vacuum tube                            40,000
    2        1958-1964   Transistor                             200,000
    3        1965-1971   Small & medium scale integration       1,000,000
    4        1972-1977   Large scale integration (LSI)          10,000,000
    5        1978-1991   Very large scale integration (VLSI)    100,000,000
    6        1991-       Ultra large scale integration (ULSI)   >1,000,000,000
Computer Evolution & Performance
• 1st Generation – Vacuum tube
   ―Electronic Numerical Integrator And Computer (ENIAC)
   ―Invented by Eckert & Mauchly
   ―From University of Pennsylvania
   ―Designed to compute trajectory tables for weapons
   ―Started 1943
      • World’s first general purpose electronic digital
        computer
   ―Finished 1946
      • Too late for war effort
   ―Used until 1955
Computer Evolution & Performance
• Specification of ENIAC
   ―Decimal (not binary)
   ―20 accumulators of 10 digits
   ―Programmed manually by switches
   ―18,000 vacuum tubes
   ―30 tons
   ―1,500 square feet
   ―140 kW power consumption
   ―5,000 additions per second
   ―Entering/altering programs – extremely tedious
      Computer Evolution & Performance
      • 1st Generation – Vacuum tube
         ―Electronic Discrete Variable Automatic Computer
           (EDVAC)
         ―Invented by John von Neumann/Alan Turing
         ―Based on stored program concept
         ―Started 1946
         ―Not completed until 1952, but became the
           prototype of all subsequent general-purpose
           computers
         ―Also known as IAS computer
IAS – refers to the Princeton Institute for Advanced Studies
Computer Evolution & Performance
• Specification
   ―Binary
   ―6,000 vacuum tubes + 12,000 diodes
   ―7.85 tonnes
   ―490 square feet (45.5 m2)
   ―56 kW power consumption
   ―1,160 additions & 340 multiplications per second
Computer Evolution & Performance
• IAS computer general structure
Computer Evolution & Performance
• Basic structure & function
   ―Main memory (M)
     • Stores both data & instruction
   ―Arithmetic Logic Unit (CA)
     • Operates on binary data
   ―Control unit (CC)
     • Interprets the instructions in memory & causes
       them to be executed
   ―Input-Output (I/O)
     • Operated by control unit
Computer Evolution & Performance
IAS Memory Format:
• Number word
    ―1000 x 40-bit words
    ―Bit 0 is the sign bit
• Instruction word
    ―2 x 20-bit instructions (left instruction & right instruction)
      • 8-bit operation code (opcode)
      • 12-bit address
Figure 1.7 IAS Memory Formats: (a) number word, bits 0-39 with the sign bit at bit 0; (b) instruction word, with the left instruction in bits 0-19 (opcode in bits 0-7, address in bits 8-19) & the right instruction in bits 20-39 (opcode in bits 20-27, address in bits 28-39).
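
The instruction-word layout can be checked with a few bit operations. This is a minimal Python sketch, not part of the textbook: the function names are hypothetical, the opcode and address values in the example are arbitrary, and it assumes bit 0 is the leftmost (most significant) bit, so the left instruction occupies the upper 20 bits of the 40-bit word.

```python
OPCODE_BITS = 8
ADDRESS_BITS = 12
INSTR_BITS = OPCODE_BITS + ADDRESS_BITS   # 20 bits per instruction
WORD_BITS = 2 * INSTR_BITS                # 40 bits per memory word


def pack_instruction(opcode: int, address: int) -> int:
    """Build one 20-bit instruction: 8-bit opcode followed by 12-bit address."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address must fit in 12 bits")
    return (opcode << ADDRESS_BITS) | address


def pack_word(left: int, right: int) -> int:
    """Place the left instruction in the upper 20 bits, the right in the lower 20."""
    return (left << INSTR_BITS) | right


def unpack_instruction(instr: int) -> tuple[int, int]:
    """Split a 20-bit instruction into (opcode, address)."""
    return instr >> ADDRESS_BITS, instr & ((1 << ADDRESS_BITS) - 1)


def unpack_word(word: int) -> tuple[tuple[int, int], tuple[int, int]]:
    """Split a 40-bit word into ((left opcode, left address), (right opcode, right address))."""
    left = word >> INSTR_BITS
    right = word & ((1 << INSTR_BITS) - 1)
    return unpack_instruction(left), unpack_instruction(right)


if __name__ == "__main__":
    word = pack_word(pack_instruction(0x01, 0x0FA), pack_instruction(0x05, 0x0FB))
    print(f"{word:010x}", unpack_word(word))
```

Ten hexadecimal digits are exactly 40 bits, which makes the two 20-bit halves easy to read off the printed word.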
Computer Evolution & Performance
• IAS computer
  details
  (pg. 36-39)
Computer Evolution & Performance
• Set of registers (storage in CPU)
   ―Memory Buffer Register (MBR)
      • Contains a word to be stored in memory/sent to the
        I/O unit
      • Also used to receive a word from memory/the I/O unit
   ―Memory Address Register (MAR)
      • Specifies the address in memory of the word to be
        written from/read into the MBR
   ―Instruction Register (IR)
      • Contains the 8-bit opcode of the instruction being executed
Computer Evolution & Performance
 ―Instruction Buffer Register (IBR)
   • Temporarily holds the right-hand instruction from a word
     in memory
 ―Program Counter (PC)
   • Contains the address of the next instruction pair to
     be fetched from memory
 ―Accumulator Register (AC) & Multiplier Quotient (MQ)
   • Temporarily hold operands & results of ALU operations
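
How these registers cooperate during instruction fetch can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the textbook's full flowchart: the class name `IASFetch` is hypothetical, branches into a right-hand instruction are ignored, execution is omitted, and each memory word is assumed to hold the left instruction in its upper 20 bits.

```python
class IASFetch:
    """Simplified fetch step using PC, MAR, MBR, IR and IBR (assumed model)."""

    def __init__(self, memory: list[int]) -> None:
        self.memory = memory
        self.pc = 0        # address of the next instruction pair
        self.mar = 0       # address used for the current memory access
        self.mbr = 0       # word most recently read from memory
        self.ir = 0        # 8-bit opcode of the current instruction
        self.ibr = None    # buffered right-hand instruction, if any

    def fetch(self) -> tuple[int, int]:
        if self.ibr is not None:
            # Right-hand instruction is already buffered: no memory access needed.
            instr = self.ibr
            self.ibr = None
            self.pc += 1                 # both halves used, move to the next pair
        else:
            self.mar = self.pc           # MAR <- PC
            self.mbr = self.memory[self.mar]   # MBR <- M(MAR)
            instr = self.mbr >> 20       # left instruction: upper 20 bits
            self.ibr = self.mbr & 0xFFFFF      # IBR <- right instruction
        self.ir = instr >> 12            # IR <- 8-bit opcode
        self.mar = instr & 0xFFF         # MAR <- 12-bit operand address
        return self.ir, self.mar


if __name__ == "__main__":
    cpu = IASFetch([0x10FA050FB])        # one word holding two arbitrary instructions
    print(cpu.fetch(), cpu.fetch(), cpu.pc)
```

The sketch shows why the IBR exists: one memory read supplies two instructions, so the second fetch needs no memory access at all.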
Computer Evolution & Performance
• Commercial computers
   ―1947 - Eckert-Mauchly Computer Corporation
   ―UNIVAC I (Universal Automatic Computer)
   ―US Bureau of Census 1950 calculations
   ―Became part of Sperry-Rand Corporation
   ―Late 1950s - UNIVAC II
     • Faster
     • More memory
      Computer Evolution & Performance
      • IBM
         ―Manufacturer of Punched-card processing equipment
         ―1953 - the 701
            • IBM’s first stored program computer
            • Scientific calculations
         ―1955 - the 702
            • Business applications
         ―Led to the 700/7000 series
IBM – International Business Machines Corporation
Computer Evolution & Performance
• 2nd Generation: Transistor
   ―Replaced vacuum tubes
   ―Smaller
   ―Cheaper
   ―Less heat dissipation
   ―Solid State device
   ―Made from Silicon (Sand)
   ―Invented 1947 at Bell Labs
   ―William Shockley et al
      Computer Evolution & Performance
      • Transistor based computers
         ―Second generation machines
         ―NCR & RCA produced small transistor machines
         ―IBM 7000
         ―DEC - 1957
            • Produced PDP-1 (Programmed Data Processor-1)
            • used for process control, scientific research & graphics
              applications as well as to pioneer timesharing systems.
            • also made it possible for smaller businesses and
              laboratories to have access to much more computing
              power than ever before.
NCR – National Cash Register Corporation
RCA – Radio Corporation of America
DEC – Digital Equipment Corporation
Computer Evolution & Performance
• Microelectronics
   ―Literally - “small electronics”
   ―A computer is made up of gates, memory cells and
    interconnections
   ―These can be manufactured on a semiconductor
   ―e.g. silicon wafer
Computer Evolution & Performance
• Moore's Law
   ―Increased density of components on chip
   ―Gordon Moore – co-founder of Intel
     • Number of transistors on a chip will double every year
   ―Since the 1970s development has slowed a little
     • Number of transistors doubles every 18 months
   ―Cost of a chip → has remained almost unchanged
   ―Higher packing density
     • Shorter electrical paths → higher performance, speed
   ―Smaller size → more convenient
   ―Reduced power & cooling requirements
   ―Fewer interconnections → increased reliability
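
The difference between the two doubling rates is easy to quantify. This is a minimal Python sketch; the function name and the starting count of 1,000 transistors are illustrative assumptions, not historical data.

```python
def projected_transistors(initial_count: float, years: float,
                          doubling_period_years: float) -> float:
    """Project a transistor count assuming one doubling per `doubling_period_years`."""
    return initial_count * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    start = 1_000   # hypothetical starting count
    for period in (1.0, 1.5):   # doubling every year vs. every 18 months
        print(period, projected_transistors(start, 3, period))
```

Over three years, doubling every year gives three doublings (a factor of 8), while doubling every 18 months gives two doublings (a factor of 4).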
Computer Evolution & Performance
                                                       Ref: https://humanswlord.wordpress.com/
Computer Evolution & Performance
                                                   Ref: www.intechopen.com
Trends in device count per chip & feature size of MOS devices. A DRAM cell consists of two devices: a cell transistor & a storage capacitor.
      Computer Evolution & Performance
      • IBM 360 series
          ―1964
          ―Replaced (& not compatible with) 7000 series
          ―First planned “family” of computers
            • Similar or identical instruction sets
            • Similar or identical O/S
            • Increasing speed
            • Increasing number of I/O ports (i.e. more terminals)
            • Increased memory size
            • Increased cost
          ―Multiplexed switch structure
     Computer Evolution & Performance
     • DEC PDP-8
        ―1964
        ―First minicomputer (after miniskirt!)
        ―Did not need air conditioned room
        ―Small enough to sit on a lab bench
        ―$16,000
          • $100k+ for IBM 360
        ―Embedded applications by OEMs
        ―Bus structure
DEC – Digital Equipment Corporation
OEMs – Original Equipment Manufacturers
Computer Evolution & Performance
• DEC PDP-8 structure
                              Ref: textbook, William Stallings
• Omnibus : Latin word, meaning “for all”.
Computer Evolution & Performance
• Semiconductor Memory
   ―1950s & 1960s - Magnetic-core memory
   ―Constructed from tiny rings of ferromagnetic material
   ―As fast as a millionth of a second to read a bit stored in memory
   ―Expensive & bulky
   ―Destructive readout
                                                                     http://echochamber.me/
Computer Evolution & Performance
• Semiconductor Memory
   ―1970
   ―Fairchild
   ―Size of a single core
   ―Holds 256 bits
   ―Non-destructive read
   ―Much faster than core (70 billionths of a second to read a bit)
   ―Capacity approximately doubles each year
   ―Expensive at first, but the price dropped over time
   ―13 generations
     • 1k, 4k, 16k, 64k, 256k, 1M, 4M, 16M, 64M, 256M, 1G, 4G & 8G bits on a single chip
Computer Evolution & Performance
• Evolution of Intel Microprocessors
• 1970s Processors
                        4004        8008       8080     8086                    8088
Introduced              1971        1972       1974     1978                    1979
Clock speeds            108 kHz     108 kHz    2 MHz    5 MHz, 8 MHz, 10 MHz    5 MHz, 8 MHz
Bus width               4 bits      8 bits     8 bits   16 bits                 8 bits
Number of transistors   2,300       3,500      6,000    29,000                  29,000
Feature size (µm)       10          8          6        3                       6
Addressable memory      640 bytes   16 KB      64 KB    1 MB                    1 MB
• 1980s Processors
                        80286               386™ DX           386™ SX           486™ DX CPU
Introduced              1982                1985              1988              1989
Clock speeds            6 MHz - 12.5 MHz    16 MHz - 33 MHz   16 MHz - 33 MHz   25 MHz - 50 MHz
Bus width               16 bits             32 bits           16 bits           32 bits
Number of transistors   134,000             275,000           275,000           1.2 million
Feature size (µm)       1.5                 1                 1                 0.8 - 1
Addressable memory      16 MB               4 GB              16 MB             4 GB
Virtual memory          1 GB                64 TB             64 TB             64 TB
Cache                   -                   -                 -                 8 kB
Computer Evolution & Performance
• 1990s Processors
                        486™ SX           Pentium            Pentium Pro             Pentium II
Introduced              1991              1993               1995                    1997
Clock speeds            16 MHz - 33 MHz   60 MHz - 166 MHz   150 MHz - 200 MHz       200 MHz - 300 MHz
Bus width               32 bits           32 bits            64 bits                 64 bits
Number of transistors   1.185 million     3.1 million        5.5 million             7.5 million
Feature size (µm)       1                 0.8                0.6                     0.35
Addressable memory      4 GB              4 GB               64 GB                   64 GB
Virtual memory          64 TB             64 TB              64 TB                   64 TB
Cache                   8 kB              8 kB               512 kB L1 and 1 MB L2   512 kB L2
• Recent Processors
                        Pentium III     Pentium 4       Core 2 Duo       Core i7 EE 4960X
Introduced              1999            2000            2006             2013
Clock speeds            450 - 660 MHz   1.3 - 1.8 GHz   1.06 - 1.2 GHz   4 GHz
Bus width               64 bits         64 bits         64 bits          64 bits
Number of transistors   9.5 million     42 million      167 million      1.86 billion
Feature size (nm)       250             180             65               22
Addressable memory      64 GB           64 GB           64 GB            64 GB
Virtual memory          64 TB           64 TB           64 TB            64 TB
Cache                   512 kB L2       256 kB L2       2 MB L2          1.5 MB L2/15 MB L3
Number of cores         1               1               2                6
Computer Evolution & Performance
Embedded system
• refers to the use of electronics & software within a
  product.
• tightly coupled to their environment.
• Deeply embedded
    ―Difficult to observe by
      programmer & user
    ―Use microcontroller
Ref: textbook William Stallings
Computer Evolution & Performance
• Internet of Things (IoT)
  ―The expanding interconnection of smart devices, ranging from
   appliances to tiny sensors.
  ―Enables new forms of communication between people &
   things, & between things themselves.
  ―The Internet supports the interconnection of billions of
   industrial & personal objects through cloud systems.
     Computer Evolution & Performance
     • Internet of Things (IoT)
         ―Four(4) generations of development
           1. IT – PCs, servers, routers, firewalls, etc.
           2. OT – medical machinery, SCADA, kiosks, etc.
           3. Personal tech. – smartphones, tablets & e-book readers
           4. Sensor/actuator tech. – single-purpose devices that, using
              wireless connectivity, become part of a larger system
IT – Information technology
OT – Operational technology
     Computer Evolution & Performance
     • Embedded Operating System
        ―TWO(2) approaches
          1. Use an existing OS & adapt it
             ― e.g. Linux, Windows & Mac
          2. Design & implement a new OS intended solely for embedded use
             ― e.g. TinyOS (WSN)
WSN – Wireless Sensor Networks
Computer Evolution & Performance
• Application vs Dedicated Processor
   ―Application
     • Able to execute complex operating systems such as
       Linux, Android & Chrome
     • General purpose in nature
     • e.g. smartphone
   ―Dedicated
     • Dedicated to one/a small number of specific tasks
     • Can be engineered to reduce size & cost
Computer Evolution & Performance
• Microprocessor vs Microcontroller
• Microprocessor
   ―Consists of registers, ALU & control unit/instruction
     processing logic
• Microcontroller
   ―Consists of processor, memory
    (ROM & RAM), clock & I/O control
    unit
   ―Slower than microprocessor
   ―No human interaction.
   ―Specific task
Computer Evolution & Performance
• Embedded vs Deeply Embedded Systems
• Embedded
   ―Uses general purpose processor
   ―Tightly coupled to its environment
• Deeply Embedded
   ―Has a processor whose behavior is difficult to observe
    for both the programmer & the user
   ―Uses microcontroller
   ―Not programmable
   ―No interaction with user
      Computer Evolution & Performance
      • ARM
         ―ARM Holdings, Cambridge, England
         ―RISC-based microprocessor & microcontroller
         ―High speed, small size & low power
         ―Apple iPhone & iPod
         ―Product:
           • Cortex-A, Cortex-A50 (Application processors)
           • Cortex-R (Real-time applications)
           • Cortex-M (Microcontroller)
RISC – Reduced Instruction Set Computer
Computer Evolution & Performance
• Cloud computing
   ―A model for enabling ubiquitous, convenient, on-demand
    network access to a shared pool of configurable computing
    resources.
   ―Used to back up data, sync devices & share
Designing for Performance
• Desktop application
   ―Requirements
     • Image processing
     • 3-D rendering
     • Speech recognition
     • Videoconferencing
     • Multimedia authoring
     • Voice & video annotation of files
     • Simulation modeling
Designing for Performance
• Results:
   ―Evolution of processors continues to bear out Moore’s
    Law
   ―New generation of chips every THREE(3) years
   ―Memory chips: DRAM capacity has quadrupled every
    THREE(3) years
   ―New circuits & the speed boost from reducing the distances
    between them have improved performance FOUR(4)- to
    FIVE(5)-fold every THREE(3) years
Designing for Performance
• Actual:
   ―Raw speed potential will not be achieved unless:
     • The processor is fed a constant stream of computer instructions
     • New techniques are used to exploit larger chips & greater density
Designing for Performance
Issue: Microprocessor speed
• Pipelining
   ―Enables the processor to work simultaneously on
    multiple instructions
• Branch prediction
   ―Increases the amount of work available for the processor
    to execute
• Superscalar execution
   ―Able to issue more than one instruction in every
    processor clock cycle.
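The benefit of pipelining can be illustrated with a rough cycle count. The sketch below is not from the slides: it assumes an idealized pipeline in which every stage takes one clock cycle and there are no stalls or branch penalties, and the function names are hypothetical.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles needed when each instruction finishes all stages before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Cycles for an ideal pipeline (assumption: one cycle per stage, no stalls).

    The first instruction needs n_stages cycles to fill the pipeline;
    every following instruction then completes one cycle later.
    """
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


if __name__ == "__main__":
    n, k = 100, 5  # hypothetical workload: 100 instructions, 5-stage pipeline
    slow = unpipelined_cycles(n, k)
    fast = pipelined_cycles(n, k)
    print(f"unpipelined: {slow} cycles, pipelined: {fast} cycles, "
          f"speedup: {slow / fast:.2f}x")
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows; real processors fall short of this because of hazards and mispredicted branches.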
Designing for Performance
• Data flow analysis
   ―Create optimized schedule of instructions
• Speculative execution
   ―Uses the results of branch prediction & data flow analysis
    to execute instructions ahead of time, keeping the
    processor’s execution engines as busy as possible.
Designing for Performance
• Processor speed has increased rapidly, but other
  components have not kept up
• Main problem
   ―Processor speed increased rapidly but the speed of data
     transfer between processor & memory lagged badly
   ―Processing time is lost while the processor waits
   → Performance balance:
     adjust/tune the organization & architecture
Designing for Performance
Performance Balance Consideration:
• Increase number of bits retrieved at one time
    ―Make DRAM “wider” rather than “deeper”
• Change DRAM interface
    ―Cache/buffering scheme
• Reduce frequency of memory access
    ―More complex cache & cache on chip
• Increase interconnection bandwidth
    ―High speed buses
    ―Hierarchy of buses
Designing for Performance
• Issue: I/O devices
• Peripherals with intensive I/O demands
   ―Large data throughput demands
   ―Problem: moving data between processor & peripheral
• Strategies:
   ―Caching/buffering scheme
   ―Higher-speed interconnection buses
   ―More elaborate bus structures
   ―Multiple-processor configurations
Designing for Performance
• Processor Design Consideration
   ―Evolution factors
     1. Performance changes in various technology areas
        ―Processor, buses, memory, peripherals
     2. New applications & new peripherals change the
        nature of the demand on the systems
        ―Instruction profile, data access patterns
Designing for Performance
THREE(3) approaches to increasing processor speed
1. Increase hardware speed of processor
   ―Fundamentally due to shrinking logic gate size
     • More gates, packed more tightly, increasing clock rate
     • Propagation time for signals reduced
2. Increase size & speed of caches
   ―Dedicating part of processor chip
     • Cache access times drop significantly
3. Change processor organization & architecture
   ―Increase effective speed of execution
   ―Parallelism
Designing for Performance
Other factors:
1. Power
   ―Power density ↑ with density of logic & clock speed
                 → Dissipating heat becomes difficult
2. RC delay
   ―Speed at which electrons flow limited by resistance &
    capacitance of metal wires
   ―Wire interconnects thinner, ↑ resistance
   ―Wires closer together, ↑ capacitance
                 → Delay ↑ as RC product ↑
Designing for Performance
3. Memory latency & throughput
   ―Memory access speed (latency) & data transfer speed
    (throughput) lag processor speed
   → Emphasis on organizational & architectural approaches
Designing for Performance
              Ref: textbook William Stallings
Designing for Performance
New Approach – Multiple Cores/Multicore
• Multiple processors on a single chip
    ―Large shared cache
    ―Within a processor, performance ↑ ∝ √(complexity ↑)
• If software can use multiple processors, doubling the number of
  processors almost doubles performance
    → two simpler processors > one more complex processor
• With two processors, larger caches are justified
    ―Power consumption: memory logic < processing logic
• IBM POWER4: 1st multi-core chip based on PowerPC
Designing for Performance
• POWER4 chip organization
Designing for Performance
Pentium Evolution (Self-reading)
• 8080
    ― first general purpose microprocessor
    ― 8 bit data path
    ― Used in first personal computer – Altair
• 8086
    ― much more powerful
    ― 16 bit
    ― instruction cache, pre-fetch few instructions
    ― 8088 (8 bit external bus) used in first IBM PC
• 80286
    ― 16 Mbyte memory addressable
    ― up from 1 Mbyte
• 80386
    ― 32 bit
    ― Support for multitasking
• 80486
    ― sophisticated powerful cache and instruction pipelining
    ― built in maths co-processor
• Pentium
    ― Superscalar
    ― Multiple instructions executed in parallel
• Pentium Pro
    ― Increased superscalar organization
    ― Aggressive register renaming
    ― branch prediction
    ― data flow analysis
    ― speculative execution
Designing for Performance
Pentium Evolution (Self-reading) (cont’d)
• Pentium II
    ― MMX technology
    ― graphics, video & audio processing
• Pentium III
    ― Additional floating point instructions for 3D graphics
• Pentium 4
    ― Note Arabic rather than Roman numerals
    ― Further floating point and multimedia enhancements
• Itanium
     ― 64 bit
• Itanium 2
     ― Hardware enhancements to increase speed
• See Intel web pages for detailed information on processors
Designing for Performance
PowerPC
• 1975, 801 minicomputer project (IBM) RISC
• Berkeley RISC I processor
• 1986, IBM commercial RISC workstation product, RT PC.
    ― Not commercial success
    ― Many rivals with comparable or better performance
• 1990, IBM RISC System/6000
    ― RISC-like superscalar machine
    ― POWER architecture
• IBM alliance with Motorola (68000 microprocessors), and Apple, (used 68000 in Macintosh)
• Result → PowerPC architecture
    ― Derived from the POWER architecture
    ― Superscalar RISC
    ― Apple Macintosh
    ― Embedded chip applications
Designing for Performance
PowerPC Family (cont’d)
• 601:
   ― Quickly to market. 32-bit machine
• 603:
   ― Low-end desktop and portable
   ― 32-bit
   ― Comparable performance with 601
   ― Lower cost and more efficient implementation
• 604:
   ― Desktop and low-end servers
   ― 32-bit machine
   ― Much more advanced superscalar design
   ― Greater performance
• 620:
   ― High-end servers
   ― 64-bit architecture
• 740/750:
   ― Also known as G3
   ― Two levels of cache on chip
• G4:
   ― Increases parallelism and internal speed
• G5:
   ― Improvements in parallelism and internal speed
   ― 64-bit organization
Designing for Performance
Internet resources
• http://www.intel.com/
    ―Search for the Intel Museum
• http://www.ibm.com
• http://www.dec.com
• Charles Babbage Institute
• PowerPC
• Intel Developer Home
Designing for Performance
TWO(2) LAWS: AMDAHL’S LAW & LITTLE’S LAW
• Amdahl’s Law (Gene Amdahl, 1967)
   ―Limitation
  $$\text{Speedup} = \frac{\text{Time to execute program on a single processor}}{\text{Time to execute program on } N \text{ parallel processors}} = \frac{T(1-f) + Tf}{T(1-f) + \frac{Tf}{N}} = \frac{1}{(1-f) + \frac{f}{N}}$$
  T : total execution time using a single processor
  f : fraction of execution time that can be run in parallel
  (1 − f) : fraction of execution time that is inherently serial
  N : number of processors
   ―f small → parallel processors have little effect
   ―N → ∞ → speedup bounded by $\frac{1}{1-f}$
Designing for Performance
• Example
• Suppose that a task makes extensive use
  of floating-point operations, with 40%
  of the time consumed by floating-point
  operations. With a new hardware design,
  the floating-point module is sped up by
  a factor of K. The overall speedup is then:
  $$\text{Speedup} = \frac{1}{0.6 + \frac{0.4}{K}}$$
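The floating-point example above can be checked numerically. The following is a minimal sketch, assuming the affected fraction is 0.4 as stated in the example; the function names `amdahl_speedup` and `amdahl_limit` are hypothetical and chosen only for illustration.

```python
def amdahl_speedup(f: float, factor: float) -> float:
    """Overall speedup when a fraction f of the run time is sped up by `factor`.

    `factor` plays the role of N (parallel processors) or K (faster module).
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - f) + f / factor)


def amdahl_limit(f: float) -> float:
    """Upper bound on the speedup as the factor grows without limit: 1 / (1 - f)."""
    return float("inf") if f == 1.0 else 1.0 / (1.0 - f)


if __name__ == "__main__":
    f = 0.4  # 40% of the time is spent in floating-point operations
    for k in (2, 10, 100):
        print(f"K = {k:>3}: speedup = {amdahl_speedup(f, k):.3f}")
    print(f"bound as K grows: {amdahl_limit(f):.3f}")
```

However large K becomes, the speedup in this example stays below 1/0.6, which is about 1.67, because the remaining 60% of the run time is unaffected.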
Designing for Performance
• Little’s Law
• Assumption
   ―A steady state system
   ―No leakage
                         $$L = \lambda W$$
  L : average number of items in the system
  λ : average arrival rate of items
  W : average time an item stays in the system
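A small numerical sketch of Little’s Law follows. The arrival rate and waiting time are made-up values for illustration, not figures from the course, and the function name is hypothetical.

```python
def items_in_system(arrival_rate: float, time_in_system: float) -> float:
    """Little's Law: L = lambda * W (steady state, no leakage assumed)."""
    if arrival_rate < 0 or time_in_system < 0:
        raise ValueError("arrival rate and time must be non-negative")
    return arrival_rate * time_in_system


if __name__ == "__main__":
    # Hypothetical server: 200 requests arrive per second on average,
    # and each request spends 0.05 s in the system on average.
    lam = 200.0   # items per second
    w = 0.05      # seconds
    print(f"average number of requests in the system: {items_in_system(lam, w):.1f}")
```

The same relation can be rearranged to estimate W = L/λ when the average occupancy and arrival rate are the quantities that can be measured.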
Designing for Performance
• Considerations for processor hardware evaluation &
  setting requirements
   ―Performance, Cost, Size, Security, Reliability, Power
    consumption
   ―Raw speed matters less than performance when executing
    a given application
• Application performance depends on
   ―Instruction set, Implementation language, Compiler
    efficiency, Programming skill
Designing for Performance
• Basic measurement of computer performance
   ―Clock speed
      • Also known as clock rate
      • Clock cycle
      • Cycle time
Designing for Performance
• Basic measurement of computer performance
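   ―Instruction execution rate
     $$CPI = \frac{\sum_{i=1}^{n} (CPI_i \times I_i)}{I_c}$$
  CPI_i : number of cycles required for instruction type i
  I_i : number of executed instructions of type i
  I_c : instruction count
   ―Processor time,
     $$T = I_c \times CPI \times \tau$$
  τ : constant cycle time (τ = 1/f, where f is the clock rate)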
Designing for Performance
• Basic measurement of computer performance
   ―Memory cycle time > processor cycle time
     $$T = I_c \times [p + (m \times k)] \times \tau$$
  p : number of processor cycles needed to decode & execute the instruction
  m : number of memory references needed
  k : ratio between memory cycle time & processor cycle time
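The refined processor-time formula can be sketched as below. All input values are hypothetical and chosen only to show how the memory term (m × k) inflates the effective cycles per instruction; the function name is likewise an assumption.

```python
def processor_time(ic: float, p: float, m: float, k: float, clock_hz: float) -> float:
    """T = Ic * [p + (m * k)] * tau, with tau = 1 / clock rate.

    ic       -- instruction count
    p        -- processor cycles to decode and execute one instruction
    m        -- memory references per instruction
    k        -- memory cycle time / processor cycle time
    clock_hz -- processor clock rate in Hz
    """
    if clock_hz <= 0:
        raise ValueError("clock rate must be positive")
    tau = 1.0 / clock_hz
    return ic * (p + m * k) * tau


if __name__ == "__main__":
    # Hypothetical program: 2 million instructions on a 400 MHz processor.
    ic, clock = 2_000_000, 400e6
    fast_memory = processor_time(ic, p=2, m=0.5, k=1, clock_hz=clock)
    slow_memory = processor_time(ic, p=2, m=0.5, k=4, clock_hz=clock)
    print(f"k = 1: {fast_memory * 1e3:.2f} ms, k = 4: {slow_memory * 1e3:.2f} ms")
```

With these assumed numbers, raising k from 1 to 4 increases the run time even though the processor itself is unchanged, which is the performance-balance problem described earlier.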
Designing for Performance
• Basic measurement of computer performance
   ―Million instructions per second (MIPS) rate
     $$\text{MIPS rate} = \frac{I_c}{T \times 10^6} = \frac{f}{CPI \times 10^6}$$
  f : clock rate (f = 1/τ)
   ―Millions of floating-point operations per second (MFLOPS) rate
     $$\text{MFLOPS rate} = \frac{\text{Number of executed floating-point operations in a program}}{\text{Execution time} \times 10^6}$$
Designing for Performance
Example:
Consider the execution of a program of 2 million instructions on a 400 MHz processor.
The instruction mix consists of four major instruction types:
  Instruction type                     CPI   Instruction mix (%)
  Arithmetic & logic                    1            60
  Load/store with cache hit             2            18
  Branch                                4            12
  Memory reference with cache miss      8            10
Solution
   $$CPI = 0.6 + (2 \times 0.18) + (4 \times 0.12) + (8 \times 0.1) = 2.24$$
   $$\text{MIPS rate} = \frac{400 \times 10^6}{2.24 \times 10^6} \approx 178$$
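The worked example above can be reproduced in a few lines. This sketch uses only the values given in the example (instruction mix, 400 MHz clock, 2 million instructions); the variable names are hypothetical, and the execution time is an extra quantity derived from T = Ic × CPI × τ.

```python
# (instruction type, CPI, fraction of the instruction mix)
INSTRUCTION_MIX = [
    ("Arithmetic & logic", 1, 0.60),
    ("Load/store with cache hit", 2, 0.18),
    ("Branch", 4, 0.12),
    ("Memory reference with cache miss", 8, 0.10),
]

CLOCK_HZ = 400e6               # 400 MHz processor
INSTRUCTION_COUNT = 2_000_000  # 2 million instructions

# Average CPI is the mix-weighted sum of the per-type CPI values.
cpi = sum(cycles * fraction for _, cycles, fraction in INSTRUCTION_MIX)

# MIPS rate = f / (CPI * 10^6)
mips_rate = CLOCK_HZ / (cpi * 1e6)

# Processor time T = Ic * CPI * tau, with tau = 1 / f
exec_time = INSTRUCTION_COUNT * cpi / CLOCK_HZ

if __name__ == "__main__":
    print(f"CPI       = {cpi:.2f}")
    print(f"MIPS rate = {mips_rate:.1f}")
    print(f"T         = {exec_time * 1e3:.1f} ms")
```

The mix fractions sum to 1, so the weighted sum is already the average CPI and no further division by the instruction count is needed.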
Designing for Performance
• Calculation
   ―Mean
      • Arithmetic mean
        $$AM = \frac{x_1 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
      • Geometric mean
        $$GM = \sqrt[n]{x_1 \times \cdots \times x_n} = \left(\prod_{i=1}^{n} x_i\right)^{1/n} = e^{\frac{1}{n}\sum_{i=1}^{n}\ln(x_i)}$$
Designing for Performance
• Calculation
   ―Mean
      • Harmonic mean
        $$HM = \frac{n}{\frac{1}{x_1} + \cdots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n}\frac{1}{x_i}}, \quad x_i > 0$$
      • Functional mean
        $$FM = f^{-1}\left(\frac{f(x_1) + \cdots + f(x_n)}{n}\right) = f^{-1}\left(\frac{1}{n}\sum_{i=1}^{n} f(x_i)\right)$$
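The four means can be written directly from their definitions. The sketch below is illustrative only: the function names and the sample data are assumptions, and it also shows that the functional mean reduces to the geometric mean for f = ln and to the harmonic mean for f(x) = 1/x.

```python
import math
from typing import Callable, Sequence


def arithmetic_mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


def geometric_mean(xs: Sequence[float]) -> float:
    # Computed through logarithms, matching the exponential form of the definition.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs: Sequence[float]) -> float:
    # Requires every x_i > 0.
    return len(xs) / sum(1.0 / x for x in xs)


def functional_mean(xs: Sequence[float],
                    f: Callable[[float], float],
                    f_inv: Callable[[float], float]) -> float:
    # Apply f, take the arithmetic mean, then map back with the inverse of f.
    return f_inv(sum(f(x) for x in xs) / len(xs))


if __name__ == "__main__":
    data = [1.0, 2.0, 4.0]  # hypothetical sample values
    print("AM:", arithmetic_mean(data))
    print("GM:", geometric_mean(data))
    print("HM:", harmonic_mean(data))
    print("FM with f = ln:", functional_mean(data, math.log, math.exp))
    print("FM with f = 1/x:", functional_mean(data, lambda x: 1 / x, lambda y: 1 / y))
```

For positive data that are not all equal, the results satisfy HM < GM < AM, which is why the choice of mean matters when summarizing benchmark results.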
Designing for Performance
• Benchmark principles
   ―MIPS
   ―MFLOPS
• SPEC benchmarks → SPEC CPU2006
   ―SPECviewperf     : 3D graphics
   ―SPECwpc          : workstations
   ―SPECjvm2008      : hardware & software
   ―SPECjbb2013      : commerce applications
   ―SPECsfs2008      : speed & request-handling capabilities of file servers
   ―SPECvirt_sc2013  : datacenter servers
Designing for Performance
• SPEC suites
   ―SPEC CPU89
   ―SPEC CPU92
   ―SPEC CPU95
   ―SPEC CPU2000
   ―SPEC CPU2006
Designing for Performance
• Each program is compiled & run THREE(3) times. The
  runtime is measured & the median value is selected,
  since execution time is not intrinsic to the
  program
• The 12 results are normalized → ratio of
  reference run time to system-under-test run time
• GM is calculated → yields the overall metric
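The three steps above (median of three runs, normalization against a reference time, geometric mean of the ratios) can be sketched as follows. The benchmark names and all timings are invented for illustration and are not real SPEC data; only two programs are shown instead of twelve to keep the example short.

```python
import statistics

# Hypothetical data: reference run time and three measured run times, in seconds.
MEASUREMENTS = {
    "program_a": {"reference": 1000.0, "runs": [100.0, 110.0, 90.0]},
    "program_b": {"reference": 500.0, "runs": [200.0, 190.0, 210.0]},
}


def spec_ratio(reference: float, runs: list) -> float:
    """Ratio of the reference time to the median measured time for one program."""
    return reference / statistics.median(runs)


def overall_metric(measurements: dict) -> float:
    """Geometric mean of the per-program ratios."""
    ratios = [spec_ratio(m["reference"], m["runs"]) for m in measurements.values()]
    return statistics.geometric_mean(ratios)


if __name__ == "__main__":
    for name, m in MEASUREMENTS.items():
        print(f"{name}: ratio = {spec_ratio(m['reference'], m['runs']):.2f}")
    print(f"overall metric (GM) = {overall_metric(MEASUREMENTS):.2f}")
```

A ratio above 1 means the system under test ran the program faster than the reference machine; `statistics.geometric_mean` requires Python 3.8 or later.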
Designing for Performance
• FOUR(4) metrics for benchmarks
   ―SPECint2006
     • Compiled with peak tuning
   ―SPECint_base2006
     • Compiled with base tuning
   ―SPECint_rate2006
     • Throughput ratio when compiled with peak tuning
   ―SPECint_rate_base2006
     • Throughput ratio when compiled with base tuning
Throughput: how many tasks can be accomplished in a given
amount of time
          Finish!
          Q&A
Intel Processors (add)
• 1 The 4-bit processors
   ―1.1 Intel 4004
• 2 The 8-bit processors
   ―2.1 8008
   ―2.2 8080
   ―2.3 8085
• 3 Microcontrollers
   ―3.1 Intel 8048
   ―3.2 Intel 8051
   ―3.3 Intel 80151
   ―3.4 Intel 80251
   ―3.5 MCS-96 Family
• 4 The bit-slice processor
   ―4.1 3000 Family
• 5 The 16-bit processors: MCS-86 family
   ―5.1 8086
   ―5.2 8088
   ―5.3 80186
   ―5.4 80188
   ―5.5 80286
• 6 32-bit processors: the non-x86 microprocessors
   ―6.1 iAPX 432
   ―6.2 i960 aka 80960
   ―6.3 i860 aka 80860
   ―6.4 XScale
• 7 32-bit processors: the 80386 range
   ―7.1 80386DX
   ―7.2 80386SX
   ―7.3 80376
   ―7.4 80386SL
   ―7.5 80386EX
• 8 32-bit processors: the 80486 range
   ―8.1 80486DX
   ―8.2 80486SX
   ―8.3 80486DX2
   ―8.4 80486SL
   ―8.5 80486DX4
• 9 32-bit processors: P5 microarchitecture
   ―9.1 Original Pentium
   ―9.2 Pentium with MMX Technology
• 10 32-bit processors: P6/Pentium M microarchitecture
   ―10.1 Pentium Pro
   ―10.2 Pentium II
   ―10.3 Celeron (Pentium II-based)
   ―10.4 Pentium III
   ―10.5 Pentium II and III Xeon
   ―10.6 Celeron (Pentium III Coppermine-based)
   ―10.7 Pentium III Tualatin-based
   ―10.8 Celeron (Pentium III Tualatin-based)
   ―10.9 Pentium M
   ―10.10 Celeron M
   ―10.11 Intel Core
   ―10.12 Dual-Core Xeon LV
• 11 32-bit processors: NetBurst microarchitecture
   ―11.1 Pentium 4
   ―11.2 Xeon
   ―11.3 Mobile Pentium 4-M
   ―11.4 Pentium 4 EE
   ―11.5 Pentium 4E
• 12 64-bit processors: IA-64
   ―12.1 Itanium
   ―12.2 Itanium 2
• 13 64-bit processors: Intel 64 – NetBurst microarchitecture
   ―13.1 Pentium 4F
   ―13.2 Pentium D
   ―13.3 Pentium Extreme Edition
   ―13.4 Xeon
• 14 64-bit processors: Intel 64 – Core microarchitecture
   ―14.1 Intel Core 2
   ―14.2 Intel Pentium Dual-Core
   ―14.3 Celeron
   ―14.4 Celeron M
• 15 64-bit processors: Intel 64 – Nehalem microarchitecture
   ―15.1 Intel Pentium
   ―15.2 Core i3
   ―15.3 Core i5
   ―15.4 Core i7
   ―15.5 Xeon
• 16 64-bit processors: Intel 64 – Sandy Bridge / Ivy Bridge microarchitecture
   ―16.1 Celeron
   ―16.2 Pentium
   ―16.3 Core i3
   ―16.4 Core i5
   ―16.5 Core i7
• 17 64-bit processors: Intel 64 – Haswell microarchitecture
• 18 64-bit processors: Intel 64 – Broadwell microarchitecture
• 19 64-bit processors: Intel 64 – Skylake microarchitecture
IBM processors (add)
• 1.1 Early developments
   ―1.1.1 The 801 research project
     • 1974
   ―1.1.2 The Cheetah project
     • 1982
   ―1.1.3 The America project
     • 1985
• 1.2 POWER
   ―1.2.1 POWER1 processors
     • 1990
     • RIOS-1 – the original 10-chip version
     • RIOS.9 – a less powerful version of RIOS-1
     • POWER1+ – a faster version of RIOS-1 made on a reduced fabrication process
     • POWER1++ – an even faster version of RIOS-1
     • RSC – a single-chip implementation of RIOS-1
     • RAD6000 – a radiation-hardened version of the RSC, made available primarily for use in space; it was a very popular design and was used extensively on many high-profile missions
• 1.3 POWER2
   ―1.3.1 POWER2 processors
     • 1993
     • POWER2 – 6 to 8 chips were mounted on a ceramic multi-chip module
     • POWER2+ – a cheaper 6-chip version of POWER2 with support for external L2 caches
     • P2SC – a faster, single-chip version of POWER2
     • P2SC+ – an even faster version of P2SC due to a reduced fabrication process
• 1.4 PowerPC
• 1.5 The Amazon project
• 1.6 POWER3
   ―1.6.1 POWER3 processors
     • POWER3 – Introduced in 1998, it combined the POWER and PowerPC instruction sets.
     • POWER3-II – A faster POWER3 fabricated on a reduced-size, copper-based process.
• 1.7 POWER4
   ―1.7.1 POWER4 processors
     • POWER4 – The first dual-core microprocessor and the first PowerPC processor to reach beyond 1 GHz.
     • POWER4+ – A faster POWER4 fabricated on a reduced process.
• 1.8 POWER5
   ―1.8.1 POWER5 processors
     • POWER5 – The iconic setup with four POWER5 chips and four L3 cache chips on a large multi-chip module.
     • POWER5+ – A faster POWER5 fabricated on a reduced process, mainly to reduce power consumption.
• 1.9 Power Architecture
• 1.10 POWER6
   ―1.10.1 POWER6 processors
     • POWER6 – Reached 5 GHz; comes in modules with a single chip, and in an MCM with two L3 cache chips.
     • POWER6+ – A minor update, fabricated on the same process as POWER6.
• 1.11 POWER7
   ―1.11.1 POWER7 processors
     • POWER7 – Comes in single-chip modules or in quad-chip MCM configurations for supercomputer applications.
     • POWER7+ – Scaled-down fabrication process, and increased L3 cache and frequency.
• 1.12 POWER8
     • 2015
• 1.13 POWER9
     • 2017
UserBenchmark:
AMD Ryzen 5 3600 vs Intel Core i5-9400F
Comparison
| Processor | Series Nomenclature | Code Name | Production Date | Supported Features (Instruction Set) | Clock Rate | Socket | Fabrication (micron) | TDP | Number of Cores | Bus Speed | L1 Cache | L2 Cache | L3 Cache | Overclock Capable |
| 4004 | | | Nov. 15, 1971 | | 740 kHz | DIP | 10 | | 1 | N/A | N/A | N/A | | |
| 8008 | N/A | N/A | April 1972 | N/A | 200 kHz - 800 kHz | DIP | 10 | | 1 | 200 kHz | N/A | N/A | N/A | |
| 8080 | N/A | N/A | April 1974 | N/A | 2 MHz - 3.125 MHz | DIP | 6 | | 1 | 2 MHz | N/A | N/A | N/A | |
| 8085 | N/A | N/A | March 1976 | N/A | 3 MHz, 5 MHz, 6 MHz | DIP | 3 | | 1 | 2 MHz | N/A | N/A | N/A | |
| 8086 | N/A | N/A | June 8, 1978 | N/A | 10 MHz, 8 MHz, 4.77 MHz | DIP | 3 | | 1 | 10 MHz, 8 MHz, 4.77 MHz | N/A | N/A | N/A | |
| 8088 | N/A | N/A | June 1979 | N/A | 8 MHz, 4.77 MHz | DIP | 3 | | 1 | 8 MHz, 4.77 MHz | N/A | N/A | N/A | |
| 80286 | N/A | N/A | Feb. 1982 | N/A | 12 MHz, 10 MHz, 6 MHz | DLPP | 1.5 | | 1 | 12 MHz, 10 MHz, 6 MHz | N/A | N/A | N/A | |
| i80386 | DX, SX, SL | N/A | 1985 - 1990 | N/A | 33 MHz, 25 MHz, 20 MHz, 16 MHz | DLPP | 1 - 1.5 | | 1 | 33 MHz, 25 MHz, 20 MHz, 16 MHz | N/A | N/A | N/A | |
| i80486 | DX, SX, DX2, DX4, SL | N/A | 1989 - 1992 | N/A | 25 MHz - 100 MHz | Socket 1, Socket 2, Socket 3 | 1 - 0.6 | | 1 | 25 MHz - 50 MHz | 8 KiB - 16 KiB | N/A | N/A | |
| Processor | Series Nomenclature | Code Name | Production Date | Supported Features (Instruction Set) | Clock Rate | Socket | Fabrication (nm) | TDP | Number of Cores | Bus Speed | L1 Cache | L2 Cache | L3 Cache | Overclock Capable |
| Intel Pentium | N/A | P5, P54C, P54CTB, P54CS | 1993 - 1999 | | 65 MHz - 250 MHz | Socket 2, Socket 3, Socket 4, Socket 5, Socket 7 | 800 nm - 350 nm | Unknown | 1 | 50 MHz - 66 MHz | 16 KiB | N/A | N/A | |
| Intel Pentium MMX | N/A | P55C, Tillamook | 1996 - 1999 | | 120 MHz - 300 MHz | Socket 7 | 350 nm - 250 nm | Unknown | 1 | 60 MHz - 66 MHz | 32 KiB | N/A | N/A | |
| Intel Atom | Z5xx, Z6xx, N2xx, 2xx, 3xx, N4xx, D4xx, D5xx, N5xx, D2xxx, N2xxx | Diamondville, Pineview, Silverthorne, Lincroft, Cedarview, Medfield, Clover Trail | 2008 - 2009 (as Centrino Atom), 2008–present (as Atom) | | 800 MHz - 2.13 GHz | Socket PBGA437, Socket PBGA441, Socket micro-FCBGA8 559 | 32 nm, 45 nm | 0.65 W - 13 W | 1, 2 or 4 | 400 MHz, 533 MHz, 667 MHz, 2.5 GT/s | 56 KiB per core | 512 KiB - 1 MiB | N/A | |
| Intel Celeron | 3xx, 4xx, 5xx | Banias, Cedar Mill, Conroe, Coppermine, Covington, Dothan, Mendocino, Northwood, Prescott, Tualatin, Willamette, Yonah, Merom, Penryn, Arrandale, Sand… | 1998–present | | 266 MHz - 3.6 GHz | Slot 1, Socket 370, Socket 478, Socket 479, Socket 495, LGA 775, Socket M, Socket P, FCBGA6, μFC-BGA 956, BGA479, Socket G1, BGA-1288, Socket G2, BGA-1023, Socket G3, BGA-1168, … | 14 nm, 22 nm, 32 nm, 45 nm, 65 nm, 90 nm, 130 nm, … | 4 W - 86 W | 1, 2 or 4 | 66 MHz, 100 MHz, 133 MHz, 400 MHz, … | 8 KiB - 64 KiB per core | 0 KiB - 1 MiB | 0 KiB - 2 MiB | |
| Processor | Series Nomenclature | Code Name | Production Date | Supported Features (Instruction Set) | Clock Rate | Socket | Fabrication (nm) | TDP | Number of Cores | Bus Speed | L1 Cache | L2 Cache | L3 Cache | Overclock Capable |
| Intel Pentium Pro | 52x | P6 | 1995 - 1998 | | 150 MHz - 200 MHz | Socket 8 | 350 nm, 500 nm | 29.2 W - 47 W | 1 | 60 MHz, 66 MHz | 16 KiB | 256 KiB, 512 KiB, 1024 KiB | N/A | |
| Intel Pentium II | 52x | Klamath, Deschutes, Tonga, Dixon | 1997 - 1999 | | 233 MHz - 450 MHz | Slot 1, MMC-1, MMC-2, Mini-Cartridge | 250 nm, 350 nm | 16.8 W - 38.2 W | 1 | 66 MHz, 100 MHz | 32 KiB | 256 KiB - 512 KiB | N/A | |
| Intel Pentium III | 52x, 53x | Katmai, Coppermine, Tualatin | 1999 - 2003 | | 450 MHz - 1.4 GHz | Slot 1, Socket 370 | 130 nm, 180 nm, 250 nm | 17 W - 34.5 W | 1 | 100 MHz, 133 MHz | 32 KiB | 256 KiB - 512 KiB | N/A | |
| Intel Xeon | n3xxx, n5xxx, n7xxx | Allendale, Cascades, Clovertown, Conroe, Cranford, Dempsey, Drake, Dunnington, Foster, Gainestown, Gallatin, Harpertown, Irwindale, Kentsfield, Nocona, Paxville, Potomac, Prestonia, Sossaman, Tanner, Tigerton, Tulsa, … | 1998–present | | 400 MHz - 4.4 GHz | Slot 2, Socket 603, Socket 604, Socket J, Socket T, Socket B, LGA 1150, LGA 1155, LGA 1156, LGA 1366, LGA 2011 | 45 nm, 65 nm, 90 nm, 130 nm, 180 nm, 250 nm | 16 W - 165 W | Up to 28 cores | 100 MHz, 133 MHz, 400 MHz, 533 MHz, 667 MHz, 800 MHz, 1066 MHz, 1333 MHz, 1600 MHz, 4.8 GT/s, 5.86 GT/s, 6.4 GT/s | 8 KiB ~ 64 KiB per core | 256 KiB - 12 MiB | 4 MiB - 16 MiB | |
| Processor | Series Nomenclature | Code Name | Production Date | Supported Features (Instruction Set) | Clock Rate | Socket | Fabrication (nm) | TDP | Number of Cores | Bus Speed | L1 Cache | L2 Cache | L3 Cache | Overclock Capable |
| Pentium 4 | 5xx, 6xx | Cedar Mill, Northwood, Prescott, Willamette | 2000 - 2008 | | 1.3 GHz - 3.8 GHz | Socket 423, Socket 478, LGA 775, Socket T | 65 nm, 90 nm, 130 nm, 180 nm | 21 W - 115 W | 1 (with Hyper-Threading) | 400 MHz, 533 MHz, 800 MHz, 1066 MHz | 8 KiB - 16 KiB | 256 KiB - 2 MiB | 2 MiB | |
| Pentium 4 | 5xx, 6xx | Gallatin, Prescott 2M | 2000 - 2008 | | 3.2 GHz - 3.73 GHz | Socket 478, Socket T | 90 nm, 130 nm | 92 W - 115 W | 1 (with Hyper-Threading) | 800 MHz, 1066 MHz | 8 KiB | 512 KiB - 1 MiB | 0 KiB - 2 MiB | |
| Pentium M | 7xx | Banias, Dothan | 2003 - 2008 | | 800 MHz - 2.266 GHz | Socket 479 | 90 nm, 130 nm | 5.5 W - 27 W | 1 | 400 MHz, 533 MHz | 32 KiB | 1 MiB - 2 MiB | N/A | |
| Pentium D/EE | 8xx, 9xx | Smithfield, Presler | 2005 - 2008 | | 2.66 GHz - 3.73 GHz | Socket T | 65 nm, 90 nm | 95 W - 130 W | 2 | 533 MHz, 800 MHz, 1066 MHz | 16 KiB per core | 2×1 MiB - 2×2 MiB | N/A | |
Comparison
Figure labels: Effects; Clock rate; Multi core; MICs; GPUs
MIC – MicroNational Cash Register Corporation
RCA – Radio Corporation of America
DEC – Digital Equipment Corporation