
C o Updated

The document outlines the fundamentals of computer organization and architecture, detailing various components such as the CPU, memory hierarchy, and input-output organization. It covers concepts like register transfer language, micro-operations, and the design of control units, alongside the functions of different memory types and arithmetic operations. Additionally, it discusses the implementation of bus systems for efficient data transfer between registers and the use of three-state buffers in constructing bus lines.

Uploaded by

thotasravani545
Copyright
© © All Rights Reserved

ND

II BSc SEM-III

BVK
Computer Organization
UNIT – I
Register Transfer Language and Micro Operations: Introduction- Functional units,
computer registers, register transfer language, register transfer, bus and memory
transfers, arithmetic, logic and shift micro-operations, arithmetic logic shift unit.
Basic Computer Organization and Design: Instruction codes, instruction cycle.
Register reference instructions, Memory – reference instructions, input – output and
interrupt.

UNIT – II
CPU and Micro Programmed Control: Central Processing Unit: Introduction,
instruction formats, addressing modes. Control memory, address sequencing, design of
control unit - hard wired control, micro programmed control.

UNIT – III
Memory Organization: Memory hierarchy, main memory, auxiliary memory,
associative memory, cache Memory and mappings.

UNIT – IV
Input-Output Organization: Peripheral Devices, input-output interface, asynchronous
data transfer, modes of transfer - programmed I/O, priority interrupt, direct memory
access, Input – Output Processor (IOP).

UNIT – V
Computer Arithmetic and Parallel Processing: Data representation- fixed point,
floating point, addition and subtraction, multiplication and division algorithms.
Parallel Processing-Parallel Processing, Pipelining, Arithmetic Pipeline, Instruction
Pipeline.
UNIT – I
Introduction
Computer organization and architecture together guide the design of computer systems.
Computer architecture refers to those attributes of a system that are visible to the user, such as
addressing techniques, the instruction set, and the number of bits used for data, and that have
a direct impact on the logical execution of a program. It defines the system in an abstract
manner and deals with what the system does.
Computer organization, in contrast, refers to the operational units and the interconnections
between them that achieve the architectural specification. It is the realization of the abstract
model and deals with how the system is implemented.
 A digital system is an interconnection of digital hardware modules.
 The modules are registers, decoders, arithmetic elements, and control logic.
 The various modules are interconnected with common data and control paths to
form a digital computer system.
 Digital modules are best defined by the registers they contain and the operations that
are performed on the data stored in them.
 The operations executed on data stored in registers are called micro operations.
 A micro operation is an elementary operation performed on the information stored in
one or more registers.
 The result of the operation may replace the previous binary information of a register
or may be transferred to another register.
 Examples of micro operations are shift, count, clear, and load.
 The internal hardware organization of a digital computer is best defined by specifying:
1. The set of registers it contains and their function.
2. The sequence of micro operations performed on the binary information stored in the
registers.
3. The control that initiates the sequence of micro operations.

 FUNCTIONAL UNITS
 A computer organization describes the functions and design of the various units of a
digital system.
 A general-purpose computer system is the best-known example of a digital system.
Other examples include telephone switching exchanges, digital voltmeters, digital
counters, electronic calculators and digital displays.
 Computer architecture deals with the specification of the instruction set and the
hardware units that implement the instructions.
 Computer hardware consists of electronic circuits, displays, magnetic and optical storage
media and also the communication facilities.
 Functional units are the parts of a computer system that perform the operations and
calculations called for by the computer program.
 A computer consists of the following main functional units: the input unit, the central
processing unit (which contains the arithmetic & logical unit and the control unit), the
memory unit and the output unit.

Fig: Functional units of a digital system


Input unit
 Input units are used by the computer to read data. The most commonly used input
devices are keyboards, mice, joysticks, trackballs, microphones, etc.
 The best-known input device is the keyboard. Whenever a key is pressed,
the corresponding letter or digit is automatically translated into its
binary code and transmitted over a cable to either the memory or the processor.
Central processing unit
 The central processing unit, commonly known as the CPU, is the electronic
circuitry within a computer that carries out the instructions of a computer
program by performing the basic arithmetic, logical, control and input/output (I/O)
operations specified by the instructions.
Memory unit
 The memory unit is the storage area that holds the programs that are running and the
data needed by those programs.
 Memory can be categorized into two types, namely primary memory and
secondary memory.
 Primary memory enables the processor to access running applications and services that are
temporarily stored in specific memory locations.
 Primary storage is the fastest memory and operates at electronic speeds. Primary
memory contains a large number of semiconductor storage cells, each capable of storing one bit
of information. The word length of a computer is typically between 16 and 64 bits.
 RAM is a volatile form of memory, which means that when the computer is shut
down, anything contained in RAM is lost.
 Cache memory is a small, fast memory used to fetch data quickly.
It is tightly coupled with the processor.
 The most common examples of primary memory are RAM and ROM.
 Secondary memory is used when a large amount of data and programs have to be
stored on a long-term basis.
 It is a non-volatile form of memory, which means the data is retained
irrespective of shutdown.
 The most common examples of secondary memory are magnetic disks, magnetic tapes,
and optical disks.
Arithmetic & logical unit
 Most arithmetic and logical operations of a computer are executed in the ALU
(Arithmetic and Logical Unit) of the processor. It performs arithmetic operations like
addition, subtraction, multiplication and division, and also logical operations like AND,
OR and NOT.
Control unit
 The control unit is a component of a computer's central processing unit that coordinates
the operation of the processor. It tells the computer's memory, arithmetic/logic unit and
input and output devices how to respond to a program's instructions.
 The control unit is also known as the nerve center of a computer system.
 Let us consider an example: the addition of two operands by the instruction
Add LOCA, R0. This instruction adds the operand at memory location LOCA to the operand in
register R0 and places the sum in register R0. Internally, this instruction is carried out in
several steps.
Output Unit
 The primary function of the output unit is to send the processed results to the user.
Output devices display information in a way that the user can understand.
 Output devices are pieces of equipment that are used to present information or any
other response processed by the computer. These devices display information that has
been held or generated within a computer.
 The most common example of an output device is a monitor.
3. COMPUTER REGISTERS
Computer registers are high-speed storage units inside the processor. A register can hold
any type of information, such as a bit sequence or a single data item.
A register is typically 32 bits long in a computer with 32-bit instructions. The number of
registers depends on the processor design and language rules.
The instructions in a computer are stored in memory locations and executed one after
another. The function of the control unit is to fetch an instruction from
memory and execute it. The control does the same for all the instructions in
memory, in sequential order.
A counter is needed to keep track of the next instruction to be executed and to
compute its address. Figure 2 shows the registers and their connection to memory. Memory
addresses are also held in registers. These requirements explain the need for
registers in a computer.
The data register holds the operand read from memory.
The accumulator is a general-purpose register used for processing.
The instruction register holds the instruction read from memory.
The temporary register holds temporary data used during processing.
The address register holds the address of the memory location to be accessed.
The Program Counter (PC) holds the address of the next instruction to be read from memory
and so controls the sequence in which instructions are read. When a branch instruction is
encountered, sequential execution is interrupted: a branch calls for a transfer to an
instruction that is not next in sequence, so the address of that instruction is loaded
into the PC.

Fig-2: Registers with their memories


The registers are also listed in Table 5.1 together with a brief description of their function
and the number of bits that they contain.

The input register (INPR) and output register (OUTPR) are the registers used for I/O
operations. The INPR receives an 8-bit character from the input device. Similarly,
the OUTPR holds an 8-bit character for the output device.
Computer registers are designated by upper case letters (and optionally followed by
digits or letters) to denote the function of the register.
For example, the register that holds an address for the memory unit is usually called a
memory address register and is designated by the name MAR.
Other designations for registers are PC (for program counter), IR (for instruction
register), and R1 (for processor register).
The individual flip-flops in an n-bit register are numbered in sequence from 0 through
n-1, starting from 0 in the rightmost position and increasing the numbers toward the
left.
Figure 4-1 shows the representation of registers in block diagram form.
The most common way to represent a register is by a rectangular box with the name of
the register inside, as in Fig. 4-1(a).
The individual bits can be distinguished as in (b).
The numbering of bits in a 16-bit register can be marked on top of the box as shown in
(c).
A 16-bit register is partitioned into two parts in (d). Bits 0 through 7 are assigned the
symbol L (for low byte) and bits 8 through 15 are assigned the symbol H (for high byte).
The name of the 16-bit register is PC. The symbol PC (0-7) or PC (L) refers to the
low-order byte and PC (8-15) or PC (H) to the high-order byte.
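
The bit numbering and the PC (L) / PC (H) notation described above can be tried out with a minimal Python sketch. The helper name bits and the sample register contents are assumptions made for illustration only; they are not part of the notation itself.

```python
def bits(value, low, high):
    """Return bits low..high (inclusive) of value, where bit 0 is the rightmost bit."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


# Hypothetical contents of the 16-bit register PC.
PC = 0b1010_0011_0101_1100

pc_low = bits(PC, 0, 7)    # PC (0-7) or PC (L): the low-order byte
pc_high = bits(PC, 8, 15)  # PC (8-15) or PC (H): the high-order byte

print(f"PC(L) = {pc_low:08b}, PC(H) = {pc_high:08b}")
```

Joining the two halves again, (pc_high << 8) | pc_low, gives back the original 16-bit value.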

REGISTER TRANSFER LANGUAGE(RTL): / 5. REGISTER TRANSFER


Information transfer from one register to another is designated in symbolic form by
means of a replacement operator.
The statement R2← R1 denotes a transfer of the content of register R1 into register R2.
It designates a replacement of the content of R2 by the content of R1.
By definition, the content of the source register R1 does not change after the transfer.
If we want the transfer to occur only under a predetermined control condition, this
can be shown by an if-then statement: if (P = 1) then R2 ← R1
P is a control signal generated by the control section.
We can separate the control variables from the register transfer operation by specifying a
control function.
A control function is a Boolean variable that is equal to 0 or 1.
The control function is included in the statement as P: R2 ← R1
The control condition is terminated by a colon. It implies that the transfer operation is
executed by the hardware only if P = 1.
Every statement written in register transfer notation implies a hardware construction
for implementing the transfer.
Figure 4-2 shows the block diagram that depicts the transfer from R1 to R2.

P may go back to 0 at time t+1; otherwise, the transfer will occur with every clock pulse
transition while P remains active.
Even though the control condition such as P becomes active just after time t, the actual
transfer does not occur until the register is triggered by the next positive transition of the
clock at time t +1.
The basic symbols of the register transfer notation are listed in the table below.

A comma is used to separate two or more operations that are executed at the same time.
The statement T : R2← R1, R1← R2 (exchange operation) denotes an operation that
exchanges the contents of two registers during one common clock pulse provided that
T=1.
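
A minimal Python sketch of a controlled transfer and of the exchange operation follows. It assumes that one call to the hypothetical function clock_pulse models one clock transition, and it stores the registers in a dictionary purely for illustration. The point it shows is that all source registers are sampled before any destination is changed, which is why R2 ← R1, R1 ← R2 exchanges the contents.

```python
def clock_pulse(regs, transfers, control):
    """Apply all (destination, source) transfers at once if the control function is 1."""
    if control != 1:
        return dict(regs)          # control condition not met: nothing changes
    old = dict(regs)               # sample every source before any register is loaded
    new = dict(regs)
    for dest, src in transfers:
        new[dest] = old[src]
    return new


regs = {"R1": 0b1010, "R2": 0b0101}

# P: R2 <- R1
after_copy = clock_pulse(regs, [("R2", "R1")], control=1)

# T: R2 <- R1, R1 <- R2 (exchange during one common clock pulse)
after_swap = clock_pulse(regs, [("R2", "R1"), ("R1", "R2")], control=1)

# Same statement with P = 0: no transfer takes place
unchanged = clock_pulse(regs, [("R2", "R1")], control=0)
```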
 BUS AND MEMORY TRANSFERS
A more efficient scheme for transferring information between registers in a multiple-register
configuration is a Common Bus System.
A common bus consists of a set of common lines, one for each bit of a register.
Control signals determine which register is selected by the bus during each particular
register transfer.
Different ways of constructing a Common Bus System
1. Using Multiplexers
2. Using Tri-state Buffers

 Common bus system with multiplexers:


The multiplexers select the source register whose binary information is then placed on the
bus. The construction of a bus system for four registers is shown in Figure 4-3 below.

The bus consists of four 4 x 1 multiplexers, each having four data inputs, 0 through 3, and
two selection inputs, S1 and S0.
For example, output 1 of register A is connected to input 0 of MUX 1 because this input
is labelled A1.
The diagram shows that the bits in the same significant position in each register are
connected to the data inputs of one multiplexer to form one line of the bus.
Thus MUX 0 multiplexes the four 0 bits of the registers, MUX 1 multiplexes the four 1
bits of the registers, and similarly for the other two bits.
The two selection lines S1 and S0 are connected to the selection inputs of all four
multiplexers.
The selection lines choose the four bits of one register and transfer them onto the
four-line common bus.
When S1S0 = 00, the 0 data inputs of all four multiplexers are selected and applied to the
outputs that form the bus.
This causes the bus lines to receive the content of register A, since the outputs of this
register are connected to the 0 data inputs of the multiplexers.
Similarly, register B is selected if S1S0 = 01, and so on.
Table 4-2 shows the register that is selected by the bus for each of the four possible binary
values of the selection lines.

In general, a bus system that multiplexes k registers of n bits each to produce an
n-line bus requires:
number of multiplexers = n
size of each multiplexer = k x 1
When the bus is included in the statement, the register transfer is symbolized as follows:
BUS← C, R1← BUS
The content of register C is placed on the bus, and the content of the bus is loaded into
register R1 by activating its load control input. If the bus is known to exist in the system,
it may be convenient just to show the direct transfer.
R1← C
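
The multiplexer bus described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names mux4 and bus_value and the sample register contents are hypothetical, and register C is assumed to be wired to data input 2 (S1S0 = 10), following the pattern A = 00, B = 01 given in the text.

```python
def mux4(inputs, s1, s0):
    """4 x 1 multiplexer: pass the data input selected by S1 S0."""
    return inputs[(s1 << 1) | s0]


def bus_value(registers, s1, s0, n=4):
    """Value on an n-line bus built from n multiplexers, one per bit position."""
    value = 0
    for i in range(n):
        # MUX i receives bit i of every register on its data inputs 0..3.
        bit_inputs = [(reg >> i) & 1 for reg in registers]
        value |= mux4(bit_inputs, s1, s0) << i
    return value


# Hypothetical contents of the four 4-bit registers A, B, C, D.
A, B, C, D = 0b0001, 0b0010, 0b0100, 0b1000

BUS = bus_value([A, B, C, D], s1=1, s0=0)  # BUS <- C
R1 = BUS                                   # R1 <- BUS
```

With k = 4 registers of n = 4 bits, the loop runs n times and each mux4 call models one k x 1 multiplexer, matching the general rule stated earlier.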
 Three-State Bus Buffers:
 A bus system can be constructed with three-state gates instead of multiplexers.
 A three-state gate is a digital circuit that exhibits three states.
 Two of the states are signals equivalent to logic 1 and 0 as in a conventional gate.
 The third state is a high-impedance state.
 The high-impedance state behaves like an open circuit, which means that the output is
disconnected and does not have logic significance.
 Because of this feature, a large number of three-state gate outputs can be connected with
wires to form a common bus line without endangering loading effects.
 The graphic symbol of a three-state buffer gate is shown in Fig. 4-4.
 It is distinguished from a normal buffer by having both a normal input and a control
input.
 The control input determines the output state. When the control input is equal to 1, the
output is enabled and the gate behaves like any conventional buffer, with the output
equal to the normal input.
 When the control input is 0, the output is disabled and the gate goes to a
high-impedance state, regardless of the value in the normal input.
 The construction of a bus system with three-state buffers is shown in Fig. 4

The outputs of four buffers are connected together to form a single bus line.
The control inputs to the buffers determine which of the four normal inputs will
communicate with the bus line.
 No more than one buffer may be in the active state at any given time. The connected
buffers must be controlled so that only one three-state buffer has access to the bus line
while all other buffers are maintained in a high-impedance state.
 One way to ensure that no more than one control input is active at any given time is to
use a decoder, as shown in the diagram.
 When the enable input of the decoder is 0, all of its four outputs are 0, and the bus line is
in a high-impedance state because all four buffers are disabled.
 When the enable input is active, one of the three-state buffers will be active, depending
on the binary value in the select inputs of the decoder.
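
As a rough illustration of one bus line built from three-state buffers and a decoder, consider the Python sketch below. Representing the high-impedance state by None is an assumption of this sketch, as are the function names; real hardware has no such value, the line is simply not driven.

```python
HIGH_Z = None  # stand-in for the high-impedance (disconnected) state


def decoder_2to4(s1, s0, enable):
    """2-to-4 decoder with enable: at most one output is 1."""
    outputs = [0, 0, 0, 0]
    if enable:
        outputs[(s1 << 1) | s0] = 1
    return outputs


def three_state_buffer(data, control):
    """Pass the normal input when control is 1, otherwise go to high impedance."""
    return data if control == 1 else HIGH_Z


def bus_line(bits_in, s1, s0, enable):
    """One bus line driven by four three-state buffers sharing a decoder."""
    controls = decoder_2to4(s1, s0, enable)
    outputs = [three_state_buffer(d, c) for d, c in zip(bits_in, controls)]
    driven = [o for o in outputs if o is not HIGH_Z]
    # The decoder guarantees that at most one buffer drives the line.
    return driven[0] if driven else HIGH_Z


# Hypothetical bit 0 of four registers A, B, C, D.
line = bus_line([1, 0, 1, 0], s1=0, s0=1, enable=1)   # selects the bit from B
floating = bus_line([1, 0, 1, 0], s1=0, s0=1, enable=0)
```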
Memory Transfer:
 The transfer of information from a memory word to the outside environment is called a
read operation.
 The transfer of new information to be stored into the memory is called a write operation.
 A memory word will be symbolized by the letter M.
 The particular memory word among the many available is selected by the memory
address during the transfer.
 It is necessary to specify the address of M when writing memory transfer operations.
 This will be done by enclosing the address in square brackets following the letter M.
 Consider a memory unit that receives the address from a register, called the address
register, symbolized by AR.
 The data are transferred to another register, called the data register, symbolized by DR.
 The read operation can be stated as follows:
 Read: DR ← M[AR]
 This causes a transfer of information into DR from the memory word M selected by the
address in AR.
 The write operation transfers the content of a data register to a memory word M
selected by the address. Assume that the input data are in register R1 and the address is
in AR.
 The write operation can be stated as follows:
 Write: M[AR] ← R1
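
The read and write statements map almost directly onto list indexing. The following Python sketch assumes a memory of 4096 words and hypothetical values in AR and R1; it is meant only to show the direction of each transfer.

```python
# Memory modelled as a list of words; the size of 4096 words is an assumption.
M = [0] * 4096

AR = 0x123    # hypothetical address held in the address register
R1 = 0xBEEF   # hypothetical data held in register R1

M[AR] = R1    # Write: M[AR] <- R1
DR = M[AR]    # Read:  DR <- M[AR]

print(hex(DR))
```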
 TYPES OF MICRO-OPERATIONS
a) Register Transfer Micro-operations: Transfer binary information from one register to another.
b) Arithmetic Micro-operations: Perform arithmetic operation on numeric data stored in
registers.
c) Logical Micro-operations: Perform bit manipulation operations on data stored in registers.
d) Shift Micro-operations: Perform shift operations on data stored in registers.
A register transfer micro-operation doesn't change the information content when the
binary information moves from the source register to the destination register.
The other three types of micro-operations change the information
content during the transfer.
Arithmetic Micro-operations:
The basic arithmetic micro-operations are
I. Addition
II. Subtraction
III. Increment
IV. Decrement
V. Shift
The arithmetic Micro-operation defined by the statement below specifies the add micro-
operation.
R3 ← R1 + R2
It states that the contents of R1 are added to the contents of R2 and the sum is transferred to R3.
To implement this statement, the hardware requires three registers and a digital component that
performs addition.
Subtraction is most often implemented through complementation and addition.
The subtract operation is specified by the following statement:
R3 ← R1 + R2' + 1
Here the minus operator is not used. R2' is the symbol for the 1's complement of R2.
Adding 1 to the 1's complement produces the 2's complement. Adding the contents of R1 to the
2's complement of R2 is equivalent to R1 - R2.
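
The subtraction rule R3 ← R1 + R2' + 1 can be checked numerically. The Python sketch below assumes 8-bit registers (the width is an arbitrary choice for illustration) and a hypothetical function name subtract.

```python
N = 8                    # assumed register width in bits
MASK = (1 << N) - 1      # keeps results within N bits


def subtract(r1, r2):
    """R3 <- R1 + R2' + 1, where R2' is the 1's complement of R2."""
    ones_complement = ~r2 & MASK
    return (r1 + ones_complement + 1) & MASK


r3 = subtract(13, 5)            # ordinary case: 13 - 5
r3_negative = subtract(5, 13)   # result is the 2's complement form of -8
```

When R1 is smaller than R2, the N-bit result is the 2's complement representation of the negative difference.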

Arithmetic Circuit:
The basic component of an arithmetic circuit is the parallel adder.
By controlling the data inputs to the adder, it is possible to obtain different types of
arithmetic operations.
The diagram of a 4-bit arithmetic circuit is shown in Fig. 4-9. It has four full-adder circuits
that constitute the 4-bit adder and four multiplexers for choosing different operations.
 There are two 4-bit inputs A and B and a 4-bit output D.
 The four inputs from A go directly to the X inputs of the binary adder.
 Each of the four inputs from B are connected to the data inputs of the multiplexers.
 The multiplexers data inputs also receive the complement of B.
 The other two data inputs are connected to logic-0 and logic-1.
 The four multiplexers are controlled by two selection inputs, S1 and S0. The input carry
Cin goes to the carry input of the FA in the least significant position. The other carries
are connected from one stage to the next.
 By controlling the value of Y with the two selection inputs S1 and S0 and making Cin
equal to 0 or 1, it is possible to generate the eight arithmetic micro operations listed in
Table 4-4.
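
The behaviour of this circuit can be sketched in Python as D = A + Y + Cin, where Y is chosen by the multiplexers. The sketch assumes that the multiplexer data inputs are ordered B, complement of B, logic-0, logic-1 for S1S0 = 00, 01, 10, 11; this ordering follows the order in which the text lists the inputs and is an assumption, as is the function name.

```python
N = 4
MASK = (1 << N) - 1


def arithmetic_circuit(a, b, s1, s0, cin):
    """Return (D, Cout) of a 4-bit adder whose Y input is selected by S1 S0."""
    # Assumed multiplexer input order: B, B', all 0's, all 1's.
    y_inputs = [b, ~b & MASK, 0, MASK]
    y = y_inputs[(s1 << 1) | s0]
    total = a + y + cin
    return total & MASK, total >> N


A, B = 0b0110, 0b0011   # hypothetical operands 6 and 3

for s1 in (0, 1):
    for s0 in (0, 1):
        for cin in (0, 1):
            d, cout = arithmetic_circuit(A, B, s1, s0, cin)
            print(f"S1S0Cin={s1}{s0}{cin}  D={d:04b}  Cout={cout}")
```

Under that assumed ordering, the eight combinations give add, add with carry, A plus the 1's complement of B, subtract, transfer A, increment A, decrement A, and transfer A again.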

Logic Micro-operations:
Logic micro operations specify binary operations for strings of bits stored in registers.
These operations consider each bit of the register separately and treat them as binary
variables.
For example, the exclusive-OR micro operation with the contents of two registers R1 and
R2 is symbolized by the statement
P: R1 ← R1 ⊕ R2
It specifies a logic micro operation to be executed on the individual bits of the registers
provided that the control variable P = 1.
List of Logic Micro operations:
There are 16 different logic operations that can be performed with two binary variables.
They can be determined from all possible truth tables obtained with two binary variables,
as shown in Table 4-5.

 The 16 Boolean functions of two variables x and y are expressed in algebraic form in the
first column of Table 4-6.
 The 16 logic micro operations are derived from these functions by replacing variable x
by the binary content of register A and variable y by the binary content of register B.
 The logic micro-operations listed in the second column represent a relationship between
the binary content of two registers A and B.
Some Applications:
 Logic micro-operations are very useful for manipulating individual bits or a portion of
a word stored in a register.
 They can be used to change bit values, delete a group of bits or insert new bits values into
a register.
 The following example shows how the bits of one register (designated by A) are
manipulated by logic micro operations as a function of the bits of another register
(designated by B).
a) Selective set
The selective-set operation sets to 1 the bits in register A where there are corresponding 1's in
register B. It does not affect bit positions that have 0's in B. The following numerical
example clarifies this operation:
The OR micro operation can be used to selectively set bits of a register.
b) Selective complement
The selective-complement operation complements bits in A where there are corresponding
1's in B. It does not affect bit positions that have 0's in B. For example:

The exclusive-OR micro operation can be used to selectively complement bits of a register.
c) Selective clear
The selective-clear operation clears to 0 the bits in A only where there are corresponding 1's
in B. For example:

The corresponding logic micro operation is A ← A ∧ B'


d) Mask
The mask operation is similar to the selective-clear operation except that the bits of A are
cleared only where there are corresponding 0's in B. The mask operation is an AND
micro operation, as seen from the following numerical example:

Insert: The insert operation inserts a new value into a group of bits. This is done by first
masking the bits and then ORing them with the required value.
For example, suppose that an A register contains eight bits, 0110 1010. To replace the four
leftmost bits by the value 1001 we first mask the four unwanted bits:

The mask operation is an AND micro operation and the insert operation is an OR micro
operation.
e) Clear: The clear operation compares the words in A and B and produces an all 0's
result if the two numbers are equal. This operation is achieved by an exclusive-OR
micro operation, as shown by the following example:
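
The applications above correspond directly to bitwise operators. The Python sketch below uses 4-bit sample values A = 1010 and B = 1100, which are chosen for illustration and are not necessarily the values used in the original numerical examples; the insert example uses the 8-bit value 0110 1010 from the text.

```python
MASK4 = 0b1111

A, B = 0b1010, 0b1100                   # illustrative 4-bit register contents

selective_set = A | B                   # set bits of A where B has 1's
selective_complement = A ^ B            # complement bits of A where B has 1's
selective_clear = A & (~B & MASK4)      # clear bits of A where B has 1's: A AND B'
mask = A & B                            # clear bits of A where B has 0's

# Insert: replace the four leftmost bits of 0110 1010 by 1001.
A8 = 0b0110_1010
masked = A8 & 0b0000_1111               # mask (AND) the four unwanted bits
inserted = masked | 0b1001_0000         # insert (OR) the new value

# Clear: XOR of two equal words gives all 0's.
cleared = A ^ A

for name, value in [("set", selective_set), ("complement", selective_complement),
                    ("clear", selective_clear), ("mask", mask)]:
    print(f"{name:10s} {value:04b}")
print(f"inserted   {inserted:08b}")
```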

Shift Micro operations:


 Shift micro operations are used for serial transfer of data.
 The contents of a register can be shifted to the left or the right.
 During a shift-left operation the serial input transfers a bit into the rightmost position.
 During a shift-right operation the serial input transfers a bit into the leftmost position.
 There are three types of shifts: logical, circular, and arithmetic.
 The symbolic notation for the shift micro operations is shown in Table 4-7.

Logical Shift:
 A logical shift is one that transfers 0 through the serial input.
 The symbols shl and shr are used for the logical shift-left and shift-right micro operations.
 The micro operations that specify a 1-bit shift to the left of the content of register R and a
1-bit shift to the right of the content of register R are shown in Table 4-7.
 The bit transferred to the end position through the serial input is assumed to be 0 during
a logical shift.
Circular Shift:
 The circular shift (also known as a rotate operation) circulates the bits of the register
around the two ends without loss of information.
 This is accomplished by connecting the serial output of the shift register to its serial
input.
 We will use the symbols cil and cir for the circular shift left and right, respectively.
Arithmetic Shift:
 An arithmetic shift is a micro operation that shifts a signed binary number to the left or
right.
 An arithmetic shift-left multiplies a signed binary number by 2.
 An arithmetic shift-right divides the number by 2.
 Arithmetic shifts must leave the sign bit unchanged because the sign of the number
remains the same when it is multiplied or divided by 2.
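
The three kinds of shift can be compared side by side in a short Python sketch. It assumes an 8-bit register and 2's complement signed numbers. The overflow flag returned by ashl is an addition of this sketch: it marks the case where a left shift would change the sign bit, so that the result no longer represents the number multiplied by 2.

```python
N = 8
MASK = (1 << N) - 1
SIGN = 1 << (N - 1)


def shl(r):
    """Logical shift left: 0 enters the rightmost position."""
    return (r << 1) & MASK


def shr(r):
    """Logical shift right: 0 enters the leftmost position."""
    return r >> 1


def cil(r):
    """Circular shift left: the leftmost bit re-enters on the right."""
    return ((r << 1) | (r >> (N - 1))) & MASK


def cir(r):
    """Circular shift right: the rightmost bit re-enters on the left."""
    return (r >> 1) | ((r & 1) << (N - 1))


def ashr(r):
    """Arithmetic shift right: the sign bit is kept and copied to the right."""
    return (r >> 1) | (r & SIGN)


def ashl(r):
    """Arithmetic shift left: returns (result, overflow flag)."""
    overflow = ((r >> (N - 1)) ^ (r >> (N - 2))) & 1
    return shl(r), overflow


R = 0b1001_0110  # hypothetical register contents (-106 in 2's complement)
print(f"shl {shl(R):08b}  shr {shr(R):08b}  cil {cil(R):08b}  "
      f"cir {cir(R):08b}  ashr {ashr(R):08b}")
```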
7. Arithmetic Logic Shift Unit:
Instead of having individual registers performing the micro operations directly,
computer systems employ a number of storage registers connected to a common
operational unit called an arithmetic logic unit, abbreviated ALU.
The ALU is a combinational circuit, so the entire register transfer operation from
the source registers through the ALU and into the destination register can be performed
during one clock pulse period.
The shift micro operations are often performed in a separate unit, but sometimes the
shift unit is made part of the overall ALU.
The arithmetic, logic, and shift circuits introduced in previous sections can be combined
into one ALU with common selection variables. One stage of an arithmetic logic shift
unit is shown in Fig. 4-13.
A particular micro operation is selected with inputs S1 and S0. A 4 x 1 multiplexer at the
output chooses between an arithmetic output in Di and a logic output in Ei.
The data inputs of this multiplexer are selected with inputs S3 and S2. The other two data inputs
to the multiplexer receive inputs Ai-1 for the shift-right operation and Ai+1 for the
shift-left operation.
The circuit whose one stage is specified in Fig. 4-13 provides eight arithmetic operations,
four logic operations, and two shift operations.
Each operation is selected with the five variables S3, S2, S1, S0 and Cin.
The input carry Cin is used for selecting an arithmetic operation only.
Table 4-8 lists the 14 operations of the ALU. The first eight are arithmetic operations and are
selected with S3S2 = 00.
The next four are logic operations and are selected with S3S2 = 01.
The input carry has no effect during the logic operations and is marked with don't-care
x's.
The last two operations are shift operations and are selected with S3S2 = 10 and 11.
The other three selection inputs have no effect on the shift.
Part-2: BASIC COMPUTER ORGANIZATION AND DESIGN
CONTENTS:
 Instruction Codes
 Instruction Cycle
 Register – Reference Instructions
 Memory – Reference Instructions
 Input – Output And Interrupt
1. Instruction Codes:
 The organization of the computer is defined by its internal registers, the timing and
control structure, and the set of instructions that it uses.
 Internal organization of a computer is defined by the sequence of micro-operations it
performs on data stored in its registers.
 Computer can be instructed about the specific sequence of operations it must perform.
 User controls this process by means of a Program.
 Program: set of instructions that specify the operations, operands, and the sequence by
which processing has to occur.
 Instruction: a binary code that specifies a sequence of micro-operations for the
computer.
 The computer reads each instruction from memory and places it in a control register.
The control then interprets the binary code of the instruction and proceeds to execute it
by issuing a sequence of micro-operations. This process is the instruction cycle.
 Instruction Code: a group of bits that instructs the computer to perform a specific
operation.
 An instruction code is usually divided into two parts: opcode and address (operand).

 Operation Code (opcode):


 group of bits that define the operation
 Eg: add, subtract, multiply, shift, complement.
 The number of bits required for the opcode depends on the number of operations available in the computer.
 An n-bit opcode can specify up to 2^n operations.
 Address (operand):
 specifies the location of operands (registers or memory words)
 Memory words are specified by their address
 Registers are specified by their k-bit binary code
 A k-bit address can specify up to 2^k registers.
Stored Program Organization:
 The ability to store and execute instructions is the most important property of a
general-purpose computer. This stored program concept is referred to as stored
program organization.
 The simplest way to organize a computer is to have one processor register and an
instruction code format with two parts. The first part specifies the operation to be
performed and the second specifies an address.
 The figure below shows the stored program organization.

 Instructions are stored in one section of memory and data in another.


 For a memory unit with 4096 words we need 12 bits to specify an address, since 2^12 =
4096.
 If we store each instruction code in one 16-bit memory word, we have available four
bits for the operation code (abbreviated opcode) to specify one out of 16 possible
operations, and 12 bits to specify the address of an operand.
 Accumulator (AC):
 Computers that have a single-processor register usually assign to it the name
accumulator and label it AC.
 The operation is performed with the memory operand and the content of AC.
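
The bit counts quoted above (12 address bits for 4096 words, leaving 4 opcode bits in a 16-bit word) can be reproduced with a short Python sketch. The function name split_instruction and the sample word are hypothetical, and the layout with the opcode in the high-order bits is an assumption of this sketch.

```python
WORD_BITS = 16
MEMORY_WORDS = 4096

# Number of bits needed to address every word: 2^12 = 4096.
ADDRESS_BITS = (MEMORY_WORDS - 1).bit_length()
OPCODE_BITS = WORD_BITS - ADDRESS_BITS


def split_instruction(word):
    """Split a 16-bit instruction word into (opcode, address)."""
    opcode = word >> ADDRESS_BITS
    address = word & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


opcode, address = split_instruction(0x2457)   # hypothetical instruction word
print(ADDRESS_BITS, OPCODE_BITS, opcode, hex(address))
```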
Addressing of Operand:
 It is sometimes convenient to use the address bits of an instruction code not as an address
but as the actual operand.
 When the second part of an instruction code specifies an operand, the instruction is
said to have an immediate operand.
 When the second part specifies the address of an operand, the instruction is said to
have a direct address.
 When the second part of the instruction designates the address of a memory word in which
the address of the operand is found, the instruction is said to have an indirect address.
 One bit of the instruction code can be used to distinguish between a direct and an
indirect address.
 The instruction code format is shown in Fig. 5-2(a). It consists of a 3-bit operation code, a
12-bit address, and an indirect address mode bit designated by I. The mode bit is 0 for a
direct address and 1 for an indirect address.

 A direct address instruction is shown in Fig. 5-2(b).


 It is placed in address 22 in memory. The I bit is 0, so the instruction is recognized as a
direct address instruction. The opcode specifies an ADD instruction, and the address
part is the binary equivalent of 457.
 The control finds the operand in memory at address 457 and adds it to the content of
AC.
 The instruction in address 35 shown in Fig. 5-2(c) has a mode bit I = 1.
 Therefore, it is recognized as an indirect address instruction.
 The address part is the binary equivalent of 300. The control goes to address 300 to find
the address of the operand. The address of the operand in this case is 1350.
 The operand found in address 1350 is then added to the content of AC.
 The effective address is defined to be the address of the operand in a computation-type
instruction or the target address in a branch-type instruction.
 Thus the effective address in the instruction of Fig. 5-2(b) is 457 and in the instruction
of Fig 5-2(c) is 1350.
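The direct and indirect cases of Fig. 5-2 can be sketched in Python as follows. This is a minimal illustration, not the hardware itself: memory is modelled as a dictionary, the helper names (encode, decode, effective_address) are hypothetical, and the ADD opcode value 001 is an assumption about the opcode assignment.

```python
def encode(i_bit, opcode, address):
    """Pack I (1 bit), opcode (3 bits) and address (12 bits) into a 16-bit word."""
    return (i_bit << 15) | (opcode << 12) | (address & 0xFFF)


def decode(word):
    """Split a 16-bit instruction word into (I, opcode, address)."""
    return (word >> 15) & 0x1, (word >> 12) & 0x7, word & 0xFFF


def effective_address(word, memory):
    """Direct: the address field itself. Indirect: the address stored at that location."""
    i_bit, _, address = decode(word)
    return (memory[address] & 0xFFF) if i_bit else address


ADD = 0b001                      # assumed opcode value for ADD
memory = {300: 1350}             # location 300 holds the address of the operand
direct = encode(0, ADD, 457)     # instruction of Fig. 5-2(b)
indirect = encode(1, ADD, 300)   # instruction of Fig. 5-2(c)
print(effective_address(direct, memory), effective_address(indirect, memory))  # 457 1350
```

The indirect case costs one extra memory reference, because the address field must first be used to read the operand address.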
Instruction Cycle:
 A program residing in the memory unit of the computer consists of a sequence of
instructions.
 The program is executed in the computer by going through a cycle for each instruction.
 Each instruction cycle in turn is subdivided into a sequence of sub cycles or phases.
 In the basic computer each instruction cycle consists of the following phases:
1. Fetch an instruction from memory.
2. Decode the instruction.
3. Read the effective address from memory if the instruction has an indirect address.
4. Execute the instruction.
 Upon the completion of step 4, the control goes back to step 1 to fetch, decode, and
execute the next instruction.
Fetch and Decode:
 Initially, the program counter PC is loaded with the address of the first instruction in
the program.
 The sequence counter SC is cleared to 0, providing a decoded timing signal T0.
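 The micro operations for the fetch and decode phases can be specified by the following
register transfer statements:
T0: AR ← PC
T1: IR ← M[AR], PC ← PC + 1
T2: D0, ..., D7 ← Decode IR(12-14), AR ← IR(0-11), I ← IR(15)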

 Figure 5-8 shows how the first two register transfer statements are implemented in the
bus system.
 To provide the data path for the transfer of PC to AR we must apply timing signal T0 to
achieve the following connection:
o Place the content of PC onto the bus by making the bus selection inputs S2, S1, S0 equal
to 010.
o Transfer the content of the bus to AR by enabling the LD input of AR.
 In order to implement the second statement it is necessary to use timing signal T1 to
provide the following connections in the bus system.
o Enable the read input of memory.
o Place the content of memory onto the bus by making S2S1S0=111.
o Transfer the content of the bus to IR by enabling the LD input of IR.
o Increment PC by enabling the INR input of PC.
 Multiple input OR gates are included in the diagram because there are other control
functions that will initiate similar operations.
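The fetch and decode phases can be traced in software. The sketch below assumes a 12-bit PC and AR, a 16-bit IR, and a dictionary standing in for memory; the function name fetch_and_decode and the returned field names are hypothetical.

```python
def fetch_and_decode(pc, memory):
    """Trace timing steps T0, T1 and T2 for one instruction."""
    ar = pc                               # T0: AR <- PC
    ir = memory[ar]                       # T1: IR <- M[AR]
    pc = (pc + 1) & 0xFFF                 # T1: PC <- PC + 1
    opcode = (ir >> 12) & 0x7             # T2: decode IR(12-14)
    d = [int(k == opcode) for k in range(8)]  # exactly one decoder output is 1
    ar = ir & 0xFFF                       # T2: AR <- IR(0-11)
    i_bit = (ir >> 15) & 0x1              # T2: I <- IR(15)
    return {"PC": pc, "AR": ar, "IR": ir, "I": i_bit, "D": d}


# A direct ADD 457 (assumed opcode 001) stored at address 22.
state = fetch_and_decode(22, {22: 0x11C9})
print(state["PC"], state["AR"], state["I"], state["D"])
```

After these three steps PC already points to the next instruction, AR holds the address part, and only one of the decoder outputs D0 to D7 is active.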
Determine the Type of Instruction:
 The timing signal that is active after the decoding is T3.
 During time T3, the control unit determines the type of instruction that was read from
the memory.
 The flowchart of Fig. 5-9 shows the initial configurations for the instruction cycle and
also how the control determines the instruction type after the decoding.
 Decoder output D7 is equal to 1 if the operation code is equal to binary 111.
 If D7=1, the instruction must be a register-reference or input-output type.
 If D7 = 0, the operation code must be one of the other seven values 000 through 110,
specifying a memory-reference instruction. Control then inspects the value of the first
bit of the instruction, which is now available in flip-flop I.
 If D7 = 0 and I = 1, the instruction is a memory-reference instruction with an indirect
address, so it is then necessary to read the effective address from memory.
 If D7 = 0 and I = 0, the instruction is a memory-reference instruction with a direct address.
 If D7 = 1 and I = 0, the instruction is a register-reference instruction.
 If D7 = 1 and I = 1, the instruction is an input-output instruction.
 The three instruction types are subdivided into four separate paths.
 The selected operation is activated with the clock transition associated with timing
signal T3.
 This can be symbolized as follows:
D7'IT3: AR ← M[AR]
D7'I'T3: Nothing
D7I'T3: Execute a register-reference instruction
D7IT3: Execute an input-output instruction
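The four paths selected by D7 and I can be expressed as a small classification routine. This is only a sketch; the function name instruction_type and the returned labels are chosen here for illustration.

```python
def instruction_type(ir):
    """Classify a 16-bit instruction word using D7 and the mode bit I."""
    i_bit = (ir >> 15) & 0x1
    d7 = int(((ir >> 12) & 0x7) == 0b111)   # D7 = 1 only for opcode 111
    if d7 == 0:
        return "memory-reference (indirect)" if i_bit else "memory-reference (direct)"
    return "input-output" if i_bit else "register-reference"


for word in (0x11C9, 0x912C, 0x7800, 0xF800):
    print(hex(word), instruction_type(word))
```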
Register-Reference Instructions:
 Register-reference instructions are recognized by the control when D7 = 1 and I=0.
 These instructions use bits 0 through 11 of the instruction code to specify one of 12
instructions.
 These 12 bits are available in IR (0-11).
 The control functions and micro operations for the register-reference instructions are
listed in Table 5-3.
 These instructions are executed with the clock transition associated with timing
variable T3.
 Each control function needs the Boolean relation D7I'T3, which we designate for convenience
by the symbol r.
 By assigning the symbol Bi to bit i of IR, all control functions can be simply denoted by
rBi.
 For example, the instruction CLA has the hexadecimal code 7800, which gives the
binary equivalent 0111 1000 0000 0000. The first bit is a zero and is equivalent to I'.
 The next three bits constitute the operation code and are recognized from decoder
output D7.
 Bit 11 in IR is 1 and is recognized from B11. The control function that initiates the micro
operation for this instruction is D7I'T3B11 = rB11.
 The execution of a register-reference instruction is completed at time T3.
 The sequence counter SC is cleared to 0 and the control goes back to fetch the next
instruction with timing signal T0.
 The first seven register-reference instructions perform clear, complement, circular shift,
and increment micro operations on the AC or E registers.
 The next four instructions cause a skip of the next instruction in sequence when a stated
condition is satisfied. The skipping of the instruction is achieved by incrementing PC
once again.
 The condition control statements must be recognized as part of the control conditions.
 The AC is positive when the sign bit in AC(15) = 0; it is negative when AC(15) = 1. The
content of AC is zero (AC = 0) if all the flip-flops of the register are zero.
 The HLT instruction clears a start-stop flip-flop S and stops the sequence counter from
counting.
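The rBi decoding can be sketched in Python. The mapping of bit positions to mnemonics below (CLA at B11 down to HLT at B0) is an assumption taken from the usual basic-computer instruction table, since Table 5-3 is not reproduced here; the function name is hypothetical.

```python
# Assumed bit-to-mnemonic assignment for the 12 register-reference instructions.
REGISTER_REFERENCE = {
    11: "CLA", 10: "CLE", 9: "CMA", 8: "CME", 7: "CIR", 6: "CIL",
    5: "INC", 4: "SPA", 3: "SNA", 2: "SZA", 1: "SZE", 0: "HLT",
}


def decode_register_reference(ir):
    """Return the mnemonic selected by r*Bi, or None if r = D7 I' is not satisfied."""
    i_bit = (ir >> 15) & 0x1
    opcode = (ir >> 12) & 0x7
    if opcode != 0b111 or i_bit != 0:
        return None                       # not a register-reference instruction
    set_bits = [b for b in range(12) if (ir >> b) & 0x1]
    if len(set_bits) != 1:
        return None                       # sketch assumes exactly one Bi is set
    return REGISTER_REFERENCE[set_bits[0]]


print(decode_register_reference(0x7800))  # CLA
```

Hexadecimal 7800 has I = 0, opcode 111 and only bit 11 set, so the control function rB11 is the one that fires.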

Memory-Reference Instructions:
 Table 5-4 lists the seven memory-reference instructions.
 The decoded output Di for i = 0, 1, 2, 3, 4, 5, and 6 from the operation decoder that
belongs to each instruction is included in the table.
 The effective address of the instruction is in the address register AR and was placed
there during timing signal T2 when I= 0, or during timing signal T3 when I = 1.
 The execution of the memory-reference instructions starts with timing signal T4.
 The symbolic description of each instruction is specified in the table in terms of register
transfer notation.

AND to AC:
 This is an instruction that performs the AND logic operation on pairs of bits in AC and
the memory word specified by the effective address.
 The result of the operation is transferred to AC.
 The micro operations that execute this instruction are:
D0T4: DR ← M[AR]
D0T5: AC ← AC ∧ DR, SC ← 0

ADD to AC:
 This instruction adds the content of the memory word specified by the effective
address to the value of AC.
 The sum is transferred into AC and the output carry Cout is transferred to the E
(extended accumulator) flip-flop.
 The micro operations needed to execute this instruction are:
D1T4: DR ← M[AR]
D1T5: AC ← AC + DR, E ← Cout, SC ← 0
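The role of the E flip-flop in ADD can be illustrated with 16-bit arithmetic in Python. This is a sketch assuming a 16-bit AC; the function name add_to_ac is hypothetical.

```python
MASK16 = 0xFFFF


def add_to_ac(ac, operand):
    """Return (new AC, E): the 16-bit sum and the carry out of bit 15."""
    total = (ac & MASK16) + (operand & MASK16)
    return total & MASK16, total >> 16    # the 17th bit of the sum is Cout


print(add_to_ac(0xFFFF, 0x0001))  # (0, 1)
```

Adding 1 to an AC that holds all 1's leaves AC = 0 and sets E to 1, which is how a carry out of the most significant bit is preserved.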

LDA: Load to AC
 This instruction transfers the memory word specified by the effective address to AC.
 The micro operations needed to execute this instruction are:
D2T4: DR ← M[AR]
D2T5: AC ← DR, SC ← 0

STA: Store AC
 This instruction stores the content of AC into the memory word specified by the
effective address.
 Since the output of AC is applied to the bus and the data input of memory is connected
to the bus, we can execute this instruction with one micro operation:
D3T4: M[AR] ← AC, SC ← 0

BUN: Branch Unconditionally


 This instruction transfers the program to the instruction specified by the effective
address.
 The BUN instruction allows the programmer to specify an instruction out of sequence
and we say that the program branches (or jumps) unconditionally.
 The instruction is executed with one micro operation:
D4T4: PC ← AR, SC ← 0

BSA: Branch and Save Return Address


 This instruction is useful for branching to a portion of the program called a subroutine
or procedure.
 When executed, the BSA instruction stores the address of the next instruction in
sequence (which is available in PC) into a memory location specified by the effective
address.
 The effective address plus one is then transferred to PC to serve as the address of the
first instruction in the subroutine.
 This operation is specified by the following register transfer:
M[AR] ← PC, PC ← AR + 1

 A numerical example that demonstrates how this instruction is used with a subroutine
is shown in Fig. 5-10.

 The BSA instruction is assumed to be in memory at address 20.


 The I bit is 0 and the address part of the instruction has the binary equivalent of 135.
 After the fetch and decode phases, PC contains 21, which is the address of the next
instruction in the program (referred to as the return address). AR holds the effective
address 135.
 This is shown in part (a) of the figure.
 The BSA instruction performs the following numerical operation:
M[135] ← 21, PC ← 135 + 1 = 136

 The result of this operation is shown in part (b) of the figure.


 The return address 21 is stored in memory location 135 and control continues with the
subroutine program starting from address 136.
 The return to the original program (at address 21) is accomplished by means of an
indirect BUN instruction placed at the end of the subroutine.
 When this instruction is executed, control goes to the indirect phase to read the
effective address at location 135, where it finds the previously saved address 21.
 When the BUN instruction is executed, the effective address 21 is transferred to PC.
 The next instruction cycle finds PC with the value 21, so control continues to execute the
instruction at the return address.
 The BSA instruction must be executed with a sequence of two micro operations:
D5T4: M[AR] ← PC, AR ← AR + 1
D5T5: PC ← AR, SC ← 0
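The subroutine call and return of Fig. 5-10 can be traced with a short Python sketch. Memory is modelled as a dictionary and the function names bsa and bun_indirect are hypothetical.

```python
def bsa(pc, ar, memory):
    """Branch and save return address: M[AR] <- PC, then PC <- AR + 1."""
    memory[ar] = pc          # save the return address at the effective address
    return ar + 1            # the subroutine starts at the next location


def bun_indirect(address, memory):
    """Indirect BUN: the effective address is read from M[address] and loaded into PC."""
    return memory[address]


memory = {}
pc = bsa(pc=21, ar=135, memory=memory)   # BSA 135 fetched from address 20, so PC = 21
print(memory[135], pc)                   # 21 136
pc = bun_indirect(135, memory)           # return placed at the end of the subroutine
print(pc)                                # 21
```

Because the return address is stored in the first word of the subroutine, an indirect BUN through that word is enough to return to the calling program.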

ISZ: Increment and Skip if Zero


 This instruction increments the word specified by the effective address, and if the
incremented value is equal to 0, PC is incremented by 1 to skip the next instruction in
the program.
 Since it is not possible to increment a word inside the memory, it is necessary to read
the word into DR, increment DR, and store the word back into memory.
 This is done with the following sequence of micro operations:
D6T4: DR ← M[AR]
D6T5: DR ← DR + 1
D6T6: M[AR] ← DR, if (DR = 0) then (PC ← PC + 1), SC ← 0
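The read, increment and write-back steps of ISZ can be sketched as follows, assuming 16-bit memory words and a 12-bit PC. The function name isz is hypothetical.

```python
def isz(ar, pc, memory):
    """Increment M[AR]; skip the next instruction if the result is zero. Returns the new PC."""
    dr = memory[ar]                  # DR <- M[AR]
    dr = (dr + 1) & 0xFFFF           # DR <- DR + 1 (wraps around at 16 bits)
    memory[ar] = dr                  # M[AR] <- DR
    if dr == 0:
        pc = (pc + 1) & 0xFFF        # skip: PC <- PC + 1
    return pc


memory = {100: 0xFFFF}               # -1 in 2's complement
print(isz(100, 50, memory), memory[100])  # 51 0
```

A word that holds -1 becomes 0 after the increment, so the next instruction is skipped; any other value leaves PC unchanged.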

Input-Output and Interrupt:


 Instructions and data stored in memory must come from some input device.
 Computational results must be transmitted to the user through some output device.
 To demonstrate the most basic requirements for input and output communication, we
will use as an illustration a terminal unit with a keyboard and printer.
Input-Output Configuration:
 The terminal sends and receives serial information.
 Each quantity of information has eight bits of an alphanumeric code.
 The serial information from the keyboard is shifted into the input register INPR.
 The serial information for the printer is stored in the output register OUTR.
 These two registers communicate with a communication interface serially and with the
AC in parallel.
 The input-output configuration is shown in Fig. 5-12.
 The input register INPR consists of eight bits and holds alphanumeric input
information.
 The 1-bit input flag FGI is a control flip-flop.
 The flag bit is set to 1 when new information is available in the input device and is
cleared to 0 when the information is accepted by the computer.
 The output register OUTR works similarly but the direction of information flow is
reversed.
 Initially, the output flag FGO is set to 1.
 The computer checks the flag bit; if it is 1, the information from AC is transferred in
parallel to OUTR and FGO is cleared to 0.
 The output device accepts the coded information, prints the corresponding character,
and when the operation is completed, it sets FGO to 1.
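The flag handshake described above can be modelled in Python. This is a simplified sketch: the class Terminal and the functions cpu_input and cpu_output are hypothetical names, the serial shifting is not modelled, and the transfers use the low-order eight bits of AC.

```python
class Terminal:
    """Keyboard/printer with INPR, OUTR and the two flags FGI and FGO."""

    def __init__(self):
        self.inpr = 0
        self.outr = 0
        self.fgi = 0          # no input available yet
        self.fgo = 1          # printer is initially ready
        self.printed = []

    def key_pressed(self, char):
        self.inpr = ord(char) & 0xFF   # 8-bit code shifted into INPR
        self.fgi = 1                   # signal that new input is available

    def printer_done(self):
        self.printed.append(chr(self.outr))
        self.fgo = 1                   # ready for the next character


def cpu_input(terminal, ac):
    """If FGI = 1, copy INPR into the low byte of AC and clear FGI."""
    if not terminal.fgi:
        return ac, False
    terminal.fgi = 0
    return (ac & 0xFF00) | terminal.inpr, True


def cpu_output(terminal, ac):
    """If FGO = 1, copy the low byte of AC into OUTR and clear FGO."""
    if not terminal.fgo:
        return False
    terminal.outr = ac & 0xFF
    terminal.fgo = 0
    return True


t = Terminal()
t.key_pressed("A")
ac, accepted = cpu_input(t, 0)
cpu_output(t, ac)
t.printer_done()
print(accepted, t.printed)
```

While FGO is 0 the printer is still busy, so a second call to cpu_output is refused until printer_done sets the flag again.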
Input-Output Instructions:
 Input and output instructions are needed for transferring information to and from AC
register, for checking the flag bits, and for controlling the interrupt facility.
 Input-output instructions have 1111 in their four leading bits (I = 1 followed by the
operation code 111) and are recognized by the control when D7 = 1 and I = 1.
 The remaining bits of the instruction specify the particular operation.
 The control functions and micro operations for the input-output instructions are listed
in Table 5-5.
 These instructions are executed with the clock transition associated with timing signal
T3.
 Each control function needs a Boolean relation D7IT3, which we designate for
convenience by the symbol p.
 The control function is distinguished by one of the bits in IR (6-11).
 By assigning the symbol Bi to bit i of IR, all control functions can be denoted by pBi for i
= 6 through 11.
 The sequence counter SC is cleared to 0 when p = D7IT3 = 1.
 The last two instructions set and clear an interrupt enable flip-flop IEN.
Program Interrupt:
 In the program-controlled transfer described above, the computer keeps checking the
flag bit, and when it finds it set, it initiates an information transfer.
 The difference between the information flow rate of the computer and that of the
input-output device makes this type of transfer inefficient.
 An alternative to the program-controlled procedure is to let the external device
inform the computer when it is ready for the transfer.
 In the meantime the computer can be busy with other tasks. This type of transfer uses
the interrupt facility.
 While the computer is running a program, it does not check the flags.
 When a flag is set, the computer is momentarily interrupted from the current program.
 The computer deviates momentarily from what it is doing to take care of the input or
output transfer.
 It then returns to the current program to continue what it was doing before the
interrupt.
 The interrupt enable flip-flop IEN can be set and cleared with two instructions.
o When IEN is cleared to 0 (with the IOF instruction), the flags cannot interrupt the
computer.
o When IEN is set to 1 (with the ION instruction), the computer can be interrupted.
 The way that the interrupt is handled by the computer can be explained by means of
the flowchart of Fig. 5-13.
 An interrupt flip-flop R is included in the computer. When R = 0, the computer goes
through an instruction cycle.
 During the execute phase of the instruction cycle IEN is checked by the control.
 If it is 0, it indicates that the programmer does not want to use the interrupt, so control
continues with the next instruction cycle.
 If IEN is 1, control checks the flag bits. If both flags are 0, it indicates that neither the
input nor the output registers are ready for transfer of information. In this case, control
continues with the next instruction cycle.
 If either flag is set to 1 while IEN = 1, flip-flop R is set to 1. At the end of the execute
phase, control checks the value of R, and if it is equal to 1, it goes to an interrupt cycle
instead of an instruction cycle.
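The condition for setting R follows directly from the description above: R becomes 1 when IEN = 1 and at least one of the two flags is 1. A one-line Python sketch (the function name is hypothetical):

```python
def interrupt_requested(ien, fgi, fgo):
    """R is set when interrupts are enabled and either flag is 1: IEN (FGI + FGO)."""
    return int(bool(ien) and (bool(fgi) or bool(fgo)))


print([interrupt_requested(1, 0, 0), interrupt_requested(1, 1, 0), interrupt_requested(0, 1, 1)])
```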

Interrupt cycle:
 The interrupt cycle is a hardware implementation of a branch and save return address
operation.
 The return address available in PC is stored in a specific location.
 This location may be a processor register, a memory stack, or a specific memory
location.
 An example that shows what happens during the interrupt cycle is shown in Fig. 5-14.
 Suppose that an interrupt occurs and R is set to 1 while the control is executing the
instruction at address 255.
 At this time, the return address 256 is in PC.
 The programmer has previously placed an input-output service program in memory
starting from address 1120 and a BUN 1120 instruction at address 1. This is shown in
Fig. 5-14(a).
 When control reaches timing signal T0 and finds that R = 1, it proceeds with the
interrupt cycle.
 The content of PC (256) is stored in memory location 0, PC is set to 1, and R is cleared to
0.
 The branch instruction at address 1 causes the program to transfer to the input-output
service program at address 1120.
 This program checks the flags, determines which flag is set, and then transfers the
required input or output information.
 Once this is done, the instruction ION is executed to set IEN to 1 (to enable further
interrupts), and the program returns to the location where it was interrupted.
 This is shown in Fig. 5-14(b).
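The interrupt cycle of Fig. 5-14 can be traced with a small Python sketch. Memory is a dictionary, the function name interrupt_cycle is hypothetical, and clearing IEN during the cycle is an assumption inferred from the fact that the service program must execute ION to enable further interrupts.

```python
def interrupt_cycle(pc, memory):
    """Save the return address in location 0 and branch to location 1.

    Returns the new (PC, IEN, R).
    """
    memory[0] = pc           # return address stored in memory location 0
    return 1, 0, 0           # PC <- 1, IEN <- 0 (assumed), R <- 0


memory = {1: "BUN 1120"}     # placed there beforehand by the programmer
pc, ien, r = interrupt_cycle(256, memory)
print(memory[0], pc, ien, r)  # 256 1 0 0
```

The service program can later return to address 256 by branching indirectly through location 0, in the same way a subroutine returns after BSA.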
UNIT III
Memory Organization

12-1 MEMORY HIERARCHY

Memory hierarchy in a computer system :

Memory hierarchy system consists of all storage devices employed in a computer system
from the slow but high capacity auxiliary memory to a relatively faster main memory,
to an even smaller and faster cache memory accessible to the high speed processing
logic.

 Main Memory: memory unit that communicates directly with the CPU (RAM)
 Auxiliary Memory: devices that provide backup storage (Disk Drives)
 Cache Memory: special very-high-speed memory to increase the processing speed
(Cache RAM)

Figure 12-1 illustrates the components in a typical memory hierarchy. At the bottom of
the hierarchy are the relatively slow magnetic tapes used to store removable files. Next
are the magnetic disks used as backup storage. The main memory occupies a central
position by being able to communicate directly with the CPU and with auxiliary memory
devices through an I/O processor. Programs not currently needed in main memory are
transferred into auxiliary memory to provide space for currently used programs and
data.
The cache memory is used for storing segments of programs currently being executed in
the CPU. The I/O processor manages data transfer between auxiliary memory and
main memory. The auxiliary memory has a large storage capacity and is relatively
inexpensive, but has low access speed compared to main memory. The cache memory
is very small, relatively expensive, and has very high access speed. The CPU has direct
access to both cache and main memory but not to auxiliary memory.
Multiprogramming:
Many operating systems are designed to enable the CPU to process a number of
independent programs concurrently.
Multiprogramming refers to the existence of 2 or more programs in different parts of the
memory hierarchy at the same time.
Memory management System:

The part of the computer system that supervises the flow of information between
auxiliary memory and main memory.
MAIN MEMORY
Main memory is the central storage unit in a computer system. It is a relatively large and
fast memory used to store programs and data during the computer operation. The
principal technology used for the main memory is based on semiconductor integrated
circuits. Integrated circuit RAM chips are available in two possible operating modes,
static and dynamic.
 Static RAM – Consists of internal flip flops that store the binary information.
 Dynamic RAM – Stores the binary information in the form of electric charges that are
applied to capacitors.
Most of the main memory in a general purpose computer is made up of RAM
integrated circuit chips, but a portion of the memory may be constructed with ROM
chips.
 Read Only Memory: stores programs that are permanently resident in the computer
and tables of constants that do not change in value once the production of the
computer is completed.

The ROM portion of main memory is needed for storing an initial program called a
Bootstrap loader.
 Bootstrap loader: its function is to start the computer software operating when power is
turned on.
 The bootstrap program loads a portion of the operating system from disk to main memory,
and control is then transferred to the operating system.
RAM and ROM CHIP
 RAM chip: utilizes a bidirectional data bus with three-state buffers to communicate
with the CPU.
The block diagram of a RAM chip is shown in Fig. 12-2. The capacity of the memory is 128
words of eight bits (one byte) per word. This requires a 7-bit address and an 8-bit
bidirectional data bus. The read and write inputs specify the memory operation, and
the two chip select (CS) control inputs enable the chip only when it is selected
by the microprocessor. The read and write inputs are sometimes combined into one
line labelled R/W.
The function table listed in Fig. 12-2(b) specifies the operation of the RAM chip. The unit
is in operation only when CS1 = 1 and CS2 = 0. The bar on top of the second select
variable indicates that this input is enabled when it is equal to 0. If the chip select
inputs are not enabled, or if they are enabled but the read or write inputs are not
enabled, the memory is inhibited and its data bus is in a high-impedance state. When
CS1=1 and CS2=0, the memory can be placed in a write or read mode. When the WR
input is enabled, the memory stores a byte from the data bus into a location specified
by the address input lines. When the RD input is enabled, the content of the selected
byte is placed into the data bus. The RD and WR signals control the memory operation
as well as the bus buffers associated with the bidirectional data bus.
A ROM chip is organized externally in a similar manner. However, since a ROM can
only be read, the data bus can only be in an output mode. The block diagram of a ROM
chip is shown in Fig. 12-3. The nine address lines in the ROM chip specify any one of
the 512 bytes stored in it. The two chip select inputs must be CS1 = 1 and CS2 = 0 for the
unit to operate. Otherwise, the data bus is in a high-impedance state.
Memory Address Map
The interconnection between memory and processor is established from knowledge
of the size of memory needed and the type of RAM and ROM chips available. The
addressing of memory can be established by means of a table that specifies the memory
address assigned to each chip. The table, called a memory address map, is a pictorial
representation of the assigned address space for each chip in the system.

For a system that needs 512 bytes of RAM and 512 bytes of ROM, the memory address map
is shown in Table 12-1. The component
column specifies whether a RAM or a ROM chip is used. The hexadecimal address
column assigns a range of hexadecimal equivalent addresses for each chip. The address
bus lines are listed in the third column. The RAM chips have 128 bytes and need seven
address lines. The ROM chip has 512 bytes and needs 9 address lines.
Memory Connection to CPU:
RAM and ROM chips are connected to a CPU through the data and address buses. The
low order lines in the address bus select the byte within the chips and other lines in
the address bus select a particular chip through its chip select inputs.
The connection of memory chips to the CPU is shown in Fig.12-4. This configuration
gives a memory capacity of 512 bytes of RAM and 512 bytes of ROM. Each RAM
receives the seven low-order bits of the address bus to select one of 128 possible bytes.
The particular RAM chip selected is determined from lines 8 and 9 in the address bus.
This is done through a 2 X 4 decoder whose outputs go to the CS1 inputs in each RAM
chip. Thus, when address lines 8 and 9 are equal to 00, the first RAM chip is selected.
When 01, the second RAM chip is selected, and so on. The RD and WR outputs from the
microprocessor are applied to the inputs of each RAM chip. The selection between
RAM and ROM is achieved through bus line 10. The RAMs are selected when the bit in
this line is 0, and the ROM when the bit is 1. Address bus lines 1 to 9 are applied to the
input address of ROM without going through the decoder. The data bus of the ROM
has only an output capability, whereas the data bus connected to the RAMs can
transfer information in both directions.
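The chip selection described above can be sketched as an address decoder in Python. The sketch assumes a 10-bit address whose lines are numbered 1 (least significant) to 10, four 128-byte RAM chips and one 512-byte ROM chip; the function name select_chip and the labels RAM1 to RAM4 are hypothetical.

```python
def select_chip(address):
    """Return (chip, offset) for a 10-bit address."""
    address &= 0x3FF
    if address & 0x200:                  # line 10 = 1 selects the ROM
        return "ROM", address & 0x1FF    # lines 1-9 address one of 512 bytes
    chip = (address >> 7) & 0x3          # lines 8 and 9 feed the 2 x 4 decoder
    return "RAM" + str(chip + 1), address & 0x7F   # lines 1-7 address one of 128 bytes


for a in (0x000, 0x07F, 0x080, 0x1FF, 0x200, 0x3FF):
    print(hex(a), select_chip(a))
```

Under these assumptions the four RAM chips occupy hexadecimal addresses 0000-007F, 0080-00FF, 0100-017F and 0180-01FF, and the ROM occupies 0200-03FF.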
ASSOCIATIVE MEMORY
The time required to find an item stored in memory can be reduced considerably if stored
data can be identified for access by the content of the data itself rather than by an
address. A memory unit accessed by content is called an associative memory or content
addressable memory (CAM).
 CAM is accessed simultaneously and in parallel on the basis of data content rather
than by specific address or location
 Associative memory is more expensive than a RAM because each cell must have
storage capability as well as logic circuits
 Argument register –holds an external argument for content matching
 Key register –mask for choosing a particular field or key in the argument word
Hardware Organization
It consists of a memory array and logic for m words with n bits per word. The argument

register A and key register K each have n bits, one for each bit of a word. The match
register M has m bits, one for each memory word. Each word in memory is compared
in parallel with the content of the argument register. The words that match the bits of
the argument register set a corresponding bit in the match register. After the matching
process, those bits in the match register that have been set indicate the fact that their
corresponding words have been matched. Reading is accomplished by a sequential
access to memory for those words whose corresponding bits in the match register have
been set.
The relation between the memory array and external registers in an associative memory
is shown in Fig.12-7. The cells in the array are marked by the letter C with two
subscripts. The first subscript gives the word number and second specifies the bit
position in the word. Thus cell Cij is the cell for bit j in word i. A bit Aj in the argument
register is compared with all the bits in column j of the array provided that Kj = 1. This is
done for all columns j = 1, 2, ..., n. If a match occurs between all the unmasked bits of the
argument and the bits in word i, the corresponding bit Mi in the match register is set to 1.
If one or more unmasked bits of the argument and the word do not match, Mi is cleared
to 0.

Each cell Cij consists of a flip-flop storage element Fij and the circuits for reading, writing, and
matching the cell. The input bit is transferred into the storage cell during a write
operation. The bit stored is read out during a read operation. The match logic
compares the content of the storage cell with corresponding unmasked bit of the
argument and provides an output for the decision logic that sets the bit in Mi.
Match Logic
The match logic for each word can be derived from the comparison algorithm for two
binary numbers. First, neglect the key bits and compare the argument in A with the
bits stored in the cells of the words.
Word i is equal to the argument in A if Aj = Fij for j = 1, 2, ..., n. Two bits are equal if they are
both 1 or both 0. The equality of two bits can be expressed logically by the Boolean
function
xj = Aj Fij + Aj' Fij'

where xj = 1 if the pair of bits in position j are equal; otherwise, xj = 0. For word i to be
equal to the argument in A we must have all xj variables equal to 1. This is the
condition for setting the corresponding match bit Mi to 1. The Boolean function for this
condition is
Mi = x1 x2 x3 ... xn

Each cell requires two AND gates and one OR gate. The inverters for A and K are needed
once for each column and are used for all bits in the column. The outputs of all OR gates
in the cells of the same word go to the inputs of a common AND gate to generate the
match signal for Mi. Mi will be logic 1 if a match occurs and 0 if no match occurs.
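The match logic, including the key register, can be sketched with bitwise operations in Python. Bit j of a word takes part in the comparison only where Kj = 1, so word i matches when (word XOR A) AND K is zero. The function name match_register and the 9-bit sample values are chosen for illustration.

```python
def match_register(words, argument, key):
    """Return the list of match bits Mi, one per stored word."""
    # XOR marks the differing bit positions; AND with the key drops the masked ones.
    return [int(((word ^ argument) & key) == 0) for word in words]


A = 0b101111100            # argument register
K = 0b111000000            # key: only the three leftmost bits are compared
words = [0b100111100,      # leftmost bits 100 differ from 101: no match
         0b101000001]      # leftmost bits 101 agree: match
print(match_register(words, A, K))  # [0, 1]
```

In hardware all words are compared at the same time; the list comprehension only imitates the result, not the parallelism.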
Read and Write Operation
Read Operation
 If more than one word in memory matches the unmasked argument field, all the
matched words will have 1's in the corresponding bit positions of the match register.
 In a read operation, all matched words are read in sequence by applying a read
signal to each word line whose corresponding Mi bit is a logic 1.
 In applications where no two identical items are stored in the memory, only one word
may match, in which case we can use the Mi output directly as a read signal for the
corresponding word.
Write Operation
Can take two different forms
1. Entire memory may be loaded with new information
2. Unwanted words to be deleted and new words to be inserted

1. Entire memory: writing can be done by addressing each location in sequence. This
makes it a random access memory for writing and a content addressable memory for
reading. The number of address lines needed for decoding is d, where m = 2^d and m is
the number of words.
2. Unwanted words to be deleted and new words to be inserted:
A tag register is used which has as many bits as there are words in memory.
For every active (valid) word in memory, the corresponding bit in the tag register is set to
1.
When a word is deleted, the corresponding tag bit is reset to 0.
A new word is stored in the memory by scanning the tag register until the first 0 bit
is encountered. After storing the word, the bit is set to 1.
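The tag register scheme can be sketched in Python with one list for the words and one for the tag bits. The function names insert_word and delete_word are hypothetical.

```python
def insert_word(memory, tag, word):
    """Store word in the first location whose tag bit is 0; return its index or None if full."""
    for i, bit in enumerate(tag):
        if bit == 0:
            memory[i] = word
            tag[i] = 1           # mark the location as active
            return i
    return None


def delete_word(tag, index):
    """Deleting a word only resets its tag bit; the old content is simply ignored."""
    tag[index] = 0


memory = [0, 0, 0, 0]
tag = [1, 0, 1, 0]
print(insert_word(memory, tag, 0x2A), tag)  # 1 [1, 1, 1, 0]
```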

CACHE MEMORY
 Effectiveness of cache mechanism is based on a property of computer programs called
“locality of reference”
 The references to memory in any given time interval tend to be confined within a few
localized areas of memory.
 Analysis of programs shows that most of their execution time is spent on routines in
which instructions are executed repeatedly. These instructions may be loops, nested
loops, or a few procedures that call each other.
 Many instructions in localized areas of the program are executed repeatedly during some
time period, and the remainder of the program is accessed infrequently. This property is
called "Locality of Reference".
Locality of Reference

Locality of reference is manifested in two ways :

1. Temporal: a recently executed instruction is likely to be executed again very soon.
 Information that will be used in the near future is likely to be in use already (e.g.,
reuse of information in loops).
2. Spatial: instructions in close proximity to a recently executed instruction are
also likely to be executed soon.
 If a word is accessed, adjacent (nearby) words are likely to be accessed soon (e.g.,
related data items such as arrays are usually stored together; instructions are executed
sequentially).
If active segments of a program can be placed in a fast (cache) memory, then total
execution time can be reduced significantly.
 The temporal aspect of locality of reference suggests that whenever an item of
information (instruction or data) is first needed, it should be brought into the cache.
 The spatial aspect of locality of reference suggests that instead of bringing just one item
from the main memory to the cache, it is wise to bring several items that reside at
adjacent addresses as well (i.e., a block of information).
Principles of cache

The main memory can store 32K words of 12 bits each. The cache is capable of storing 512
of these words at any given time. For every word stored in the cache, there is a duplicate
copy in main memory. The CPU communicates with both memories. It first sends a 15-bit
address to the cache. If there is a hit, the CPU accepts the 12-bit data from the cache. If there
is a miss, the CPU reads the word from main memory and the word is then transferred
to the cache.
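The hit and miss behaviour just described can be imitated with a word-level cache model in Python. This is a sketch under simplifying assumptions: no particular mapping function is modelled, and when the cache is full the oldest entry is removed (first-in first-out), which is only one possible replacement rule. The class name Cache is hypothetical.

```python
from collections import OrderedDict


class Cache:
    """A 512-word cache in front of a 32K-word main memory (word-level model)."""

    def __init__(self, main_memory, capacity=512):
        self.main = main_memory
        self.capacity = capacity
        self.lines = OrderedDict()   # address -> word, in order of arrival
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:            # hit: data comes from the cache
            self.hits += 1
            return self.lines[address]
        self.misses += 1                     # miss: read main memory instead
        word = self.main[address]
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # assumed FIFO replacement
        self.lines[address] = word           # the word is then placed in the cache
        return word


main = {a: a & 0xFFF for a in range(32 * 1024)}   # 32K words of 12 bits
cache = Cache(main)
for _ in range(5):                                # a loop that touches 10 words 5 times
    for address in range(100, 110):
        cache.read(address)
print(cache.hits, cache.misses)  # 40 10
```

Only the first pass through the loop misses; the remaining four passes are served from the cache, which is the effect of temporal locality.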
 When a read request is received from the CPU, the contents of a block of memory words
containing the location specified are transferred into the cache.
 When the program references any of the locations in this block, the contents are read
from the cache.
 The number of blocks in the cache is smaller than the number of blocks in main
memory.
 The correspondence between main memory blocks and those in the cache is specified by a
mapping function.
 Assume the cache is full and a memory word not in the cache is referenced.
 Control hardware decides which block is to be removed from the cache to create space for
the new block containing the referenced word from memory.
 The collection of rules for making this decision is called the
"Replacement algorithm".
Read/Write operations on cache
 Cache Hit Operation
 The CPU issues Read/Write requests using addresses that refer to locations in main
memory.
 Cache control circuitry determines whether the requested word currently exists in the cache.
 If it does, the Read/Write operation is performed on the appropriate location in the cache
(Read/Write Hit).
Read/Write operations on cache in case of Hit
 In Read operation main memory is not involved.
 In Write operation two things can happen.
1. Cache and main memory locations are updated simultaneously ("Write Through")
OR
2. Only the cache location is updated and marked with a dirty (modified) bit, and the main
memory location is updated at the time of cache block removal ("Write Back" or "Copy Back").
Read/Write operations on cache in case of Miss
Read Operation

 When the addressed word is not in the cache, a Read Miss occurs. There are two ways this can
be dealt with:
1. The entire block of words that contains the requested word is copied from main memory to
cache, and the requested word is then forwarded to the CPU from the cache (OR)
2. The requested word is sent to the CPU as soon as it is read from main memory, while the
block is loaded into the cache (Load Through, also called Early Restart)
Write Operation

 If the addressed word is not in the cache, a Write Miss occurs.
 If the write through protocol is used, the information is written directly into main memory.
 In the write back protocol, the block containing the word is first brought into the cache, and the
desired word is then overwritten.
Mapping Functions

 Correspondence between main memory blocks and those in the cache is specified by a
memory mapping function
 There are three techniques in memory mapping
1. Direct Mapping
2. Associative Mapping
3. Set Associative Mapping

Direct mapping:

A particular block of main memory can be brought only to a particular block of cache
memory. So, it is not flexible.
In fig 12-12, the CPU address of 15 bits is divided into two fields. The nine least
significant bits constitute the index field and the remaining six bits form the tag field. The
main memory needs an address that includes both the tag and the index bits. The
number of bits in the index field is equal to the number of address bits required to
access the cache memory.

The direct mapping cache organization uses the n-bit address to access the main memory
and the k-bit index to access the cache. Each word in cache consists of the data word
and its associated tag. When a new word is first brought into the cache, the tag bits are
stored alongside the data bits. When the CPU generates a memory request, the index
field is used for the address to access the cache. The tag field of
the CPU address is compared with the tag in the word read from the cache. If the two
tags match, there is a hit and the desired data word is in cache. If there is no match,
there is a miss and the required word is read from main memory.
In fig 12-14, the index field is now divided into two parts: the block field and the word field.
In a 512-word cache there are 64 blocks of 8 words each, since 64 x 8 = 512. The block
number is specified with a 6-bit field and the word within the block is specified with a
3-bit field. The tag field stored within the cache is common to all eight words of the
same block.
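The field split described above can be sketched in a few lines of Python. This is only an illustration under the sizes given in the text (15-bit address, 6-bit tag, 9-bit index, 8-word blocks); the names split_address and DirectMappedCache are hypothetical and the memory contents are made-up values.

```python
# Field widths taken from the text: 15-bit address = 6-bit tag + 9-bit index.
INDEX_BITS = 9
WORD_BITS = 3   # the index is further split into a 6-bit block and 3-bit word field

def split_address(addr):
    """Return (tag, index, block, word) for a 15-bit CPU address."""
    index = addr & ((1 << INDEX_BITS) - 1)   # nine least significant bits
    tag = addr >> INDEX_BITS                 # remaining six bits
    block = index >> WORD_BITS               # upper six bits of the index
    word = index & ((1 << WORD_BITS) - 1)    # lower three bits of the index
    return tag, index, block, word

class DirectMappedCache:
    """Hypothetical word-at-a-time direct-mapped cache (one word per index)."""
    def __init__(self):
        self.lines = {}                      # index -> (tag, data)

    def read(self, addr, main_memory):
        """Return (data, hit) for the word at addr."""
        tag, index, _, _ = split_address(addr)
        entry = self.lines.get(index)
        if entry is not None and entry[0] == tag:
            return entry[1], True            # tags match: hit
        data = main_memory[addr]             # miss: read from main memory
        self.lines[index] = (tag, data)      # store the tag alongside the data
        return data, False

# Two addresses with the same index (0) but different tags evict each other.
memory = {0o00000: 0o1234, 0o02000: 0o5670}  # illustrative contents only
cache = DirectMappedCache()
results = [cache.read(a, memory)[1] for a in (0o00000, 0o00000, 0o02000, 0o00000)]
```

The last access misses again because address 02000 replaced the word with tag 00 at index 000, which is the inflexibility of direct mapping noted above.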
Associative mapping:
In this mapping function, any block of main memory can potentially reside in any cache
block position. This is a much more flexible mapping method.

In fig 12-11, the associative memory stores both the address and the content (data) of the
memory word. This permits any location in cache to store any word from main
memory. The diagram shows three words presently stored in the cache. The address
value of 15 bits is shown as a five-digit octal number and its corresponding 12-bit word
is shown as a four-digit octal number. A CPU address of 15 bits is placed in the
argument register and the associative memory is searched for a matching address. If the
address is found, the corresponding 12-bit data is read and sent to the CPU. If no
match occurs, the main memory is accessed for the word.
Set-associative mapping:
In this method, blocks of cache are grouped into sets, and the mapping allows a block of
main memory to reside in any block of a specific set. From the flexibility point of view,
it lies between the other two methods.

The octal numbers listed in Fig. 12-15 are with reference to the main memory contents.
When the CPU generates a memory request, the index value of the address is used to
access the cache. The tag field of the CPU address is then compared with both tags in
the cache to determine if a match occurs. The comparison logic is done by an associative
search of the tags in the set, similar to an associative memory search; thus the name "Set
Associative".
Replacement Policies
 When the cache is full and new data must be brought into the cache, a decision
must be made as to which data is to be removed from the cache.
 The guideline for deciding which data is to be removed is called the
replacement policy.
 The replacement policy depends on the mapping.
 There is no specific policy in the case of direct mapping, as there is no choice of block
placement in the cache.
In case of associative mapping
 A simple procedure is to replace cells of the cache in round robin order whenever a
new word is requested from memory.
 This constitutes a First-in First-out (FIFO) replacement policy.
In case of set associative mapping
 Random replacement
 First-in First-out (FIFO) (the item chosen is the item that has been in the set longest)
 Least Recently Used (LRU) (the item chosen is the item that has been least recently used
by CPU)
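The difference between FIFO and least-recently-used replacement within one set can be seen in a small sketch. The class name CacheSet and the tag sequence are hypothetical, chosen only for illustration; real hardware tracks age with counters rather than an ordered dictionary.

```python
from collections import OrderedDict

class CacheSet:
    """One cache set holding up to `ways` tags; policy is "FIFO" or "LRU"."""
    def __init__(self, ways, policy):
        self.ways = ways
        self.policy = policy
        self.tags = OrderedDict()            # front = next candidate for removal

    def access(self, tag):
        """Return True on a hit, False on a miss (the tag is then loaded)."""
        if tag in self.tags:
            if self.policy == "LRU":
                self.tags.move_to_end(tag)   # a hit makes the tag most recently used
            return True
        if len(self.tags) == self.ways:
            self.tags.popitem(last=False)    # remove the entry at the front
        self.tags[tag] = None
        return False

sequence = [1, 2, 1, 3, 1]                   # illustrative tag references
fifo = CacheSet(2, "FIFO")
lru = CacheSet(2, "LRU")
fifo_hits = [fifo.access(t) for t in sequence]
lru_hits = [lru.access(t) for t in sequence]
```

With a two-way set, FIFO removes tag 1 when tag 3 arrives because tag 1 has been in the set longest, while the LRU variant removes tag 2 because tag 1 was used more recently.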
UNIT-IV

INPUT-OUTPUT ORGANIZATION
Peripheral Devices:
The input-output organization of a computer depends upon the size of the computer and
the peripherals connected to it. The I/O subsystem of the computer provides an
efficient mode of communication between the central system and the outside
environment.
The most common input output devices are:
i) Monitor
ii) Keyboard
iii) Mouse
iv) Printer
v) Magnetic tapes

The devices that are under the direct control of the computer are said to be connected
online.
Input - Output Interface
Input Output Interface provides a method for transferring information between
internal storage and external I/O devices.
Peripherals connected to a computer need special communication links for interfacing
them with the central processing unit.
The purpose of communication link is to resolve the differences that exist between the
central computer and each peripheral.
The Major Differences are:-
1. Peripherals are electromechanical and electromagnetic devices, whereas the CPU and memory
are electronic devices. Therefore, a conversion of signal values may be needed.
2. The data transfer rate of peripherals is usually slower than the transfer rate of CPU
and consequently, a synchronization mechanism may be needed.
3. Data codes and formats in the peripherals differ from the word format in the CPU
and memory.
4. The operating modes of peripherals are different from each other and must be
controlled so as not to disturb the operation of other peripherals connected to the
CPU.
To resolve these differences, computer systems include special hardware components
between the CPU and peripherals to supervise and synchronize all input and output
transfers.
 These components are called Interface Units because they interface between the
processor bus and the peripheral devices.
I/O BUS and Interface Module
It defines the typical link between the processor and several peripherals.
The I/O bus consists of data lines, address lines and control lines. The I/O bus from
the processor is attached to all peripheral interfaces.
To communicate with a particular device, the processor places a device address on
address lines.
Each Interface decodes the address and control received from the I/O bus, interprets
them for peripherals and provides signals for the peripheral controller.
It also synchronizes the data flow and supervises the transfer between peripheral
and processor.
Each peripheral has its own controller.
For example, the printer controller controls the paper motion and the print timing.
The signals on the control lines are referred to as I/O commands. The commands are as follows:
Control command- A control command is issued to activate the peripheral and to
inform it what to do.
Status command- A status command is used to test various status conditions in the
interface and the peripheral.
Data Output command- A data output command causes the interface to respond by
transferring data from the bus into one of its registers.
Data Input command- The data input command is the opposite of the data output.
In this case the interface receives an item of data from the peripheral and places it in its
buffer register.
I/O Versus Memory Bus

In addition to communicating with I/O, the processor must communicate with the memory unit.
Like the I/O bus, the memory bus contains data, address and read/write control
lines. There are 3 ways that computer buses can be used to communicate with
memory and I/O:
i. Use two separate buses, one for memory and the other for I/O.
ii. Use one common bus for both memory and I/O but separate control lines for each.
iii. Use one common bus for memory and I/O with common control lines.
I/O Processor

In the first method, the computer has independent sets of data, address and control
buses, one for accessing memory and the other for I/O. This is done in computers that
provide a separate I/O processor (IOP). The purpose of the IOP is to provide an
independent pathway for the transfer of information between external devices and
internal memory.
Asynchronous Data Transfer :
This scheme is used when the speed of the I/O devices does not match that of the microprocessor
and the timing characteristics of the I/O devices are not predictable. In this method, the processor
initiates the device and checks its status. As a result, the CPU has to wait until the I/O device
is ready to transfer data. When the device is ready, the CPU issues the instruction for the I/O
transfer. Two techniques are used, depending on the control signals exchanged before the
data transfer:
i. Strobe Control
ii. Handshaking
Strobe Signal :
The strobe control method of Asynchronous data transfer employs a single control line
to time each transfer. The strobe may be activated by either the source or the
destination unit.
Data Transfer Initiated by Source Unit:

In the block diagram fig. (a), the data bus carries the binary information from source to
destination unit. Typically, the bus has multiple lines to transfer an entire byte or
word. The strobe is a single line that informs the destination unit when a valid data
word is available.
In the timing diagram, fig. (b), the source unit first places the data on the data bus. The
information on the data bus and the strobe signal remain in the active state long enough to allow the
destination unit to receive the data.
Data Transfer Initiated by Destination Unit:
In this method, the destination unit activates the strobe pulse, informing the source
to provide the data. The source responds by placing the requested binary
information on the data bus.
The data must be valid and remain on the bus long enough for the destination unit to
accept it. Once it has accepted the data, the destination unit disables the strobe and the source
unit removes the data from the bus.

Disadvantage of Strobe Signal :


The disadvantage of the strobe method is that a source unit that initiates the transfer has
no way of knowing whether the destination unit has actually received the data item
that was placed on the bus. Similarly, a destination unit that initiates the transfer has
no way of knowing whether the source unit has actually placed the data on the bus. The
Handshaking method solves this problem.
Handshaking:
The handshaking method solves the problem of strobe method by introducing a
second control signal that provides a reply to the unit that initiates the transfer.
Principle of Handshaking:
The basic principle of the two-wire handshaking method of data transfer is as follows:
One control line is in the same direction as the data flow in the bus, from the source to the
destination. It is used by the source unit to inform the destination unit whether there is
valid data on the bus. The other control line is in the other direction, from the
destination to the source. It is used by the destination unit to inform the source
whether it can accept the data. The sequence of control during the transfer depends
on the unit that initiates the transfer.
Source Initiated Transfer using Handshaking:
The sequence of events shows the four possible states that the system can be in at any given
time. The source unit initiates the transfer by placing the data on the bus and
enabling its data valid signal. The data accepted signal is activated by the destination
unit after it accepts the data from the bus. The source unit then disables its data
valid signal, which invalidates the data on the bus. The destination unit then disables its data
accepted signal and the system goes into its initial state.
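The source-initiated sequence can be traced step by step with a small sketch. The function name and the variable names data_valid and data_accepted are illustrative labels for the two control lines; the code simply records the state of both lines after each event.

```python
def source_initiated_handshake(word):
    """Return (received_word, trace) where trace lists (data_valid, data_accepted)."""
    bus = None
    data_valid = 0
    data_accepted = 0
    trace = [(data_valid, data_accepted)]       # initial state

    bus, data_valid = word, 1                   # source places data, enables data valid
    trace.append((data_valid, data_accepted))

    received, data_accepted = bus, 1            # destination accepts data, enables data accepted
    trace.append((data_valid, data_accepted))

    bus, data_valid = None, 0                   # source disables data valid, bus no longer valid
    trace.append((data_valid, data_accepted))

    data_accepted = 0                           # destination disables data accepted
    trace.append((data_valid, data_accepted))   # system is back in its initial state
    return received, trace

received, trace = source_initiated_handshake(0x5A)
```

The trace passes through the four distinct states (0,0), (1,0), (1,1) and (0,1) before returning to (0,0), and each step waits on the other unit's signal.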

Destination Initiated Transfer Using Handshaking:
The name of the signal generated by the destination unit has been changed to ready for
data to reflect its new meaning. The source unit in this case does not place data on
the bus until after it receives the ready for data signal from the destination unit. From
there on, the handshaking procedure follows the same pattern as in the source
initiated case.
The only difference between the source initiated and the destination initiated transfer
is in their choice of initial state.
Advantage of the Handshaking method:

 The handshaking scheme provides a high degree of flexibility and reliability because the
successful completion of a data transfer relies on active participation by both units.
 If either unit is faulty, the data transfer will not be completed. Such an error can
be detected by means of a timeout mechanism, which raises an alarm if the data transfer is
not completed within a set time.
Asynchronous Serial Transmission:
The transfer of data between two units is serial or parallel. In parallel data
transmission, the n bits of a message are transmitted through n separate conductor
paths. In serial transmission, each bit in the message is sent in sequence, one at a time.
Parallel transmission is faster but requires many wires. It is used for short distances
and where speed is important. Serial transmission is slower but is less expensive.
In asynchronous serial transfer, the bits of a message are sent one at a time in sequence, and
binary information is transferred only when it is available. When there is no
information to be transferred, the line remains idle.
In this technique each character consists of three parts:
i. Start bit
ii. Character bits
iii. Stop bit

i. Start Bit- The first bit, called the start bit, is always zero and is used to indicate the beginning
of a character.
ii. Stop Bit- The last bit, called the stop bit, is always one and is used to indicate the end of
a character. The stop bit returns the line to the 1-state and frames the end of the character to
signify the idle or wait state.
iii. Character Bits- The bits between the start bit and the stop bit are known as character
bits. The character bits always follow the start bit.
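The framing of one character can be sketched as follows. This is a minimal illustration: eight character bits, one stop bit, and sending the least significant bit first are assumptions made for the example, and the function names are hypothetical.

```python
def frame_character(char, data_bits=8, stop_bits=1):
    """Return the line bits for one character in transmission order."""
    code = ord(char)
    # Assumption: character bits are sent least significant bit first.
    data = [(code >> i) & 1 for i in range(data_bits)]
    return [0] + data + [1] * stop_bits      # start bit, character bits, stop bit(s)

def unframe_character(bits, data_bits=8):
    """Check the start and stop bits and rebuild the character."""
    if bits[0] != 0:
        raise ValueError("missing start bit")
    if bits[1 + data_bits] != 1:
        raise ValueError("missing stop bit")
    code = sum(bit << i for i, bit in enumerate(bits[1:1 + data_bits]))
    return chr(code)

frame = frame_character("A")                 # 'A' has character code 65
```

The receiver detects the 1-to-0 transition of the start bit, samples the character bits, and expects the line to return to 1 for the stop bit.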

Asynchronous serial transmission is supported by two components:

a) Asynchronous Communication Interface
b) First In First Out Buffer
Asynchronous Communication Interface:
It works as both a receiver and a transmitter. Its operation is initialized by the CPU by
sending a byte to the control register.
The transmitter register accepts a data byte from the CPU through the data bus, and the byte is then
transferred to a shift register for serial transmission.
The receiver portion receives information into another shift register, and when a
complete data byte has been received it is transferred to the receiver register.
The CPU can select the receiver register to read the byte through the data bus. The bits in the
status register are used as input and output flags.

First In First Out Buffer (FIFO):
A First In First Out (FIFO) buffer is a memory unit that stores information in such a
manner that the first item in is the first item out. A FIFO buffer comes with separate
input and output terminals. The important feature of this buffer is that it can input
data and output data at two different rates.
When placed between two units, the FIFO can accept data from the source unit at one
rate of transfer and deliver the data to the destination unit at another rate.
If the source is faster than the destination, the FIFO is useful for cases where the source data arrive in
bursts that fill the buffer. The FIFO is useful in some applications when data are
transferred asynchronously.
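A FIFO buffer of this kind can be modelled in software with a double-ended queue. The class name FifoBuffer and the method names are hypothetical; input_ready and output_ready stand in for the status signals a hardware FIFO would provide.

```python
from collections import deque

class FifoBuffer:
    """Hypothetical model of a fixed-capacity FIFO buffer."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def input_ready(self):
        return len(self.items) < self.capacity   # room for another item

    def output_ready(self):
        return len(self.items) > 0               # at least one item stored

    def insert(self, item):
        if not self.input_ready():
            raise OverflowError("FIFO is full")
        self.items.append(item)

    def delete(self):
        if not self.output_ready():
            raise IndexError("FIFO is empty")
        return self.items.popleft()              # first item in is first item out

# The source delivers a burst of four items; the destination drains them later.
fifo = FifoBuffer(capacity=4)
for item in "ABCD":
    fifo.insert(item)
full_after_burst = not fifo.input_ready()
drained = [fifo.delete() for _ in range(4)]
```

The source and the destination never have to run at the same rate; they only check input_ready and output_ready before each transfer.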
Modes of Data Transfer :
Transfer of data is required between the CPU and peripherals or memory, or sometimes
between any two devices or units of the computer system. To transfer data from
one unit to another, one should be sure that both units have a proper connection and
that at the time of data transfer the receiving unit is not busy. This data transfer within the
computer is an internal operation.
All the internal operations in a digital system are synchronized by means of clock
pulses supplied by a common clock pulse generator. The data transfer can be
i. Synchronous or
ii. Asynchronous
When both the transmitting and receiving units use the same clock pulses, the data
transfer is called synchronous. On the other hand, if there is no common
clock and the sender operates at a different moment than the receiver, the
data transfer is called asynchronous.
The data transfer can be handled in various modes. Some of the modes use the CPU as an
intermediate path; others transfer the data directly to and from the memory unit.
Data transfer can be handled in the following 3 ways:
i. Programmed I/O
ii. Interrupt-Initiated I/O (or) Priority Interrupt.
iii. Direct Memory Access (DMA)
Programmed I/O Mode:
In this mode of data transfer the operations are the result of I/O instructions that are
part of the computer program. Each data transfer is initiated by an instruction in the
program. Normally the transfer is from a CPU register to a peripheral device or vice-
versa.
Once the data transfer is initiated, the CPU starts monitoring the interface to see when the next
transfer can be made. The instructions of the program keep close tabs on everything
that takes place in the interface unit and the I/O devices.

⚫ The transfer of data requires three instructions:

In this technique the CPU is responsible for fetching data from memory for output
and storing data in memory for input. The execution of Programmed I/O is shown in the
flowchart:

Drawback of the Programmed I/O :
The main drawback of Programmed I/O is that the CPU has to monitor the
units all the time while the program is executing. Thus the CPU stays in a program
loop until the I/O unit indicates that it is ready for data transfer. This is a time
consuming process, and a lot of CPU time is wasted in checking the status of the
I/O unit.
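The cost of this program loop can be seen in a small sketch. The Interface class is a hypothetical stand-in for an interface unit whose flag bit is set only after a given number of status reads; it is not a model of any particular device.

```python
class Interface:
    """Hypothetical interface: the flag bit is set after `delay` status reads."""
    def __init__(self, data, delay):
        self.data = data
        self.delay = delay

    def read_status(self):
        self.delay -= 1
        return self.delay <= 0        # True once the flag bit is set

    def read_data(self):
        return self.data

def programmed_input(interface):
    """Poll the flag in a loop, then read the data register."""
    polls = 0
    while True:
        polls += 1
        if interface.read_status():   # read status register and test the flag
            break                     # flag set: leave the wait loop
    return interface.read_data(), polls

byte, polls = programmed_input(Interface(data=0x41, delay=1000))
```

Here the CPU performs 1000 status checks to obtain a single byte, and none of that time is available for other work.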
To remove this problem an Interrupt facility and special commands are used.
Interrupt-Initiated I/O (or) Priority Interrupt :
In this method an interrupt facility and an interrupt command are used to inform the device
about the start and end of the transfer. In the meantime the CPU executes other
programs. When the interface determines that the device is ready for data transfer, it
generates an interrupt request and sends it to the computer.
When the CPU receives such a signal, it temporarily stops the execution of the
program, branches to a service program to process the I/O transfer and, after
completing it, returns to the task it was originally performing.
⚫ In this type of I/O, the computer does not check the flag. It continues to perform its task.
⚫ Whenever a device wants attention, it sends an interrupt signal to the CPU.
⚫ The CPU then deviates from what it was doing, stores the return address from the PC and
branches to the address of the subroutine.
⚫ There are two ways of choosing the branch address:
⚫ Vectored Interrupt
⚫ Non-vectored Interrupt
⚫ In a vectored interrupt, the source that interrupts the CPU provides the branch
information. This information is called the interrupt vector.
⚫ In a non-vectored interrupt, the branch address is assigned to a fixed location in
memory.
Priority Interrupt:
⚫ There are a number of I/O devices attached to the computer.
⚫ They are all capable of generating an interrupt.
⚫ When interrupts are generated by more than one device, a priority interrupt
system is used to determine which device is to be serviced first.
⚫ Devices with high speed transfer are given higher priority and slow devices are
given lower priority.
⚫ Establishing the priority can be done in two ways:
⚫ Using Software
⚫ Using Hardware
⚫ In the software method, a polling procedure is used to identify the highest priority source.
Polling Procedure :
⚫ There is one common branch address for all interrupts.
⚫ The branch address contains the code that polls the interrupt sources in sequence. The
highest priority source is tested first.
⚫ The service routine of the highest priority device that requested an interrupt is executed.
⚫ The disadvantage is that with a large number of I/O devices, the time required to poll
them can exceed the time available to service them.
Using Hardware:
⚫ The hardware priority system functions as an overall manager.
⚫ It accepts interrupt requests and determines the priorities.
⚫ To speed up the operation, each interrupting device has its own interrupt vector.
⚫ No polling is required; all decisions are made by the hardware priority interrupt
unit.
⚫ It can be established by a serial or parallel connection of interrupt lines.

Serial or Daisy Chaining Priority:


⚫ The device with the highest priority is placed first.
⚫ A device that wants attention sends an interrupt request to the CPU.
⚫ The CPU then sends the INTACK signal, which is applied to the PI (priority in) input of the first
device.
⚫ If the device had requested attention, it places its VAD (vector address) on the bus and
blocks the signal by placing 0 on PO (priority out).
⚫ If not, it passes the signal to the next device by placing 1 on PO (priority out).
⚫ This process continues until the appropriate device is found.
⚫ The device whose PI is 1 and PO is 0 is the device that sent the interrupt request.
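The propagation of the acknowledge signal along the chain can be sketched as follows. The function name daisy_chain and the vector addresses are hypothetical; device 0 is assumed to be the first (highest priority) device in the chain.

```python
def daisy_chain(requests, vads):
    """requests[i] is True if device i raised an interrupt (device 0 is first).
    Returns (vad_on_bus, trace) where trace lists (PI, PO) for each device."""
    pi = 1                        # INTACK from the CPU enters the first device's PI
    vad_on_bus = None
    trace = []
    for requested, vad in zip(requests, vads):
        if pi == 1 and requested:
            po = 0                # requesting device blocks the acknowledge signal
            vad_on_bus = vad      # and places its vector address on the bus
        else:
            po = pi               # otherwise PI is passed through to PO unchanged
        trace.append((pi, po))
        pi = po                   # PO of this device feeds PI of the next
    return vad_on_bus, trace

# Devices 1 and 2 both request service; device 1 is closer to the CPU.
vad, trace = daisy_chain([False, True, True], [0x10, 0x14, 0x18])
```

Device 1 ends up with PI = 1 and PO = 0, so its vector address reaches the CPU, while device 2 sees PI = 0 and has to wait.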

Parallel Priority Interrupt :


⚫ It consists of an interrupt register whose bits are set separately by the interrupting
devices.
⚫ Priority is established according to the position of the bits in the register.
⚫ A mask register is used to allow higher priority devices to interrupt
while a lower priority device is being serviced, or to disable all lower priority devices
while a higher priority device is being serviced.
⚫ Each interrupt bit and its corresponding mask bit are ANDed and applied to the priority
encoder.
⚫ The priority encoder generates two bits of the vector address.
⚫ Another output from it sets the IST (interrupt status) flip-flop.
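The masking and encoding steps can be sketched for four interrupt sources. The function name priority_encode is hypothetical, and position 0 is assumed to have the highest priority; the two returned bits correspond to the two vector-address bits produced by the encoder.

```python
def priority_encode(interrupt_reg, mask_reg):
    """Both arguments are lists of four bits; position 0 has the highest priority.
    Returns (vector_bits, ist). vector_bits is None when no request is enabled."""
    enabled = [i & m for i, m in zip(interrupt_reg, mask_reg)]   # AND each pair
    for position, bit in enumerate(enabled):
        if bit:
            vector_bits = (position >> 1, position & 1)          # two encoder outputs
            return vector_bits, 1                                # IST is set
    return None, 0                                               # nothing pending: IST = 0

# Sources 1 and 2 request service, but source 1 is masked off.
vector_bits, ist = priority_encode([0, 1, 1, 0], [1, 0, 1, 1])
```

Because the mask bit for source 1 is 0, the encoder selects source 2 even though source 1 has the higher position.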

The Execution process of Interrupt–Initiated I/O is represented in the flowchart:

Direct Memory Access (DMA):
In Direct Memory Access (DMA), the interface transfers data into and out of the
memory unit through the memory bus. The transfer of data between a fast storage
device such as a magnetic disk and memory is often limited by the speed of the CPU.
Removing the CPU from the path and letting the peripheral device manage the
memory buses directly improves the speed of transfer. This transfer technique
is called Direct Memory Access (DMA).
During the DMA transfer, the CPU is idle and has no control of the memory buses. A
DMA controller takes over the buses to manage the transfer directly between the
I/O device and memory.
The CPU may be placed in an idle state in a variety of ways. One common method
extensively used in microprocessor is to disable the buses through special control
signals such as:
 Bus Request (BR)
 Bus Grant (BG)
These are the two control signals in the CPU that facilitate the DMA transfer. The Bus Request
(BR) input is used by the DMA controller to request that the CPU give up control of the buses. When this input is
active, the CPU terminates the execution of the current instruction and places the
address bus, data bus and read/write lines into a high impedance state. High
impedance state means that the output is disconnected.

The CPU activates the Bus Grant (BG) output to inform the external DMA controller that it
can now take control of the buses to conduct memory transfers without the
processor.
When the DMA controller completes the transfer, it disables the Bus Request (BR) line. The CPU
then disables the Bus Grant (BG), takes control of the buses and returns to its normal
operation.

The transfer can be made in several ways:
i. DMA Burst
ii. Cycle Stealing

i) DMA Burst :- In DMA burst transfer, a block sequence consisting of a number of
memory words is transferred in a continuous burst while the DMA controller is
master of the memory buses.
ii) Cycle Stealing :- Cycle stealing allows the DMA controller to transfer one data word
at a time, after which it must return control of the buses to the CPU.
DMA Controller:
The DMA controller needs the usual circuits of an interface to communicate with the
CPU and I/O device. The DMA controller has three registers:
i. Address Register
ii. Word Count Register
iii. Control Register
Address Register :- The address register contains an address to specify the desired
location in memory.
Word Count Register :- The word count (WC) register holds the number of words to be transferred. The
register is incremented or decremented by one after each word transfer and internally tested for zero.
Control Register :- The control register specifies the mode of transfer.

The unit communicates with the CPU via the data bus and control lines. The registers
in the DMA controller are selected by the CPU through the address bus by enabling the DS
(DMA select) and RS (Register select) inputs. The RD (read) and WR (write) inputs
are bidirectional.
When the BG (Bus Grant) input is 0, the CPU can communicate with the DMA registers
through the data bus to read from or write to the DMA registers. When BG = 1, the
DMA controller can communicate directly with the memory by specifying an address on the
address bus and activating the RD or WR control.
DMA Transfer:
The CPU communicates with the DMA controller through the address and data buses as with
any interface unit. The DMA controller has its own address, which activates the DS and RS
lines. The CPU initializes the DMA controller through the data bus. Once the DMA controller receives
the start control command, it can begin the transfer between the peripheral and the memory.
When BG = 0, RD and WR are input lines allowing the CPU to communicate with
the internal DMA registers. When BG = 1, RD and WR are output lines from the
DMA controller to the random access memory to specify the read or write operation
of data.
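The roles of the address register and the word count register, and the difference between burst transfer and cycle stealing, can be sketched in software. The class DmaController and its method names are hypothetical. The sketch assumes the word count is decremented to zero and that the peripheral supplies at least as many words as the count.

```python
class DmaController:
    """Hypothetical software model of the DMA registers."""
    def __init__(self):
        self.address = 0          # address register
        self.word_count = 0       # word count register
        self.bus_request = 0      # BR line to the CPU

    def initialize(self, start_address, count):
        """Performed by the CPU through the data bus while BG = 0."""
        self.address = start_address
        self.word_count = count

    def transfer_in(self, device_words, memory, burst=True):
        """Move words from the peripheral into memory.
        Returns how many times the buses were requested."""
        words = iter(device_words)
        acquisitions = 0
        while self.word_count > 0:
            self.bus_request = 1              # request the buses, CPU answers with BG = 1
            acquisitions += 1
            while self.word_count > 0:
                memory[self.address] = next(words)
                self.address += 1             # next memory location
                self.word_count -= 1          # count down and test for zero
                if not burst:
                    break                     # cycle stealing: one word per bus grant
            self.bus_request = 0              # return the buses to the CPU
        return acquisitions

memory = [0] * 16
dma = DmaController()
dma.initialize(start_address=4, count=3)
burst_acquisitions = dma.transfer_in([7, 8, 9], memory, burst=True)

dma.initialize(start_address=8, count=3)
steal_acquisitions = dma.transfer_in([1, 2, 3], memory, burst=False)
```

In burst mode the controller holds the buses for the whole block, while in cycle stealing it requests them once per word so that the CPU can use the buses in between.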
Summary :
 Interface is the point where a connection is made between two different parts of a
system.
 The strobe control method of Asynchronous data transfer employs a single control
line to time each transfer.
 The handshaking method solves the problem of strobe method by introducing a
second control signal that provides a reply to the unit that initiates the transfer.
 In the Programmed I/O mode of data transfer, the operations are the result of I/O
instructions that are part of the computer program.
 In the Interrupt Initiated I/O method, an interrupt facility and an interrupt command are
used to inform the device about the start and end of the transfer.
 In Direct Memory Access (DMA), the interface transfers data into and out of
the memory unit through the memory bus.

Input-Output Processor:
⚫ It is a processor with direct memory access capability that communicates with I/O
devices.
⚫ The IOP is similar to a CPU except that it is designed to handle the details of I/O operations.
⚫ Unlike the DMA controller, which is initialized by the CPU, the IOP can fetch and execute its own
instructions.
⚫ IOP instructions are specially designed to handle I/O operations.

⚫ Memory occupies the central position and can communicate with each processor by
DMA.
⚫ CPU is responsible for processing data.
⚫ IOP provides the path for transfer of data between various peripheral devices and
memory.
⚫ Data formats of peripherals differ from those of the CPU and memory. The IOP handles such
differences.
⚫ Data are transferred from the IOP to memory by stealing one memory cycle.
⚫ Instructions that are read from memory by the IOP are called commands to distinguish
them from instructions that are read by the CPU.

Commands: instructions that are read from memory by an IOP
» Distinguished from instructions that are read by the CPU
» Commands are prepared by experienced programmers and are stored in memory
» Command word = IOP program

UNIT-V
Computer Arithmetic and Parallel Processing
Parallel processing:
• Parallel processing is a term used for a large class of techniques that
are used to provide simultaneous data-processing tasks for the purpose of increasing
the computational speed of a computer system.
⚫ The system may have two or more ALUs to be able to execute two or more
instructions at the same time.
⚫ The system may have two or more processors operating concurrently.
⚫ It can be achieved by having multiple functional units that perform the same or
different operations simultaneously.
• Example of parallel Processing:
– Multiple Functional Unit:

The execution unit is separated into eight functional units operating in parallel.
⚫ There are a variety of ways in which parallel processing can be classified:
 Internal Organization of Processor
 Interconnection structure between processors
 Flow of information through system

Architectural Classification:
– Flynn's classification
» Based on the multiplicity of Instruction Streams and Data Streams
» Instruction Stream
• Sequence of Instructions read from memory
» Data Stream
• Operations performed on the data in the processor

⚫ SISD represents an organization containing a single control unit, a processor unit and
a memory unit. Instructions are executed sequentially and the system may or may not
have internal parallel processing capabilities.
⚫ SIMD represents an organization that includes many processing units under the
supervision of a common control unit.
⚫ MISD structure is of only theoretical interest since no practical system has been
constructed using this organization.
⚫ MIMD organization refers to a computer system capable of processing several
programs at the same time.
The main difference between a multicomputer system and a multiprocessor system is that
the multiprocessor system is controlled by one operating system that provides
interaction between processors, and all the components of the system cooperate in the
solution of a problem.
⚫ Parallel Processing can be discussed under following topics:
 Pipeline Processing
 Vector Processing
 Array Processors

PIPELINING:
• Pipelining is a technique of decomposing a sequential process into suboperations, with each
suboperation being executed in a special dedicated segment that operates concurrently
with all other segments.
• Each segment performs partial processing dictated by the way task is partitioned.
• The result obtained from each segment is transferred to next segment.
• The final result is obtained when data have passed through all segments.
• Suppose we have to perform the following task:
• Each sub operation is to be performed in a segment within a pipeline. Each segment
has one or two registers and a combinational circuit.

OPERATIONS IN EACH PIPELINE STAGE:

• General Structure of a 4-Segment Pipeline

• Space-Time Diagram

The following diagram shows 6 tasks T1 through T6 executed in 4 segments.
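
A space-time diagram of this kind can be generated with a short script. The sketch below is only an illustration: the function name space_time_diagram and the "--" marker for an idle segment are assumptions made for this example, not notation from these notes. It relies on the fact that, in an ideal pipeline, task Ti enters segment 1 in clock cycle i and moves one segment forward every cycle.

```python
def space_time_diagram(k, n):
    """Build a space-time table for n tasks flowing through a k-segment pipeline.

    Returns one row per segment; each row has one entry per clock cycle.
    An entry is the task occupying that segment in that cycle, or "--" if idle.
    """
    total_cycles = k + n - 1
    rows = []
    for segment in range(1, k + 1):
        row = []
        for cycle in range(1, total_cycles + 1):
            # Task Ti is in segment s during cycle i + s - 1, so i = cycle - s + 1.
            task = cycle - segment + 1
            row.append(f"T{task}" if 1 <= task <= n else "--")
        rows.append(row)
    return rows


# 6 tasks in a 4-segment pipeline, as in the example above.
rows = space_time_diagram(k=4, n=6)
print("Clock cycle: " + " ".join(f"{c:>2}" for c in range(1, len(rows[0]) + 1)))
for seg, row in enumerate(rows, start=1):
    print(f"Segment {seg}:   " + " ".join(row))
```

Each row is one segment and each column is one clock cycle, so the table has k rows and k + n - 1 columns.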

PIPELINE SPEEDUP:
Consider the case where a k-segment pipeline is used to execute n tasks.
 n = 6 in the previous example
 k = 4 in the previous example
• Pipelined Machine (k stages, n tasks)
 The first task T1 requires k clock cycles to complete its operation, since there are k
segments
 The remaining n-1 tasks require n-1 clock cycles (one task completes per cycle)
 Total clock cycles for n tasks = k+(n-1) (9 in the previous example)
• Conventional Machine (Non-Pipelined)
 Cycles to complete each task in a nonpipelined machine = k
 For n tasks, the number of cycles required is n*k
• Speedup (S)
 S = Nonpipeline time / Pipeline time
 For n tasks: S = nk/(k+n-1)
 As n becomes much larger than k-1, k+n-1 approaches n; therefore, S = nk/n = k
PIPELINE AND MULTIPLE FUNCTION UNITS:
Example:
- 4-stage pipeline
- 100 tasks to be executed
- 1 task in a non-pipelined system takes 4 clock cycles
Pipelined System: k + n - 1 = 4 + 99 = 103 clock cycles
Non-Pipelined System: n*k = 100 * 4 = 400 clock cycles
Speedup: Sk = 400 / 103 = 3.88
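
The speedup arithmetic can be checked with a few lines of code. This is a minimal sketch that assumes every segment takes exactly one clock cycle and that there are no stalls; the function names are chosen for this illustration only.

```python
def pipeline_cycles(k, n):
    """Clock cycles for n tasks on a k-segment pipeline: k for the first task, 1 for each other."""
    return k + (n - 1)


def nonpipeline_cycles(k, n):
    """Clock cycles for n tasks when each task takes k cycles and tasks do not overlap."""
    return n * k


def speedup(k, n):
    """Ratio of non-pipelined time to pipelined time, S = nk / (k + n - 1)."""
    return nonpipeline_cycles(k, n) / pipeline_cycles(k, n)


# The example above: 4 stages, 100 tasks.
print(pipeline_cycles(4, 100), nonpipeline_cycles(4, 100), round(speedup(4, 100), 2))

# As n grows, the speedup approaches k.
for n in (10, 100, 1000, 100000):
    print(n, round(speedup(4, n), 3))
```

The loop shows the limiting behaviour: the speedup stays below k for any finite n and approaches k as n becomes very large.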

Types of Pipelining:
• Arithmetic Pipeline
• Instruction Pipeline

ARITHMETIC PIPELINE:
⚫ Pipeline arithmetic units are usually found in very high speed computers.
⚫ They are used to implement floating point operations.
⚫ We will now discuss the pipeline unit for the floating point addition and
subtraction.
⚫ The inputs to floating point adder pipeline are two normalized floating point
numbers.
⚫ A and B are mantissas and a and b are the exponents.
⚫ The floating point addition and subtraction can be performed in four segments.
Floating-point adder:
[1] Compare the exponents
[2] Align the mantissa
[3] Add/sub the mantissa
[4] Normalize the result

X = A x 10^a = 0.9504 x 10^3
Y = B x 10^b = 0.8200 x 10^2

1) Compare exponents: 3 - 2 = 1
2) Align mantissas
X = 0.9504 x 10^3
Y = 0.08200 x 10^3
3) Add mantissas
Z = 1.0324 x 10^3
4) Normalize result
Z = 0.10324 x 10^4
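
The four steps can be traced in software. The sketch below is an illustration only, not a description of the hardware: it works in decimal (like the example above) using Python's decimal module, handles addition only, and the function name fp_add is an assumption made for this example. In a real pipeline each step would be a separate segment with its own registers.

```python
from decimal import Decimal


def fp_add(mant_a, exp_a, mant_b, exp_b):
    """Add two decimal floating-point numbers given as (mantissa, exponent) pairs."""
    # Segment 1: compare the exponents.
    diff = exp_a - exp_b

    # Segment 2: align the mantissas by shifting the one with the smaller exponent right.
    if diff >= 0:
        mant_b = mant_b.scaleb(-diff)
        exp = exp_a
    else:
        mant_a = mant_a.scaleb(diff)
        exp = exp_b

    # Segment 3: add the aligned mantissas.
    mant = mant_a + mant_b

    # Segment 4: normalize so that 0.1 <= |mantissa| < 1 (unless the result is zero).
    while abs(mant) >= 1:
        mant = mant.scaleb(-1)
        exp += 1
    while mant != 0 and abs(mant) < Decimal("0.1"):
        mant = mant.scaleb(1)
        exp -= 1
    return mant, exp


# X = 0.9504 x 10^3 and Y = 0.8200 x 10^2 from the example above.
mantissa, exponent = fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2)
print(mantissa, exponent)
```

Decimal is used instead of binary floats so that the mantissa shifts are exact and the intermediate values match the hand calculation.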

Instruction Pipeline:
⚫ Pipeline processing can occur not only in the data stream but in the instruction
stream as well.
⚫ An instruction pipeline reads consecutive instructions from memory while previous
instructions are being executed in other segments.
⚫ This causes the instruction fetch and execute segments to overlap and perform
simultaneous operations.
Four Segment CPU Pipeline:
⚫ FI segment fetches the instruction.
⚫ DA segment decodes the instruction and calculates the effective address.
⚫ FO segment fetches the operand.
⚫ EX segment executes the instruction.


INSTRUCTION CYCLE:
Pipeline processing can occur also in the instruction stream. An instruction pipeline
reads consecutive instructions from memory while previous instructions are being
executed in other segments.
Six Phases* in an Instruction Cycle
[1] Fetch an instruction from memory
[2] Decode the instruction
[3] Calculate the effective address of the operand
[4] Fetch the operands from memory
[5] Execute the operation
[6] Store the result in the proper place
* Some instructions skip some phases
* Effective address calculation can be done as part of the decoding phase
* Storage of the operation result into a register is done automatically in the execution
phase
==> 4-Stage Pipeline


[1] FI: Fetch an instruction from memory
[2] DA: Decode the instruction and calculate the effective address of the operand
[3] FO: Fetch the operand
[4] EX: Execute the operation
Pipeline Conflicts: 3 major difficulties


1) Resource conflicts: memory access by two segments at the same time. Most of these
conflicts can be resolved by using separate instruction and data memories.
2) Data dependency: an instruction depends on the result of a previous
instruction, but this result is not yet available.
Example: an instruction with register indirect mode cannot proceed to fetch the
operand if the previous instruction is loading the address into the register.
3) Branch difficulties: branch and other instructions (interrupt, return, ...) that change the
value of the PC.
Handling Data Dependency:
⚫ This problem can be solved in the following ways:
 Hardware interlocks: A circuit that detects the conflict situation and delays
the instruction by enough cycles to resolve the conflict.
 Operand forwarding: Uses special hardware to detect the conflict and avoid it
by routing the data through a special path between pipeline segments.
 Delayed load: The compiler detects the data conflict and reorders the instructions as
necessary to delay the loading of the conflicting data, inserting no-operation
instructions.
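
The delayed-load idea can be sketched as a small compiler pass. Everything in this example is hypothetical: the instruction format (opcode, destination, sources), the register names, and the assumption that a loaded value becomes available one instruction later are chosen only for illustration. The sketch also only inserts no-operation instructions; a real compiler would first try to move a useful, independent instruction into the delay slot.

```python
# Hypothetical instruction format: (opcode, destination, tuple of source operands).
NOP = ("NOP", None, ())


def insert_delay_slots(program):
    """Insert a NOP after each LOAD whose destination is read by the very next instruction."""
    result = []
    for i, instr in enumerate(program):
        result.append(instr)
        opcode, dest, _ = instr
        if opcode == "LOAD" and i + 1 < len(program):
            _, _, next_sources = program[i + 1]
            if dest in next_sources:
                # The next instruction needs data that is still being loaded.
                result.append(NOP)
    return result


program = [
    ("LOAD", "R1", ("A",)),        # R1 <- M[A]
    ("LOAD", "R2", ("B",)),        # R2 <- M[B]
    ("ADD", "R3", ("R1", "R2")),   # R3 <- R1 + R2, needs R2 immediately after its load
    ("STORE", "C", ("R3",)),       # M[C] <- R3
]

for instr in insert_delay_slots(program):
    print(instr)
```

In this program only the second LOAD is followed directly by an instruction that reads its destination, so a single no-operation is inserted before the ADD.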
Handling of Branch Instructions:
⚫ Prefetch the target instruction.
⚫ Branch target buffer (BTB) included in the fetch segment of the pipeline
⚫ Branch prediction
⚫ Delayed branch
RISC Pipeline:
⚫ The simplicity of the instruction set is utilized to implement an instruction pipeline using a
small number of suboperations, with each being executed in a single clock cycle.
Since all operations are performed in registers, there is no need for effective address
calculation.
Three Segment Instruction Pipeline:
⚫ I: Instruction Fetch
⚫ A: ALU Operation
⚫ E: Execute Instruction
Delayed Load:

Delayed Branch:
Let us consider the program having the following 5 instructions
