
Computer Organization and Architecture
(ITec381)
Module

GHION TECHNOLOGY COLLEGE
Department of Computer Science

August 2011
Debre Markos
Table of Contents

Course Objectives
Learning Outcomes

Chapter 1  Boolean Algebra and Digital Logic Circuits
  1.1 Boolean Algebra
  1.2 Logic Gates
  1.3 Combinational Circuits
    1.3.1 Implementation of Boolean Functions
    1.3.2 Algebraic Simplification
    1.3.3 Karnaugh Maps
    1.3.4 The Quine–McCluskey Method
  1.4 Sequential Circuits
    1.4.1 Flip-Flops
  1.5 Review Questions

Chapter 2  Common Digital Components
  2.1 Integrated Circuits
  2.2 Common Combinational Circuits
    2.2.1 Adders
    2.2.2 Multiplexers
    2.2.3 Decoders
  2.3 Common Sequential Circuits
    2.3.1 Registers
    2.3.2 Binary Counters
  2.4 Memory Circuits
    2.4.1 Random-Access Memory
    2.4.2 Read-Only Memory
  2.5 Review Questions

Chapter 3  Data Representation
  3.1 Number Systems
    3.1.1 Converting from One Base to Another
  3.2 Representation of Integers
    3.2.1 Sign-Magnitude Representation
    3.2.2 One's Complement Integer Representation
    3.2.3 Two's Complement Integer Representation
    3.2.4 Excess-N Integer Representation
  3.3 Floating Point Numbers
    3.3.1 The IEEE Floating Point
  3.4 Other Binary Codes
    3.4.1 Binary Coded Decimal (BCD)
    3.4.2 Characters
  3.5 Review Questions

Chapter 4  Basic Computer Organization and Design
  4.1 Instruction Codes
    4.1.1 Stored Program Organization
    4.1.2 Direct and Indirect Addressing Modes
  4.2 Computer Registers
  4.3 Computer Instructions
  4.4 Timing and Control
  4.5 Instruction Cycle
  4.6 Memory-Reference Instructions
  4.7 Input-Output and Interrupt
  4.8 Design of Basic Computer
  4.9 Review Questions

Chapter 5  Central Processing Unit
  5.1 Introduction
  5.2 General Register Organizations
  5.3 Stack Organization
  5.4 Instruction Formats
  5.5 Addressing Mode
  5.6 Data Transfer and Manipulation
  5.7 Reduced Instruction Set Computer (RISC)
  5.8 Review Questions

Chapter 6  Parallel Processing
  6.1 Introduction to Parallel Processing
  6.2 Pipelining
    6.2.1 Arithmetic Pipeline
    6.2.2 Instruction Pipeline
    6.2.3 RISC Pipelines
  6.3 Vector Processing
  6.4 Array Processor
  6.5 Review Questions

Chapter 7  Input/Output Organization
  7.1 External Devices
  7.2 Input/Output Module
  7.3 Input/Output Techniques (Data Transfer Modes)
    7.3.1 Programmed I/O
    7.3.2 Interrupt-Driven I/O
    7.3.3 Direct Memory Access (DMA)
  7.4 Review Questions

Chapter 8  Memory Organization
  8.1 Characteristics of Memory Systems
  8.2 Memory Hierarchy
  8.3 Types of Storage Devices
    8.3.1 Main Memory
    8.3.2 Types of ROM
    8.3.3 Cache Memory
    8.3.4 External Memory
    8.3.5 Virtual Memory
  8.4 Memory Management in Operating Systems
  8.5 Review Questions

Appendix: The ASCII Character Set
References
ITec 381: Computer Organization and Architecture

Course Objectives
The overall objective of this course is to provide students with a fundamental understanding of the functional components of a computer system and how they are organized. The emphasis of the module is on the hardware aspects of a system and on how hardware is used during the execution of software. The course introduces students to the architecture (design) and organization of computer components.

Learning Outcomes
By the end of the module the student would be able to:

 Understand Boolean algebra and its application in digital circuit design


 Understand the operation of electronic logic elements
 Understand data representation in computers
 Understand the organization of a computer system in terms of its main components
 Understand the instruction cycle
 Understand the details of a simple microprocessor's (CPU's) operations
 Understand different processor instruction sets
 Understand how parallel processing can be realized
 Understand input/output mechanisms in a computer
 Understand the various memory units and the memory hierarchy


Chapter 1
Boolean Algebra and Digital Logic Circuits
Objectives:
After completing this chapter you will be able to:
 Understand Boolean algebra and its applications in circuit analysis
 Construct the truth table for a given Boolean function
 Be familiar with logic gates
 Simplify Boolean functions using different techniques
 Implement the circuit design of Boolean functions
 Be familiar with combinational circuits
 Be familiar with Sequential circuits

1.1 Boolean Algebra


The digital circuitry in digital computers and other digital systems is designed, and its behavior is analyzed, with the use of a mathematical discipline known as Boolean algebra. The name is in honor of an English mathematician, George Boole, who proposed the basic principles of this algebra in 1854 in his treatise, An Investigation of the Laws of Thought on Which to Found the Mathematical Theories of Logic and Probabilities. In 1938, Claude Shannon, a research assistant in the Electrical Engineering Department at M.I.T., suggested that Boolean algebra could be used to solve problems in relay-switching circuit design. Shannon's techniques were subsequently used in the analysis and design of electronic digital circuits.

Boolean algebra turns out to be a convenient tool in two areas:

 Analysis: It is an economical way of describing the function of digital circuitry.
 Design: Given a desired function, Boolean algebra can be applied to develop a simplified implementation of that function.

Boolean algebra makes use of logical variables and logical operators. The possible values for a logical variable are either TRUE or FALSE. For ease of use, these values are conventionally represented by 1 and 0, respectively. A system in which the only possible values are 0 and 1 is the binary number system, and these two values correspond directly to the two states of digital electronics; that is why Boolean algebra is used to analyze digital circuits. The logical operators of Boolean algebra are AND, OR, and NOT, which are symbolically represented by a dot (∙), a plus sign (+), and an overbar; in running text the complement of A is written here as A'. Often the dot is omitted in a Boolean expression; hence, A∙B is written as AB.

The operation AND yields true (binary value 1) if and only if both of its operands are true. The operation OR yields true if either or both of its operands are true. The unary operation NOT inverts the value of its operand.

Other useful derived operators are NAND, NOR, and XOR. NAND (NOT-AND) is a combination of AND and NOT; its output is the complement of AND. NOR (NOT-OR) is formed by combining OR and NOT; its output is the complement of OR. Similarly, XOR is equivalent to the expression X'Y + XY'. NOT is a unary operator; the others are binary operators. XOR yields 1 if and only if exactly one of its operands has the value 1. Except for NOT, all of these operators can be used with more than two variables. Table 1-1 shows the truth tables for these Boolean operators. A truth table shows the result of an operation for every possible combination of values of its variables.

Table 1-2 shows important identities of Boolean algebra. These identities are useful in simplifying Boolean functions in order to find simple circuit designs.

Table 1-1: Truth table for Boolean operators

X  Y | X' | X+Y | X∙Y | (X+Y)' | (X∙Y)' | X⊕Y
0  0 |  1 |  0  |  0  |   1    |   1    |  0
0  1 |  1 |  1  |  0  |   0    |   1    |  1
1  0 |  0 |  1  |  0  |   0    |   1    |  1
1  1 |  0 |  1  |  1  |   0    |   0    |  0
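The behavior of these operators can be checked mechanically. The following Python sketch (an illustration added here, not part of the original text; the operator table is our own) evaluates each operator for every combination of X and Y and reproduces the rows of Table 1-1:

    from itertools import product

    # Reproduce Table 1-1: evaluate each basic Boolean operator for
    # every combination of the inputs X and Y (0 = false, 1 = true).
    OPS = {
        "X'":      lambda x, y: 1 - x,        # NOT
        "X+Y":     lambda x, y: x | y,        # OR
        "X.Y":     lambda x, y: x & y,        # AND
        "(X+Y)'":  lambda x, y: 1 - (x | y),  # NOR
        "(X.Y)'":  lambda x, y: 1 - (x & y),  # NAND
        "XOR":     lambda x, y: x ^ y,
    }

    print("X Y", *OPS)
    for x, y in product((0, 1), repeat=2):
        print(x, y, *(f(x, y) for f in OPS.values()))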

Table 1-2: Basic identities of Boolean algebra

1.2 Logic Gates


The fundamental building block of all digital logic circuits is the gate. Logical functions are implemented by the interconnection of gates. A gate is an electronic circuit that produces an output signal that is a simple Boolean operation on its input signals. The basic gates used in digital logic are AND, OR, NOT, NAND, NOR, and XOR. Figure 1-1 depicts these six gates. Each gate is defined in three ways: graphic symbol, algebraic notation, and truth table. The symbology used here is the IEEE standard, IEEE Std 91. Note that the inversion (NOT) operation is indicated by a circle.


Each gate shown in Figure 1-1 has one or two inputs and one output. However, as already stated, all of the gates except NOT can have more than two inputs; thus, for example, they can be implemented with three inputs. When one or more of the values at the input are changed, the correct output signal appears almost instantaneously, delayed only by the propagation time of signals through the gate (known as the gate delay). In some cases, a gate is implemented with two outputs, one output being the negation of the other output.

Here we introduce a common term: we say that to assert a signal is to cause a signal line to make a transition from its logically false (0) state to its logically true (1) state. The true (1) state is either a high or low voltage state, depending on the type of electronic circuitry.
Figure 1-1: Basic logic gates

Typically, not all gate types are used in implementation. Design and fabrication are simpler if only one or two types of gates are used. Thus, it is important to identify functionally complete sets of gates. This means that any Boolean function can be implemented using only the gates in the set. The following are functionally complete sets:

 AND, OR, NOT
 AND, NOT
 OR, NOT
 NAND
 NOR

It should be clear that AND, OR, and NOT gates constitute a functionally complete set, because they represent the three operations of Boolean algebra. For the AND and NOT gates to form a functionally complete set, there must be a way to synthesize the OR operation from the AND and NOT operations. This can be done by applying DeMorgan's theorem:

A + B = (A'∙B')'
A OR B = NOT ((NOT A) AND (NOT B))

Similarly, the OR and NOT operations are functionally complete because they can be used to synthesize the AND operation.
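This synthesis is easy to confirm exhaustively. The short Python sketch below (illustrative only) builds OR from NOT and AND exactly as in DeMorgan's theorem and checks it against the native OR for all four input combinations:

    from itertools import product

    # DeMorgan: A + B == (A'.B')', i.e. OR built from AND and NOT only.
    def NOT(a): return 1 - a
    def AND(a, b): return a & b

    def OR_from_and_not(a, b):
        return NOT(AND(NOT(a), NOT(b)))

    assert all((a | b) == OR_from_and_not(a, b)
               for a, b in product((0, 1), repeat=2))
    print("A + B == (A'.B')' for all inputs")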

Figure 1-2 (a) shows how the AND, OR, and NOT functions can be implemented solely with NAND gates, and Figure 1-2 (b) shows the same thing for NOR gates. For this reason, digital circuits can be, and frequently are, implemented solely with NAND gates or solely with NOR gates. Although this may not be the minimum-gate implementation, it has the advantage of regularity, which can simplify the manufacturing process.

With gates, we have reached the most primitive circuit level of computer hardware. An examination of the transistor combinations used to construct gates departs from that realm and enters the realm of electrical engineering. For our purposes, however, we are content to describe how gates can be used as building blocks to implement the essential logical circuits of a digital computer.

Figure 1-2: (a) The use of NAND gates (b) The use of NOR gates


1.3 Combinational Circuits


A combinational circuit is an interconnected set of gates whose output at any time is a function only of the input at that time. As with a single gate, the appearance of the input is followed almost immediately by the appearance of the output, with only gate delays. In general terms, a combinational circuit consists of n binary inputs and m binary outputs. As with a gate, a combinational circuit can be defined in three ways:

 Truth table: For each of the 2^n possible combinations of input signals, the binary value of each of the m output signals is listed.
 Graphical symbols: The interconnected layout of gates is depicted.
 Boolean equations: Each output signal is expressed as a Boolean function of its input signals.

1.3.1 Implementation of Boolean Functions


Any Boolean function can be implemented in electronic form as a network of gates. For any given function, there are a number of alternative realizations. Consider the Boolean function represented by the truth table in Table 1-3. We can express this function by simply itemizing the combinations of values of A, B, and C that cause F to be 1:

F = A'BC' + A'BC + ABC'        (1.1)

There are three combinations of input values that cause F to be 1, and if any one of these combinations occurs, the result is 1. This form of expression, for self-evident reasons, is known as the sum of products (SOP) form. Figure 1-3 shows a straightforward implementation with AND, OR, and NOT gates.

Table 1-3: Truth table for the function in Equation (1.1)

A B C | F
0 0 0 | 0
0 0 1 | 0
0 1 0 | 1
0 1 1 | 1
1 0 0 | 0
1 0 1 | 0
1 1 0 | 1
1 1 1 | 0
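The itemizing step can be mechanized. As a sketch (the function and helper names are our own, assuming the truth table of Table 1-3), the following Python code derives the SOP form by emitting one product term per row in which F is 1:

    from itertools import product

    # Table 1-3 as a function: F is 1 for ABC = 010, 011, 110.
    def F(a, b, c):
        return int((a, b, c) in {(0, 1, 0), (0, 1, 1), (1, 1, 0)})

    # One product term per 1-row: write X for an input that is 1
    # and X' for an input that is 0.
    terms = []
    for a, b, c in product((0, 1), repeat=3):
        if F(a, b, c):
            terms.append("".join(n if bit else n + "'"
                                 for n, bit in zip("ABC", (a, b, c))))
    print(" + ".join(terms))   # A'BC' + A'BC + ABC'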


Figure 1-3: Sum of products implementation of Table 1-3

Another form can also be derived from the truth table. The SOP form expresses that the output is 1 if any of the input combinations that produce 1 is true. We can also say that the output is 1 if none of the input combinations that produce 0 is true. Thus,

F = (A'B'C')' ∙ (A'B'C)' ∙ (AB'C')' ∙ (AB'C)' ∙ (ABC)'

This can be rewritten using a generalization of DeMorgan's theorem:

(X∙Y∙Z)' = X' + Y' + Z'

Thus,

F = (A+B+C) ∙ (A+B+C') ∙ (A'+B+C) ∙ (A'+B+C') ∙ (A'+B'+C')        (1.2)

This is the product of sums (POS) form, which is illustrated in Figure 1-4. For clarity, NOT gates are not shown. Rather, it is assumed that each input signal and its complement are available. This simplifies the logic diagram and makes the inputs to the gates more readily apparent.

Thus, a Boolean function can be realized in either SOP or POS form. At this point, it would seem that the choice would depend on whether the truth table contains more 1s or 0s for the output function: the SOP has one term for each 1, and the POS has one term for each 0. However, there are other considerations:

 It is often possible to derive a simpler Boolean expression from the truth table than either SOP or POS.
 It may be preferable to implement the function with a single gate type (NAND or NOR).

The significance of the first point is that, with a simpler Boolean expression, fewer gates will be needed to implement the function. Three methods that can be used to achieve simplification are
 Algebraic simplification
 Karnaugh maps
 Quine–McCluskey tables

1.3.2 Algebraic Simplification


Algebraic simplification involves the application of the identities of Table 1-2 to reduce the Boolean expression to one with fewer elements. For example, Equation (1.1) can be simplified to:

F = A'B + BC'        (1.3)

or, even simpler,

F = B(A' + C')

Figure 1-4: Product of sums implementation of Table 1-3


This expression can be implemented as shown in Figure 1-5. The simplification of Equation (1.1) was done essentially by observation. For more complex expressions, a more systematic approach is needed.

Figure 1-5: Simplified implementation of Table 1-3
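Since the simplified expression was obtained by observation, it is worth confirming that it really is equivalent to Equation (1.1). A minimal Python check (illustrative; the function names are our own) compares the two forms over all eight input combinations:

    from itertools import product

    def f_sop(a, b, c):     # F = A'BC' + A'BC + ABC'   (Equation 1.1)
        return ((1-a) & b & (1-c)) | ((1-a) & b & c) | (a & b & (1-c))

    def f_simple(a, b, c):  # F = B(A' + C')            (factored Equation 1.3)
        return b & ((1-a) | (1-c))

    assert all(f_sop(a, b, c) == f_simple(a, b, c)
               for a, b, c in product((0, 1), repeat=3))
    print("Equation (1.1) equals B(A' + C') on all 8 input combinations")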

1.3.3 Karnaugh Maps

For purposes of simplification, the Karnaugh map is a convenient way of representing a Boolean function of a small number (up to four) of variables. The map is an array of 2^n squares, representing all possible combinations of values of n binary variables. Figure 1-6a shows the map of four squares for a function of two variables. It is essential for later purposes to list the combinations in the order 00, 01, 11, 10. Because the squares corresponding to the combinations are to be used for recording information, the combinations are customarily written above the squares. In the case of three variables, the representation is an arrangement of eight squares (Figure 1-6b), with the values for one of the variables to the left and for the other two variables above the squares. For four variables, 16 squares are needed, with the arrangement indicated in Figure 1-6c.

The map can be used to represent any Boolean function in the following way. Each square corresponds to a unique product in the sum-of-products form, with a 1 value corresponding to the variable and a 0 value corresponding to the NOT of that variable. Thus, the product AB' corresponds to the fourth square in Figure 1-6a. For each such product in the function, 1 is placed in the corresponding square. Given the truth table of a Boolean function, it is an easy matter to construct the map: for each combination of values of variables that produces a result of 1 in the truth table, fill in the corresponding square of the map with 1. Figure 1-6b shows the result for the truth table of Table 1-3. To convert from a Boolean expression to a map, it is first necessary to put the expression into what is referred to as canonical form: each term in the expression must contain each variable. So, for example, if we have Equation (1.3), we must first expand it into the full form of Equation (1.1) and then convert this to a map.

The labeling used in Figure 1-6d emphasizes the relationship between variables and the rows and columns of the map. Here the two rows embraced by the symbol A are those in which the variable A has the value 1; the rows not embraced by the symbol A are those in which A is 0; similarly for B, C, and D.

Once the map of a function is created, we can often write a simple algebraic expression for it by noting the arrangement of the 1s on the map. The principle is as follows. Any two squares that are adjacent differ in only one of the variables. If two adjacent squares both have an entry of 1, then the corresponding product terms differ in only one variable. In such a case, the two terms can be merged by eliminating that variable. For example, in Figure 1-7a, the two adjacent squares correspond to the two terms ABC'D and ABCD. Thus, the function expressed is

ABC'D + ABCD = ABD

(d) Simplified labeling of map

Figure 1-6: The use of Karnaugh maps to represent Boolean functions

This process can be extended in several ways. First, the concept of adjacency can be extended to include wrapping around the edge of the map. Thus, the top square of a column is adjacent to the bottom square, and the leftmost square of a row is adjacent to the rightmost square. These conditions are illustrated in Figures 1-7b and c. Second, we can group not just 2 squares but 2^n adjacent squares (that is, 2, 4, 8, etc.). The next three examples in Figure 1-7 show groupings of 4 squares. Note that in this case, two of the variables can be eliminated. The last three examples show groupings of 8 squares, which allow three variables to be eliminated.

We can summarize the rules for simplification as follows:

1. Among the marked squares (squares with a 1), find those that belong to a unique largest block of 1, 2, 4, or 8 and circle those blocks.
2. Select additional blocks of marked squares that are as large as possible and as few in number as possible, but include every marked square at least once. The results may not be unique in some cases. For example, if a marked square combines with exactly two other squares, and there is no fourth marked square to complete a larger group, then there is a choice to be made as to which of the two groupings to choose. When you are circling groups, you are allowed to use the same 1 value more than once.
3. Continue to draw loops around single marked squares, or pairs of adjacent marked squares, or groups of four, eight, and so on in such a way that every marked square belongs to at least one loop; then use as few of these blocks as possible to include all marked squares.
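The Gray-code ordering of rows and columns is what makes physically adjacent squares differ in exactly one variable. The following toy Python sketch (the kmap helper is our own invention) lays out the minterms of a four-variable function in that order:

    # Rows are AB and columns are CD, both in Gray-code order
    # (00, 01, 11, 10), so neighbors differ in exactly one variable.
    GRAY = ("00", "01", "11", "10")

    def kmap(minterms):
        """minterms: set of 4-bit strings ABCD where the function is 1."""
        print("AB\\CD  " + "   ".join(GRAY))
        for ab in GRAY:
            row = ["1" if ab + cd in minterms else "0" for cd in GRAY]
            print("  " + ab + "    " + "    ".join(row))

    # Example: ABC'D and ABCD occupy adjacent squares, so C drops out.
    kmap({"1101", "1111"})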

Figure 1-7: The use of Karnaugh maps

Figure 1-8a, based on Table 1-3, illustrates the simplification process. If any isolated 1s remain after the groupings, then each of these is circled as a group of 1. Finally, before going from the map to a simplified Boolean expression, any group of 1s that is completely overlapped by other groups can be eliminated. This is shown in Figure 1-8b. In this case, the horizontal group is redundant and may be ignored in creating the Boolean expression.

One additional feature of Karnaugh maps needs to be mentioned. In some cases, certain combinations of values of variables never occur, and therefore the corresponding output never occurs. These are referred to as "don't care" conditions. For each such condition, the letter "d" is entered into the corresponding square of the map. In doing the grouping and simplification, each "d" can be treated as a 1 or 0, whichever leads to the simplest expression.


Figure 1-8: Overlapping groups

1.3.4 The Quine–McCluskey Method

For more than four variables, the Karnaugh map method becomes increasingly cumbersome. With five variables, two maps are needed, with one map considered to be on top of the other in three dimensions to achieve adjacency. Six variables require the use of four maps in four dimensions! An alternative approach is a tabular technique, referred to as the Quine–McCluskey method. The method is suitable for programming on a computer to give an automatic tool for producing minimized Boolean expressions.

The method is best explained by means of an example. Consider the following expression:

Let us assume that this expression was derived from a truth table. We would like to produce a minimal expression suitable for implementation with gates.

The first step is to construct a table in which each row corresponds to one of the product terms of the expression. The terms are grouped according to the number of complemented variables. That is, we start with the term with no complements, if it exists, then all terms with one complement, and so on. Table 1-4 shows the list for the example expression, with horizontal lines used to indicate the grouping. For clarity, each term is represented by a 1 for each uncomplemented variable and a 0 for each complemented variable. Thus, we group terms according to the number of 1s they contain.

Table 1-4: First stage of the Quine–McCluskey method


The index column is simply the decimal equivalent and is useful in what follows. The next step is to find all pairs of terms that differ in only one variable, that is, all pairs of terms that are the same except that one variable is 0 in one of the terms and 1 in the other. Because of the way in which we have grouped the terms, we can do this by starting with the first group and comparing each term of the first group with every term of the second group. Then compare each term of the second group with all of the terms of the third group, and so on. Whenever a match is found, place a check next to each term, combine the pair by eliminating the variable that differs in the two terms, and add the combined term to a new list. Thus, for example, the terms ABCD and ABCD' are combined to produce ABC. This process continues until the entire original table has been examined. The result is a new table with the following entries:

ACD   ABC   ABD
BCD   ACD
ABC   BCD
ABD

The new table is organized into groups, as indicated, in the same fashion as the first table. The second table is then processed in the same manner as the first. That is, terms that differ in only one variable are checked and a new term produced for a third table. In this example, the third table that is produced contains only one term: BD. In general, the process would proceed through successive tables until a table with no matches was produced. In this case, this has involved three tables.

Once the process just described is completed, we have eliminated many of the possible terms of the expression. Those terms that have not been eliminated are used to construct a matrix, as illustrated in Table 1-5. Each row of the matrix corresponds to one of the terms that has not been eliminated (has no check) in any of the tables used so far. Each column corresponds to one of the terms in the original expression. An X is placed at each intersection of a row and a column such that the row element is "compatible" with the column element. That is, the variables present in the row element have the same value as the variables present in the column element. Next, circle each X that is alone in a column. Then place a square around each X in any row in which there is a circled X. If every column now has either a squared or a circled X, then we are done, and those row elements whose Xs have been marked constitute the minimal expression. Thus, in our example, the final expression is

ABC + ACD + ABC + ACD

In cases in which some columns have neither a circle nor a square, additional processing is required. Essentially, we keep adding row elements until all columns are covered.

Table 1-5: Last stage of Quine-McKluskey method

Let us summarize the Quine–McCluskey method to try to justify intuitively why it works. The first phase of the operation is reasonably straightforward. The process eliminates unneeded variables in product terms. Thus, the expression ABC' + ABC is equivalent to AB, because

ABC' + ABC = AB(C' + C) = AB

After the elimination of variables, we are left with an expression that is clearly equivalent to the original expression. However, there may be redundant terms in this expression, just as we found redundant groupings in Karnaugh maps. The matrix layout assures that each term in the original expression is covered and does so in a way that minimizes the number of terms in the final expression.
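The first phase is mechanical enough to program directly, which is precisely why the method suits automation. The following Python sketch (a simplified illustration: it produces the prime implicants but omits the covering matrix of Table 1-5) combines terms that differ in exactly one variable until no further combination is possible:

    from itertools import combinations

    # Terms are bit strings; '-' marks an eliminated variable.
    def combine(a, b):
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
            return a[:diff[0]] + "-" + a[diff[0] + 1:]
        return None

    def prime_implicants(terms):
        terms, primes = set(terms), set()
        while terms:
            used, nxt = set(), set()
            for a, b in combinations(sorted(terms), 2):
                c = combine(a, b)
                if c:
                    nxt.add(c)
                    used.update((a, b))
            primes |= terms - used   # unchecked terms are prime implicants
            terms = nxt
        return primes

    # Example: ABC' + ABC (110, 111) reduces to AB, shown as '11-'.
    print(prime_implicants({"110", "111"}))   # {'11-'}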

1.4 Sequential Circuits


Combinational circuits implement the essential functions of a digital computer. However, except for the special case of ROM, they provide no memory or state information, elements also essential to the operation of a digital computer. For the latter purposes, a more complex form of digital logic circuit is used: the sequential circuit. The current output of a sequential circuit depends not only on the current input, but also on the past history of inputs. Another and generally more useful way to view it is that the current output of a sequential circuit depends on the current input and the current state of that circuit.

In this section, we examine sequential circuits, taking flip-flops as an example. As will be seen, the sequential circuit makes use of combinational circuits.


1.4.1 Flip-Flops

The simplest form of sequential circuit is the flip-flop. There are a variety of flip-flops, all of which share two properties:

 The flip-flop is a bistable device, i.e., it has two stable states. It exists in one of two states and, in the absence of input, remains in that state. Thus, the flip-flop can function as a 1-bit memory.
 The flip-flop has two outputs, which are always the complements of each other. These are generally labeled Q and Q'.

The S–R Latch


Figure 1-9 shows a common configuration known as the S–R flip-flop or S–R latch. The circuit has two inputs, S (Set) and R (Reset), and two outputs, Q and Q', and consists of two NOR gates connected in a feedback arrangement.

Figure 1-9: The S-R latch implemented with NOR gates

First, let us show that the circuit is bistable. Assume that both S and R are 0 and that Q is 0. The inputs to the lower NOR gate are Q = 0 and S = 0. Thus, the output Q' = 1 means that the inputs to the upper NOR gate are Q' = 1 and R = 0, which has the output Q = 0. Thus, the state of the circuit is internally consistent and remains stable as long as S = R = 0. A similar line of reasoning shows that the state Q = 1, Q' = 0 is also stable for R = S = 0.

Thus, this circuit can function as a 1-bit memory. We can view the output Q as the "value" of the bit. The inputs S and R serve to write the values 1 and 0, respectively, into memory. To see this, consider the state Q = 0, Q' = 1, S = 0, R = 0. Suppose that S changes to the value 1. Now the inputs to the lower NOR gate are S = 1, Q = 0. After some time delay Δt, the output of the lower NOR gate will be Q' = 0 (see Figure 1-10). So, at this point in time, the inputs to the upper NOR gate become R = 0, Q' = 0.


Figure 1-10: NOR S-R latch timing diagram

After another gate delay of Δt, the output Q becomes 1. This is again a stable state. The inputs to the lower gate are now S = 1, Q = 1, which maintain the output Q' = 0. As long as S = 1 and R = 0, the outputs will remain Q = 1, Q' = 0. Furthermore, if S returns to 0, the outputs will remain unchanged.

The R input performs the opposite function. When R goes to 1, it forces Q = 0, Q' = 1 regardless of the previous state of Q and Q'. Again, a time delay of 2Δt occurs before the final state is established (Figure 1-10).

The S–R latch can be defined with a table similar to a truth table, called a characteristic table, which shows the next state or states of a sequential circuit as a function of current states and inputs. In the case of the S–R latch, the state can be defined by the value of Q. Table 1-6a shows the resulting characteristic table. Observe that the inputs S = 1, R = 1 are not allowed, because these would produce an inconsistent output (both Q and Q' equal 0). The table can be expressed more compactly, as in Table 1-6b. An illustration of the behavior of the S–R latch is shown in Table 1-6c.


Table 1-6: The S-R latch

(a) Characteristic Table

Current inputs   Current state   Next state
S R              Qn              Qn+1
0 0              0               0
0 0              1               1
0 1              0               0
0 1              1               0
1 0              0               1
1 0              1               1
1 1              0               -
1 1              1               -

(b) Simplified Characteristic Table

S R | Qn+1
0 0 | Qn
0 1 | 0
1 0 | 1
1 1 | -

(c) Response to Series of Inputs

t    | 0 1 2 3 4 5 6 7 8 9
S    | 1 0 0 0 0 0 0 0 1 0
R    | 0 0 0 1 0 0 1 0 0 0
Qn+1 | 1 1 1 0 0 0 0 0 1 1
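The characteristic table translates directly into a next-state function. The Python sketch below (illustrative) encodes Table 1-6b and replays the input series of Table 1-6c, starting from Q = 0:

    # Next state of the S-R latch: S = R = 1 is not allowed.
    def sr_next(s, r, q):
        if s == r == 1:
            raise ValueError("S = R = 1 is not allowed")
        if s:
            return 1      # set
        if r:
            return 0      # reset
        return q          # hold

    q = 0
    for s, r in [(1,0),(0,0),(0,0),(0,1),(0,0),
                 (0,0),(0,1),(0,0),(1,0),(0,0)]:
        q = sr_next(s, r, q)
        print(q, end=" ")   # 1 1 1 0 0 0 0 0 1 1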

Clocked S–R Flip-Flop

The output of the S–R latch changes, after a brief time delay, in response to a change in the input. This is
referred to as asynchronous operation. More typically, events in the digital computer are synchronized to
a clock pulse, so that changes occur only when a clock pulse occurs. Figure 1-11 shows this
arrangement. This device is referred to as a clocked S–R flip-flop. Note that the R and S inputs are
passed to the NOR gates only during the clock pulse.

Figure 1-11: Clocked S-R flip-flop


D Flip-Flop

One problem with the S–R flip-flop is that the condition S = 1, R = 1 must be avoided. One way to do this is to allow just a single input. The D flip-flop accomplishes this. Figure 1-12 shows a gate implementation and the characteristic table of the D flip-flop. By using an inverter, the non-clock inputs to the two AND gates are guaranteed to be the opposite of each other.

Figure 1-12: D flip-flop

The D flip-flop is sometimes referred to as the data flip-flop because it is, in effect, storage for one bit of
data. The output of the D flip-flop is always equal to the most recent value applied to the input. Hence, it
remembers and produces the last input. It is also referred to as the delay flip-flop, because it delays a 0
or 1 applied to its input for a single clock pulse. We can capture the logic of the D flip-flop in the
following truth table:

D Qn+1
0 0
1 1

J–K Flip-Flop

Like the S–R flip-flop, it has two inputs. However, in this case all possible combinations of input values are valid. Figure 1-13 shows a gate implementation of the J–K flip-flop, and Figure 1-14 shows its characteristic table (along with those for the S–R and D flip-flops). Note that the first three combinations are the same as for the S–R flip-flop. With no input asserted, the output is stable. If only the J input is asserted, the result is a set function, causing the output to be 1; if only the K input is asserted, the result is a reset function, causing the output to be 0. When both J and K are 1, the function performed is referred to as the toggle function: the output is reversed. Thus, if Q is 1 and 1 is applied to J and K, then Q becomes 0.
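The same style of next-state function captures the J–K behavior, including the toggle case. A minimal sketch (illustrative):

    # J-K flip-flop on a clock pulse: 00 hold, 10 set, 01 reset, 11 toggle.
    def jk_next(j, k, q):
        if j and k:
            return 1 - q   # toggle
        if j:
            return 1       # set
        if k:
            return 0       # reset
        return q           # hold

    print(jk_next(1, 1, 1))   # Q = 1 with J = K = 1 toggles to 0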


Figure 1-13: J-K flip-flop

Figure 1-14: Basic flip-flops


1.5 Review Questions


1. Construct the truth table for each of the following Boolean functions.
a) F = XY + XY(XY)
b) F = X+Y + XY

2. Simplify each of the following Boolean functions using algebraic properties, Karnaugh maps, and the Quine–McCluskey method.
a) F = X+Y + X(XY) + Y(XY)
b) F = X(Y+Z) + XY + XZ + YZ + X + Y + Z

3. Draw the logic diagram for the Boolean functions given in question 1 and the simplified functions in question 2.
4. Construct the SOP and POS diagrams of the Boolean functions of question 1.
5. Explain the difference between combinational and sequential circuits.
6. List the different types of flip-flops and draw the block diagram of each.


Chapter 2
Common Digital Components
Objectives:
After completing this chapter you will be:
 Familiar with common combinational circuits (adders, multiplexers, and decoders)
 Familiar with common sequential circuits (registers and binary counters)
 Familiar with memory circuits (RAM and ROM)
 Able to appreciate the characteristics and applications of different digital circuitry
 Familiar with integrated circuits (ICs)

2.1 Integrated Circuits


The integrated circuit (IC) is the basic building block of digital circuits. An integrated circuit is a small silicon semiconductor crystal, called a chip, containing the electronic components for the digital gates. The various gates are interconnected inside the chip to form the required circuit. The chip is mounted in a ceramic or plastic container, and connections are welded by thin gold wires to external pins to form the IC. Each vendor publishes a catalog that contains the exact description and all the necessary information about the ICs that it manufactures.

As the technology of ICs has improved, the number of gates that can be put in a single chip has increased. Small-scale integration (SSI) devices contain several (usually fewer than 10) independent gates in a single package. Medium-scale integration (MSI) devices contain approximately 10 to 200 gates in a single package. They usually perform specific elementary digital functions such as decoders, adders, and registers. Large-scale integration (LSI) devices contain between 200 and a few thousand gates in a single package. They include digital systems such as processors, memory chips, and programmable modules. Very-large-scale integration (VLSI) devices contain thousands of gates in a single package. Examples are large memory arrays and complex microcomputer chips.

Digital integrated circuits are classified not only by their logic operation but also by the specific circuit technology to which they belong. The circuit technology is referred to as a digital logic family. Each logic family has its own basic electronic circuit upon which more complex digital circuits and functions are developed. The basic circuit in each technology is either a NAND, a NOR, or an inverter gate. The electronic components employed in the construction of the basic circuit usually give the technology its name. Many different logic families of integrated circuits have been introduced commercially. The most popular are:

TTL    Transistor-transistor logic
ECL    Emitter-coupled logic
MOS    Metal-oxide semiconductor
CMOS   Complementary metal-oxide semiconductor


Because of their many advantages, integrated circuits are used exclusively to provide various digital
components needed in the design of computer systems. To understand the organization and design of
digital computers it is very important to be familiar with the various components encountered in
integrated circuits. For this reason, the most basic components are introduced in this chapter with an
explanation of their logical properties. These components provide a catalog of elementary digital
functional units commonly used as basic building blocks in the design of digital computers.

2.2 Common Combinational Circuits


2.2.1 Adders

Binary addition differs from Boolean algebra in that the result includes a carry term. Thus,

0 + 0 = 0;  0 + 1 = 1;  1 + 0 = 1;  1 + 1 = 10

However, addition can still be dealt with in Boolean terms. In Table 2-1a, we show the logic for adding two input bits to produce a 1-bit sum and a carry bit. This truth table could easily be implemented in digital logic. A digital arithmetic circuit that carries out the addition of a pair of bits is called a half adder. However, we are not interested in performing addition on just a single pair of bits. Rather, we wish to add two n-bit numbers along with a carry from a previous bitwise addition. Such a digital circuit is called a full adder. A combination of two half adders creates a full adder, and a multiple-bit adder can be built by putting together a set of adders so that the carry from one adder is provided as input to the next. Figure 2-1 shows the block diagram and the logic diagram for a half adder, and Figure 2-2 shows those of a full adder.

You can notice that the sum and carry columns of the half adder truth table are identical to the truth tables of the XOR and AND operations, respectively. Therefore, the sum and carry of a pair of bits can be implemented with an XOR and an AND gate, as shown in Figure 2-1 and Figure 2-2.

Table 2-1: Binary addition truth tables

(a) Single-bit addition (half adder)

A B | Sum Carry
0 0 |  0    0
0 1 |  1    0
1 0 |  1    0
1 1 |  0    1

(b) Addition with carry input (full adder)

Cin A B | Sum Cout
 0  0 0 |  0    0
 0  0 1 |  1    0
 0  1 0 |  1    0
 0  1 1 |  0    1
 1  0 0 |  1    0
 1  0 1 |  0    1
 1  1 0 |  0    1
 1  1 1 |  1    1

(a) Logic diagram (b) Block diagram

Figure 2-1: Half adder

(a) Logic diagram (b) Block diagram

Figure 2-2: Full adder

By combining a number of full adders, we can have the necessary logic to implement a multiple-bit adder, as shown in Figure 2-3. Note that because the output from each adder depends on the carry from the previous adder, there is an increasing delay from the least significant to the most significant bit. Each single-bit adder experiences a certain amount of gate delay, and this gate delay accumulates. For larger adders, the accumulated delay can become unacceptably high.
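This construction is straightforward to model in software. The following Python sketch (a behavioral model under our own naming, not a gate-accurate timing model) builds a full adder from two half adders and chains full adders into a ripple-carry adder:

    def half_adder(a, b):
        return a ^ b, a & b              # sum = XOR, carry = AND

    def full_adder(a, b, cin):
        s1, c1 = half_adder(a, b)        # a full adder is two half adders
        s2, c2 = half_adder(s1, cin)
        return s2, c1 | c2               # plus an OR for the carry-out

    def ripple_add(a_bits, b_bits):
        """Add two equal-length bit lists, least significant bit first."""
        carry, out = 0, []
        for a, b in zip(a_bits, b_bits):  # the carry ripples bit by bit
            s, carry = full_adder(a, b, carry)
            out.append(s)
        return out, carry

    # 6 + 3 = 9: 0110 + 0011 = 1001, bits listed LSB first.
    print(ripple_add([0, 1, 1, 0], [1, 1, 0, 0]))   # ([1, 0, 0, 1], 0)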


Figure 2-3: Construction of a 32-bit adder using 8-bit adders

2.2.2 Multiplexers

The multiplexer connects multiple inputs to a single output. At any time, one of the inputs is selected to
be passed to the output. A general block diagram representation is shown in Figure 2-4. This represents
a 4-to-1 multiplexer. There are four input lines, labeled D0, D1, D2, and D3. One of these lines is
selected to provide the output signal F. To select one of the four possible inputs, a 2-bit selection code is
needed, and this is implemented as two select lines labeled S1 and S2.

      D0 ──┐
      D1 ──┤ 4-to-1
      D2 ──┤  MUX ──── F
      D3 ──┘
           S2  S1

Figure 2-4: Block diagram of a 4-to-1 multiplexer

Table 2-2: 4-to-1 multiplexer truth table


S2 S1 | F
0  0  | D0
0  1  | D1
1  0  | D2
1  1  | D3


An example 4-to-1 multiplexer is defined by the truth table in Table 2-2. This is a simplified form of a truth table. Instead of showing all possible combinations of input variables, it shows the output as data from line D0, D1, D2, or D3. Figure 2-5 shows an implementation using AND, OR, and NOT gates. S1 and S2 are connected to the AND gates in such a way that, for any combination of S1 and S2, three of the AND gates will output 0. The fourth AND gate will output the value of the selected line, which is either 0 or 1. Thus, three of the inputs to the OR gate are always 0, and the output of the OR gate will equal the value of the selected input gate. Using this regular organization, it is easy to construct multiplexers of size 8-to-1, 16-to-1, and so on.
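The AND-OR structure of Figure 2-5 corresponds to the Boolean expression F = S2'S1'D0 + S2'S1D1 + S2S1'D2 + S2S1D3. A minimal Python rendering of this expression (illustrative only):

    # 4-to-1 multiplexer: each data line is ANDed with the matching decode
    # of the select bits, and the four products are ORed together.
    def mux4(d0, d1, d2, d3, s1, s2):
        return ((d0 & (1 - s2) & (1 - s1)) |
                (d1 & (1 - s2) & s1) |
                (d2 & s2 & (1 - s1)) |
                (d3 & s2 & s1))

    # With S2 S1 = 1 0, the output follows D2 (see Table 2-2).
    print(mux4(0, 0, 1, 0, 0, 1))   # prints 1 because D2 = 1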

Figure 2-5: Implementation of a 4-to-1 multiplexer

Multiplexers are used in digital circuits to control signal and data routing. An example is the loading of the program counter (PC). The value to be loaded into the program counter may come from one of several different sources:

 A binary counter, if the PC is to be incremented for the next instruction


 The instruction register, if a branch instruction using a direct address has just been executed
 The output of the ALU, if the branch instruction specifies the address using a displacement mode

These various inputs could be connected to the input lines of a multiplexer, with the PC connected to the
output line. The select lines determine which value is loaded into the PC. Because the PC contains
multiple bits, multiple multiplexers are used, one per bit. Figure 2-6 illustrates this for 16-bit addresses.


Figure 2-6: Multiplexer input to program counter

2.2.3 Decoders

A decoder is a combinational circuit with a number of output lines, only one of which is selected at any time, depending on the pattern of input lines. In general, a decoder has n inputs and 2^n outputs. Figure 2-7 shows a decoder with three inputs and eight outputs.

Decoders find many uses in digital computers. One example is address decoding. Suppose we wish to construct a 1K-byte memory using four 256-byte RAM chips. We want a single unified address space, which can be broken down as follows:

Address      Chip
0000 – 00FF   0
0100 – 01FF   1
0200 – 02FF   2
0300 – 03FF   3


Figure 2-7: Decoder with 3 inputs and 2^3 = 8 outputs

Each chip requires 8 address lines, and these are supplied by the lower-order 8 bits of the address. The
higher-order 2 bits of the 10-bit address are used to select one of the four RAM chips. For this purpose, a
2-to-4 decoder is used whose output enables one of the four chips, as shown in Figure 2-8.
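The scheme can be sketched in a few lines of Python (the function names are our own; addresses are 10 bits, with the top two bits selecting the chip):

    # 2-to-4 decoder: exactly one output line is 1 for each input pattern.
    def decoder2to4(a1, a0):
        return [int((a1, a0) == (0, 0)), int((a1, a0) == (0, 1)),
                int((a1, a0) == (1, 0)), int((a1, a0) == (1, 1))]

    def chip_for(address):
        a1, a0 = (address >> 9) & 1, (address >> 8) & 1  # top 2 of 10 bits
        return decoder2to4(a1, a0).index(1)              # enabled chip

    print(hex(0x02A5), "-> chip", chip_for(0x02A5))      # 0x2a5 -> chip 2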

With an additional input line, a decoder can be used as a demultiplexer. The demultiplexer performs the inverse function of a multiplexer; it connects a single input to one of several outputs. This is shown in Figure 2-9. As before, the n inputs are decoded to select a single one of 2^n outputs. All of the output lines are ANDed with a data input line. Thus, the n inputs act as an address to select a particular output line, and the value on the data input line (0 or 1) is routed to that output line.

The configuration in Figure 2-9 can be viewed in another way. Change the label on the new line from Data Input to Enable. This allows for the control of the timing of the decoder. The decoded output appears only when the encoded input is present and the enable line has a value of 1.


Figure 2-8: Address decoding

Figure 2-9: Implementation of a demultiplexer using a decoder

2.3 Common Sequential Circuits


2.3.1 Registers

As an example of the use of flip-flops, let us first examine one of the essential elements of the CPU: the
register. As we know, a register is a digital circuit used within the CPU to store one or more bits of data.
Two basic types of registers are commonly used: parallel registers and shift registers.

Parallel Registers

A parallel register consists of a set of 1-bit memories that can be read or written simultaneously. It is used to store data. The 8-bit register of Figure 2-10 illustrates the operation of a parallel register using D flip-flops. A control signal, labeled load, controls writing into the register from the signal lines D11 through D18. These lines might be the output of multiplexers, so that data from a variety of sources can be loaded into the register.

Figure 2-10: 8-bit parallel register

Shift Registers

A shift register accepts and/or transfers information serially. Consider, for example, Figure 2-11, which
shows a 5-bit shift register constructed from clocked D flip-flops. Data are input only to the leftmost
flip-flop. With each clock pulse, data are shifted to the right one position, and the rightmost bit is
transferred out.

Shift registers can be used to interface to serial I/O devices. In addition, they can be used within the
ALU to perform logical shift and rotate functions. In this latter capacity, they need to be equipped with
parallel read/write circuitry as well as serial.
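A behavioral Python sketch of the shift register (illustrative; one call models one clock pulse):

    # 5-bit shift register: on each clock pulse every bit moves one
    # position right and the rightmost bit is shifted out.
    def shift(register, data_in):
        out = register[-1]                    # serial output
        return [data_in] + register[:-1], out

    reg = [0, 0, 0, 0, 0]
    for bit in [1, 0, 1, 1, 0]:               # serial input, one bit per clock
        reg, out = shift(reg, bit)
        print(reg, "out:", out)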

Figure 2-11: 5-bit shift register


2.3.2 Binary Counters

Another useful category of sequential circuit is the counter. A counter is a register whose value is easily
incremented by 1 modulo the capacity of the register; that is, after the maximum value is achieved the
next increment sets the counter value to 0. Thus, a register made up of n flip-flops can count up to 2ⁿ - 1.
An example of a counter in the CPU is the program counter.

Counters can be designated as asynchronous or synchronous, depending on the way in which they
operate. Asynchronous counters are relatively slow because the output of one flip-flop triggers a change
in the status of the next flip-flop. In a synchronous counter, all of the flip-flops change state at the same
time. Because the latter type is much faster, it is the kind used in CPUs.

Ripple Counter

An asynchronous counter is also referred to as a ripple counter, because the change that occurs to
increment the counter starts at one end and “ripples” through to the other end. Figure 2-12 shows an
implementation of a 4-bit counter using J–K flip-flops, together with a timing diagram that illustrates its
behavior. The timing diagram is idealized in that it does not show the propagation delay that occurs as
the signals move down the series of flip-flops. The output of the leftmost flip-flop (Q₀) is the least
significant bit. The design could clearly be extended to an arbitrary number of bits by cascading more
flip-flops.

In the illustrated implementation, the counter is incremented with each clock pulse. The J and K inputs
to each flip-flop are held at a constant 1. This means that, when there is a clock pulse, the output at Q
will be inverted (1 to 0; 0 to 1). Note that the change in state is shown as occurring with the falling edge
of the clock pulse; this is known as an edge-triggered flip-flop. Using flip-flops that respond to the
transition in a clock pulse rather than the pulse itself provides better timing control in complex circuits.
If one looks at patterns of output for this counter, it can be seen that it cycles through 0000, 0001, . . .,
1110, 1111, 0000, and so on.
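
A behavioral sketch of such a counter in Python (the class name is ours; it models only the modulo-2ⁿ counting, not the flip-flop internals):

class BinaryCounter:
    """Behavioral model of an n-bit binary counter (modulo 2**n)."""

    def __init__(self, n_bits):
        self.n_bits = n_bits
        self.value = 0

    def clock(self):
        # After the maximum value 2**n - 1, the next pulse wraps to 0.
        self.value = (self.value + 1) % (2 ** self.n_bits)
        return format(self.value, "0{}b".format(self.n_bits))

counter = BinaryCounter(4)
print(" ".join(counter.clock() for _ in range(18)))
# 0001 0010 ... 1111 0000 0001 0010 -- the count wraps after 1111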

Synchronous Counters

The ripple counter has the disadvantage of the delay involved in changing value, which is proportional
to the length of the counter. To overcome this disadvantage, CPUs make use of synchronous counters, in
which all of the flip-flops of the counter change at the same time.

For a 3-bit counter, three flip-flops will be needed. Let us use J–K flip-flops. Label the uncomplemented
output of the three flip-flops A, B, C respectively, with C representing the least significant bit. The first
step is to construct a truth table that relates the J–K inputs and outputs, to allow us to design the overall
circuit. Such a truth table is shown in Table 2-3. The first three columns show the possible combinations
of outputs A, B, and C. They are listed in the order that they will appear as the counter is incremented.
Each row lists the current value of A, B, C and the inputs to the three flip-flops that will be required to
reach the next value of A, B, C.

Figure 2-12: Ripple counter


Table 2-3: Truth table for a synchronous counter


A B C   Ja Ka   Jb Kb   Jc Kc
0 0 0    0  d    0  d    1  d
0 0 1    0  d    1  d    d  1
0 1 0    0  d    d  0    1  d
0 1 1    1  d    d  1    d  1
1 0 0    d  0    0  d    1  d
1 0 1    d  0    1  d    d  1
1 1 0    d  0    d  0    1  d
1 1 1    d  1    d  1    d  1

To understand the way in which the truth table of Table 2-3 is constructed, it may be helpful to recast
the characteristic table for the J–K flip-flop. Recall that this table was presented as follows:

J K   Qn+1
0 0   Qn (no change)
0 1   0 (reset)
1 0   1 (set)
1 1   Qn' (toggle)

In this form, the table shows the effect that the J and K inputs have on the output. Now consider the
following organization of the same information:

Qn   Qn+1   J K
0    0      0 d
0    1      1 d
1    0      d 1
1    1      d 0

In this form, the table provides the value of the inputs needed when the present output and the next output are
known. This is exactly the information needed to design the counter or, indeed, any sequential circuit. In
this form, the table is referred to as an excitation table.

Let us return to Table 2-3. Consider the first row. We want the value of A to remain 0, the value of B to
remain 0, and the value of C to go from 0 to 1 with the next application of a clock pulse. The excitation
table shows that to maintain an output of 0, we must have inputs of J = 0 and don’t care for K. To effect
a transition from 0 to 1, the inputs must be J = 1 and K = d. These values are shown in the first row of
the table. By similar reasoning, the remainder of the table can be filled in.
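
The construction of Table 2-3 can also be checked mechanically. The short Python sketch below (our own illustration, not part of any standard tool) applies the excitation-table rules to every transition of the 3-bit count sequence and regenerates the rows of the table:

def jk_inputs(q_now, q_next):
    # Excitation table: the (J, K) pair that moves a J-K flip-flop from
    # q_now to q_next; 'd' marks a don't-care input.
    return {(0, 0): (0, "d"),
            (0, 1): (1, "d"),
            (1, 0): ("d", 1),
            (1, 1): ("d", 0)}[(q_now, q_next)]

def bits(v):
    # (A, B, C) outputs for a count value, C being the least significant.
    return (v >> 2) & 1, (v >> 1) & 1, v & 1

for value in range(8):
    a, b, c = bits(value)
    na, nb, nc = bits((value + 1) % 8)
    print(a, b, c, jk_inputs(a, na), jk_inputs(b, nb), jk_inputs(c, nc))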

Having constructed the truth table of Table 2-3, we see that the table shows the required values of all of
the J and K inputs as functions of the current values of A, B, and C. With the aid of Karnaugh maps, we
can develop Boolean expressions for these six functions. This is shown in part b of Figure 2-13. For
example, the Karnaugh map for the variable Ja (the J input to the flip-flop that produces the A output)
yields the expression Ja = BC. When all six expressions are derived, it is a straightforward matter to
design the actual circuit, as shown in part c of Figure 2-13.


2.4 Memory Circuits


2.4.1 Random-Access Memory
In random access memory (RAM) the memory cells can be accessed for information transfer from any
desired random location. That is, the process of locating a word in memory is the same and requires an
equal amount of time no matter where the cells are located physically in memory: thus the name
“random access.” RAM is made of sequential circuit components.
Communication between a memory and its environment is achieved through data input and output lines,
address selection lines, and control lines that specify the direction of transfer. A block diagram of a
RAM unit is shown in Figure 2-14. The n data input lines provide the information to be stored in
memory, and the n data output lines supply the information coming out of memory. The k address lines
provide a binary number of k bits that specify a particular word chosen among the 2ᵏ available inside the
memory. The two control inputs specify the direction of transfer desired.

The two operations that a RAM can perform are the write and read operations. The write signal specifies
a transfer-in operation and the read signal specifies a transfer-out operation. On accepting one of these
control signals, the internal circuits inside the memory provide the desired function. The steps that
must be taken to transfer a new word to be stored into memory are as follows:

1. Apply the binary address of the desired word to the address lines.
2. Apply the data bits that must be stored in memory to the data input lines.
3. Activate the write input.

The memory unit will then take the bits presently available in the input data lines and store them in the
word specified by the address lines.


Figure 2-13: Design of a synchronous counter

Figure 2-14: Block diagram of random access memory (RAM)


(The memory unit contains 2ᵏ words of n bits each; it has n data input lines, k address lines, read and
write control inputs, and n data output lines.)

The steps that must be taken for the purpose of transferring a stored word out of memory are as follows:


1. Apply the binary address of the desired word into the address lines.
2. Activate the read input.

The memory unit will then take the bits from the word that has been selected by the address and apply
them to the output data lines. The content of the selected word does not change after reading.
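
The two operations can be modeled in Python as follows (a behavioral sketch with our own class and method names; a real RAM is, of course, a hardware circuit):

class RAM:
    """Behavioral model of a RAM unit: 2**k words of n bits each."""

    def __init__(self, k, n):
        self.n = n
        self.words = [0] * (2 ** k)

    def write(self, address, data):
        # Steps 1-3 above: address and data applied, write activated.
        self.words[address] = data & ((1 << self.n) - 1)

    def read(self, address):
        # Non-destructive: the selected word is unchanged by a read.
        return self.words[address]

ram = RAM(k=10, n=16)                  # 1K words, 16 bits per word
ram.write(0x3A, 0b1010001100001111)
print(format(ram.read(0x3A), "016b"))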

2.4.2 Read-Only Memory

Combinational circuits are often referred to as “memoryless” circuits, because their output depends only
on their current input and no history of prior inputs is retained. However, there is one sort of memory
that is implemented with combinational circuits, namely read-only memory (ROM).

Recall that a ROM is a memory unit that performs only the read operation. This implies that the binary
information stored in a ROM is permanent and was created during the fabrication process. Thus, a given
input to the ROM (address lines) always produces the same output (data lines). Because the outputs are a
function only of the present inputs, the ROM is in fact a combinational circuit. Figure 2-15 shows the
block diagram for ROM.

A ROM can be implemented with a decoder and a set of OR gates. As an example, consider Table 2-4.
This can be viewed as a truth table with four inputs and four outputs. For each of the 16 possible input
values, the corresponding set of values of the outputs is shown. It can also be viewed as defining the
contents of a 64-bit ROM consisting of 16 words of 4 bits each. The four inputs specify an address, and
the four outputs specify the contents of the location specified by the address. Figure 2-16 shows how this
memory could be implemented using a 4-to-16 decoder and four OR gates.

Figure 2-15: Block diagram of read only memory (ROM)


(The figure shows k address input lines entering an m × n ROM, where m = 2ᵏ, with n data output lines.)

Table 2-4: Truth table for a ROM

Input      Output
0 0 0 0    0 0 0 0
0 0 0 1    0 0 0 1
0 0 1 0    0 0 1 1
0 0 1 1    0 0 1 0
0 1 0 0    0 1 1 0
0 1 0 1    0 1 1 1
0 1 1 0    0 1 0 1
0 1 1 1    0 1 0 0
1 0 0 0    1 1 0 0
1 0 0 1    1 1 0 1
1 0 1 0    1 1 1 1
1 0 1 1    1 1 1 0
1 1 0 0    1 0 1 0
1 1 0 1    1 0 1 1
1 1 1 0    1 0 0 1
1 1 1 1    1 0 0 0
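Because a ROM is a combinational circuit, its behavior can be modeled as a pure lookup: a given address always produces the same output. A Python sketch using the contents of Table 2-4 (our own illustration):

# Contents of the 16-word x 4-bit ROM defined by Table 2-4,
# indexed by the 4-bit input address.
ROM = [0b0000, 0b0001, 0b0011, 0b0010, 0b0110, 0b0111, 0b0101, 0b0100,
       0b1100, 0b1101, 0b1111, 0b1110, 0b1010, 0b1011, 0b1001, 0b1000]

def rom_read(address):
    # A given input (address) always produces the same output (data).
    return ROM[address]

print(format(rom_read(0b0011), "04b"))   # -> 0010, as in Table 2-4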

Types of ROM

The required paths in a ROM may be programmed in three different ways. The first, mask
programming, is done by the semiconductor company during the last fabrication process of the unit. The
procedure for fabricating a ROM requires that the customer fill out the truth table that he wishes the
ROM to satisfy. The truth table may be submitted in a special form provided by the manufacturer or in a
specified format on a computer output medium. The manufacturer makes the corresponding mask for the
paths to produce the 1’s and 0’s according to the customer’s truth table. This procedure is costly because
the vendor charges the customer a special fee for custom masking the particular ROM. For this reason,
mask programming is economical only if a large quantity of the same ROM configuration is to be
ordered.

Figure 2-16: A 64-bit ROM



For small quantities it is more economical to use a second type of ROM called a programmable read-
only memory or PROM. When ordered, PROM units contain all the fuses intact, giving all 1’s in the bits
of the stored words. The fuses in the PROM are blown by application of current pulses through the
output terminals for each address. A blown fuse defines a binary 0 state, and an intact fuse gives a
binary 1 state. This allows users to program PROMs in their own laboratories to achieve the desired
relationship between input addresses and stored words. Special instruments called PROM programmers
are available commercially to facilitate this procedure. In any case, all procedures for programming
ROMs are hardware procedures even though the word “programming” is used.

The hardware procedure for programming ROMs or PROMs is irreversible, and once programmed, the
fixed pattern is permanent and cannot be altered. Once a bit pattern has been established, the unit must
be discarded if the bit pattern is to be changed. A third type of ROM available is called erasable PROM
or EPROM. The EPROM can be restored to its initial state even though its fuses have been blown
previously. When the EPROM is placed under a special ultraviolet light for a given period of time, the
shortwave radiation discharges the internal gates that serve as fuses. After erasure the EPROM returns to
its initial state and can be reprogrammed to a new set of words. Certain PROMs can be erased with
electrical signals instead of ultraviolet light. These PROMs are called electrically erasable PROM or
EEPROM.

2.5 Review Questions


1. Explain the purpose of the following digital circuits. Describe how many inputs and outputs each has.
 Half Adder
 Full Adder
 Multiplexer
 Decoder
 Binary Counter

2. Draw the block diagram for the digital circuits listed in question 1.

3. Classify the following digital circuits into two groups, as combinational or sequential circuits.
 Half Adder
 Full Adder
 Multiplexer
 Decoder
 Binary Counter
 Register
 RAM
 ROM

4. If the input lines of the multiplexer in Figure 2-4 accept input as D0 = 1, D1 = 1, D2 = 0, and D3 = 1,
and the select lines are set as S1 = 1, and S2 = 1, which one of the input lines is selected?

5. If the 4 input lines (A, B, C, and D) of a 4-to-16 decoder are 1001, then which of the output lines (D0
– D15) is selected?


Chapter 3
Data Representation
Objectives:
After completing this chapter you will be able to:
 Understand the binary, octal, and hexadecimal number systems
 Perform conversions among the decimal, binary, octal, and hexadecimal number systems
 Understand different integer representations
 Understand floating point number representations
 Perform computer arithmetic
 Understand representation of characters

Data in computers is represented in binary form. The represented data can be a number, text, movie, color
(picture), sound, or anything else; it is up to the application software that presents the data to portray it
accordingly. We enter data into a computer using letters, digits, and special symbols, but inside the
computer system unit there is no color, letter, digit, or any other character.

Just like any other electrical device, computers understand and respond to only the flow of electrical
charge. They also have storage devices that work based on magnetism. This shows that the overall
structure of computers work only in binary conditions (the semi-conductors are conducting or not
conducting, a switch is closed or opened, a magnetic spot is magnetized or demagnetized). Hence, data
must be represented in the form of binary code that has a corresponding electrical signal.

The form of binary data representation system we are seeking is similar to the binary number system in
mathematics. Nevertheless we humans are not accustomed to the use of binary numbers. The main focus
of this chapter is on how data is represented in computers. Since understanding of binary number system
is essential to understand binary data representation, conversion of numbers from a given number base
to another base, is also discussed here. The number systems (bases) we will discuss are: decimal, binary,
octal, and hexadecimal.

3.1 Number systems


Basically, there are two types of number systems.

I. Non-positional number system: The value of a symbol (digit) is the same regardless of its position in
the number.

II. Positional number system: The value of a symbol in the number is determined by its position, the
symbol and the base of the number system. In all positional number systems, the base has the
following properties


1. It determines the number of different symbols it has. For example, there are 10 different symbols
in base 10 (decimal) number system and there are 2 symbols in base 2 (binary) number system.
2. The maximum value of a single digit is one less than the base value. For example, the largest
single digit number in the decimal (base 10) number system is 9.
3. The positional value of each symbol is expressed by a power of the base. For example, the value
of the symbol 7 in the decimal number 75 is 70 (7 × 10¹) while the value of 7 in the decimal
number 756 is 700 (7 × 10²). The source of the variation is the position of the digit in the number,
and this value is expressed as a multiple of powers of the base.

In positional number system, the base of a number is indicated by writing it as a subscript at the right
side of the number. For instance the number 251₁₀ is the decimal number two hundred fifty-one, and
251₈ is two-five-one octal (note that it is not read as two hundred fifty-one octal). Often the base of a
decimal number is not indicated. If the base of a number is not shown, it is assumed to be a decimal
number.

Decimal number system

The decimal number system, also called the base 10 number system, is the number system we use in our
day-to-day life. The preference for this number system by humans is attributed to the fact that humans
have 10 fingers and are believed to have started counting with them.

The decimal number system has 10 different symbols identified as 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. All
decimal numbers are written as a combination of these 10 digits.

Binary number system

Although the decimal number system is easily understood by humans, it cannot be used to represent data in
computers because there are only two (binary) states in a computer system. On the other hand, the
binary number system, also known as base 2 number system, has two digits 0 and 1. This makes it useful
for data representation in computers. The two digits of the binary number system correspond to the two
distinct states of the digital electronics. A binary digit is referred to as a bit.

We can associate the two digits of the binary number system with two states of electrical systems,
magnetic systems, and switches. The following table shows the conventional association. It is also
possible to exchange the association but it becomes confusing since it is the opposite of the convention
(what is agreed on).


            0              1
Electronic  No current     There is current
Magnetic    Demagnetized   Magnetized
Switch      Off            On

Data representation using the binary number system results in a large string of 0s and 1s. This makes the
represented data large and difficult to read. Writing such a binary string becomes tedious as well. For the
sake of writing the binary strings in a short hand form and make them readable, the octal and the
hexadecimal number systems are used.

Octal number system

Octal number system, also called base 8 number system, has 8 different symbols: 0, 1, 2, 3, 4, 5, 6, and
7. The octal number system is used to write binary numbers in short form. An octal number has about
one-third of the digits in its binary equivalent.

Hexadecimal number system

The hexadecimal number system, also called base 16 number system, has 16 different symbols 0, 1, 2, 3,
4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. The hexadecimal number system is usually referred as hex for short.
It is used to write binary numbers in short form. A hex number has about one-fourth of the digits in its
binary equivalent. Memory addresses and MAC addresses are usually written in hex.

3.1.1 Converting from one base to another

I. Conversion from Decimal to Base m

Step 1: Divide the given decimal number by m (the desired base). The result will have a quotient and a
remainder.
Step 2: Divide the quotient by m. Still you get a quotient and a remainder.
Step 3: Repeat step 2 until the quotient becomes 0. You should note that we are conducting integer
division. In integer division n/m, the quotient is 0 whenever n < m.
Step 4: Collect and arrange the remainders in such a way that the first remainder is the least significant
digit and the last remainder is the most significant digit (i.e., RₙRₙ₋₁ … R₂R₁).


Example: Convert the following decimal number 47 into binary, octal, and hexadecimal.
a. Conversion to binary
In order to convert the given decimal numbers into binary (base 2), they are divided by 2.
Quotient Remainder
47 ÷ 2 23 1
23 ÷ 2 11 1
11 ÷ 2 5 1
5÷2 2 1
2÷2 1 0
1÷2 0 1

Since the quotient becomes 0 at the last division, the division stops and we collect the remainders
starting from the last one. Hence the result is 101111₂. Note that, starting from the second
division, at each consecutive division the quotient of the previous division is used as the dividend.

b. Conversion to octal
Here the numbers are divided by 8 because the required base is octal (base 8).
Quotient Remainder
47 ÷ 8 5 7
5÷8 0 5

Therefore, 47 = 57₈

c. Conversion to hexadecimal
Since the conversion now is into hexadecimal (base 16) the given decimal numbers are divided by 16.
Quotient Remainder
47 ÷ 16 2 15
2 ÷ 16 0 2

Remember that the remainders are all in decimal, and when you write the result in the required base, each
remainder has to be converted into that base. The hexadecimal equivalent of the decimal 15 is F and
that of 2 is 2. For conversion of decimal numbers into binary and octal (or vice versa), there is no problem
of looking for the equivalent of each remainder; you need such a conversion only when the
remainder is a double-digit number. Therefore, 47 = 2F₁₆
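
The repeated-division procedure can be written compactly in Python. The function below is our own illustration and reproduces the three worked examples:

DIGITS = "0123456789ABCDEF"

def to_base(n, base):
    # Repeated integer division: each step yields one remainder, and
    # the remainders are collected last-to-first (Step 4).
    if n == 0:
        return "0"
    digits = []
    while n > 0:                     # stop when the quotient becomes 0
        n, remainder = divmod(n, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))

print(to_base(47, 2), to_base(47, 8), to_base(47, 16))
# -> 101111 57 2F, matching the three worked examples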

II. Conversion from Base m to Decimal

Step 1: Multiply each digit by its positional value.


Step 2: Calculate the sum of the products you get in step 1. The resulting sum you get is the decimal
equivalent for the given number in base m.


Example 1: Convert the binary number 110001 into decimal.

110001₂ = (1 × 2⁵) + (1 × 2⁴) + (0 × 2³) + (0 × 2²) + (0 × 2¹) + (1 × 2⁰)
        = (1 × 32) + (1 × 16) + (0 × 8) + (0 × 4) + (0 × 2) + (1 × 1)
        = 32 + 16 + 0 + 0 + 0 + 1
        = 49

Therefore, 110001₂ = 49

It is evident that the products for digits that are 0 can be skipped, since each such product is 0 and
contributes nothing to the final result. However, you should remember to skip the corresponding
positional value as well.

Example 2: Convert the octal number 22 into decimal.

22₈ = (2 × 8¹) + (2 × 8⁰)
    = (2 × 8) + (2 × 1)
    = 16 + 2
    = 18

Therefore, 22₈ = 18

Example 3: Convert the hexadecimal number D1 into decimal.

D1₁₆ = (13 × 16¹) + (1 × 16⁰); you should be aware that the calculations are in decimal, thus the hex
digit D must first be converted into its decimal equivalent (13).
     = (13 × 16) + (1 × 1)
     = 208 + 1
     = 209

Therefore, D1₁₆ = 209
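
The multiply-and-sum procedure in Python (our own illustration; int(digit, 16) is used simply to map hex digits such as D to their decimal values):

def to_decimal(digits, base):
    # Steps 1 and 2: multiply each digit by its positional value (a
    # power of the base) and sum the products.
    value = 0
    for position, digit in enumerate(reversed(digits)):
        value += int(digit, 16) * base ** position
    return value

print(to_decimal("110001", 2), to_decimal("22", 8), to_decimal("D1", 16))
# -> 49 18 209, matching Examples 1 through 3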

III. Conversion from Binary to Octal

It is possible to use decimal number system as an intermediate base to convert from any base to any
other base. However, for conversion from binary to octal or vice versa, there is a very simple method.

Step 1: Group the binary digits (bits) starting from the rightmost digit. Each group should contain 3 bits.
If the remaining bits at the leftmost position are fewer than 3, add 0s at the front.
Step 2: For each 3-bit binary string, find the corresponding octal number. Table 3-1 shows the
conversion equivalents.


Example: Convert the binary numbers 110011 and 1101111 to octal.

A. 110011

110 011
 6   3

The bits are grouped in threes, with the equivalent octal digit given below each group. Thus,
110011₂ = 63₈

B. 1101111

001 101 111
  1   5   7

Since we are left with a single bit at the leftmost position, two 0s are added at the front to create a
three-bit group. The result shows that 1101111₂ = 157₈.

IV. Conversion from Octal to Binary

Step 1: For each octal digit, find the equivalent three digit binary number.
Step 2: If there are leading 0s for the binary equivalent of the leftmost octal digit, remove them.

Example: Find the binary equivalent for the octal numbers 73 and 160.

A. 73
7 3
111 011

Since there are no leading 0s at the leftmost position (the bits for octal 7), there is no 0 to remove.
Therefore, 73₈ = 111011₂

B. 160
1 6 0
001 110 000

The binary equivalent for the leftmost octal digit 1 has two leading 0s. To get the final result, remove them
and concatenate the rest. Thus, 160₈ = 1110000₂

V. Conversion from Binary to Hexadecimal

One possible way to convert a binary number to hexadecimal, is first to convert the binary number to
decimal and then from decimal to hex. Nevertheless, the simple way to convert binary numbers to hex is
by grouping as used in conversion to octal. Here a single group has 4 bits.


Step 1: Starting from the rightmost bit, group the bits in 4. If the remaining bits at the leftmost position
are fewer than 4, add 0s at the front.
Step 2: For each 4-bit group, find the corresponding hexadecimal number. You can use Table 3-1 to find
conversion equivalents.

Example: Convert the binary numbers 1110110001 and 10011110 to hexadecimal.

A. 1110110001

0011 1011 0001
  3    B    1

Therefore, 1110110001₂ = 3B1₁₆

B. 10011110

1001 1110
  9    E

Therefore, 10011110₂ = 9E₁₆

VI. Conversion from Hexadecimal to Binary

Step 1: For each hexadecimal digit, find the equivalent four digit binary number.
Step 2: If there are leading 0s for the binary equivalent of the leftmost hexadecimal digit, remove them.

Example: Find the binary equivalents for the hexadecimal numbers 1C and 823.

A. 1C
1 C
0001 1100

After removing the leading 0s of the binary equivalent of the leftmost hexadecimal digit 1, the result
becomes 11100. Thus, 1C₁₆ = 11100₂

B. 823
8 2 3
1000 0010 0011

There are no leading 0s in the binary equivalent of the hexadecimal digit 8, so we simply concatenate the
binary digits to get the final result. Hence, 823₁₆ = 100000100011₂


VII. Conversion from Octal to Hexadecimal or Vice Versa

The decimal number system can be used as an intermediate conversion base. As shown in the above
sections, however, the binary number system is quite convenient for conversion to or from octal and
hexadecimal. To convert an octal number to a hexadecimal number, or a hexadecimal number to octal, the
binary number system is used as an intermediate base.

Step 1: Convert the given number into binary.


Step 2: Convert the binary number you got in step 1 into the required base.
Example 1: Convert the octal number 647 to hexadecimal.
Convert 647₈ to binary
6 4 7
110 100 111

Convert 110100111₂ to hexadecimal

0001 1010 0111
  1    A    7

Therefore, 647₈ = 1A7₁₆

Example 2: Find the octal equivalent for the hexadecimal number 3D5
Convert 3D5₁₆ to binary
  3    D    5
0011 1101 0101

Convert 1111010101₂ to octal

001 111 010 101
  1   7   2   5

Therefore, 3D5₁₆ = 1725₈
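
All of the grouping-based conversions of sections III through VII follow one pattern, sketched below in Python (our own illustration):

def group_convert(bits, group_size):
    # group_size = 3 gives octal, group_size = 4 gives hexadecimal.
    pad = (-len(bits)) % group_size          # 0s added at the front
    bits = "0" * pad + bits
    groups = (bits[i:i + group_size] for i in range(0, len(bits), group_size))
    return "".join("0123456789ABCDEF"[int(g, 2)] for g in groups)

print(group_convert("110100111", 4))     # -> 1A7 (647 octal, via binary)
print(group_convert("1111010101", 3))    # -> 1725 (3D5 hex, via binary)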

3.2 Representation of Integers


If the numbers we want to represent are only positive (unsigned) integers, the solution is straightforward:
simply represent the unsigned integer with its binary value. For example, 34 is represented as
00100010 in 8 bits. In this section, the discussion is on the representation of signed integers. Signed integers
can be represented in several alternative ways. These alternatives are used for various purposes
based on their convenience for the applications.

Computer storage has a limited capacity to hold data. The number of bits available for data
representation determines the range of integers we can represent. With 4 bits, it is possible to represent a
total of 16 integers. If the number of available bits increases to 5, we can represent 32 integers. In fact,
with every bit added, the number of possible integers we can represent is doubled. In general, the
number of integers we can represent with n bits is 2ⁿ. Signed integers include positive integers, negative
integers, and zero. The 2ⁿ bit patterns are partitioned among the negative integers, the positive integers,
and zero. For example, with 8 bits, it is possible to represent 256 different integers. Typically, 128 of them
are positive integers and zero while the other 128 are negative integers.

Signed integer representations discussed in this section are sign magnitude, 1’s complement, 2’s
complement, and excess-N. We assume the number of bits available for representation is 8 unless
explicitly specified.

3.2.1 Sign-Magnitude Representation

In mathematics, positive integers are indicated by a preceding + sign (although it is usually omitted) and
a preceding - sign identifies the integer as a negative number. In computers, there is no place for a + or a
- sign; there are only 0s and 1s. A similar way of representing + and - signs is to treat the most
significant bit as a sign bit, while the remaining bits are used to represent the magnitude of the integer. By
convention, a 0 in the sign bit indicates the integer is positive and a 1 indicates it is negative.

A problem of this representation method is that there are two representations for 0; a positive 0 and a
negative 0. An integer with all the magnitude bits and the sign bit set to 0 is a positive 0 while an integer
with all the magnitude bits set to 0 and a 1 on its sign bit is a negative 0. This type of ambiguous
representation is undesirable and, as a solution, the negative zero can be ignored. The biggest problem of
this representation method is, however, its inconvenience in binary arithmetic. Early computers, for
instance the IBM 7090, used this representation.

As an example, the sign-magnitude representations of 79 and -79 in 8 bits are 01001111 and 11001111
respectively. The only difference is in the sign (first) bit; for 79 it is 0 while for -79 it is 1. Figure 3-1
shows the representation scheme with 8 bits. In sign-magnitude, the lower half of the range of values
represents the positive integers starting at positive 0, while the upper half represents negative integers
starting at negative 0.
(In 8 bits, the patterns 00000000 through 01111111 represent +0 through +127, while 10000000 through
11111111 represent -0 through -127.)

Figure 3-1: Sign-magnitude representation number line; (a) the unsigned decimal value of the binary
number, (b) the binary numbers, (c) the actual value the binary numbers represent in sign-
magnitude representation


3.2.2 One’s Complement Integer Representation

Every number system has two complement systems. For a given base n the complements are n’s
complement and (n-1)’s complement. Thus, in decimal numbers system (base 10), the complement
systems are 10’s complement and 9’s complement. Similarly in binary number system, the complements
are 2's complement and 1's complement. Using complements in the binary number system makes
subtraction and logical negation very simple, and calculating the complement of a binary integer is
trivial.

The one's complement of a binary integer is found by inverting all 0s to 1s and all 1s to 0s. This makes
the complementing operation simple at the hardware level. In one's complement integer representation, the
negative of an integer is represented by its complement. For example, the one’s complement
representation of 16 and -16 in 8 bits are 00010000 and 11101111 respectively.

As with sign-magnitude representation, there are two representations for 0. A one’s complement
representation with all 0s is a positive 0 while one with all 1s is a negative 0. Still representations for
positive integers begin with 0 while those of negative integers start with 1. The arrangement of numbers
on the number line is in such a way that it starts with positive 0, puts positive numbers starting with 1,
arranges negative numbers beginning with the smallest one (with the highest magnitude) and ends with
negative 0. Figure 3-2 shows the arrangement of integers with one’s complement representation on the
number line.
(In 8 bits, the patterns 00000000 through 01111111 represent 0 through 127, while 10000000 through
11111111 represent -127 through -0.)

Figure 3-2: One’s complement number line; (a) the unsigned decimal value of the binary number, (b)
the binary numbers, (c) the actual value the binary numbers represent in one’s complement
representation

3.2.3 Two’s Complement Integer Representation

The two’s complement of an integer is found by adding 1 to its one’s complement. As a reminder, in
binary arithmetic, 0+1 = 1 and 1+1 = 0 with a carry of 1 to the next higher significant bit. A shortcut
method to find the two’s complement of a number is to keep all the bits up to and including the first 1
from the right and invert all the others. Two’s complement representation of 19 and -19 in 8 bits are
00010011 and 11101101 respectively.
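
The three signed representations discussed so far can be computed with simple bit manipulation. The Python sketch below (our own naming; it assumes x fits in the given width) reproduces the examples in the text:

def signed_representations(x, bits=8):
    # Assumes -(2**(bits-1)) < x < 2**(bits-1); for x >= 0 all three
    # systems use the plain binary value.
    if x >= 0:
        pattern = format(x, "0{}b".format(bits))
        return pattern, pattern, pattern
    magnitude = -x
    sign_mag = (1 << (bits - 1)) | magnitude
    ones = (1 << bits) - 1 - magnitude        # invert all bits of +x
    twos = (1 << bits) - magnitude            # one's complement + 1
    return tuple(format(v, "0{}b".format(bits))
                 for v in (sign_mag, ones, twos))

print(signed_representations(19))    # ('00010011', '00010011', '00010011')
print(signed_representations(-19))   # ('10010011', '11101100', '11101101')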


There is only one representation for 0 where all the bits are set to 0. The consequence is that the number
of negative integers is one more than their positive counterparts. The arrangement of two’s complement
integers on the number line is similar to that of one’s complement except that the last value is -1 rather
than -0. Figure 3-3 shows this.
(In 8 bits, the patterns 00000000 through 01111111 represent 0 through 127, while 10000000 through
11111111 represent -128 through -1.)

Figure 3-3: Two’s complement number line; (a) the unsigned decimal value of the binary number, (b)
the binary numbers, (c) the actual value the binary numbers represent in two’s complement
representation

3.2.4 Excess-N Integer Representation

Another name for excess-N representation is biased representation. Excess-N uses a pre-specified
number N as a biasing value; that is, N is used to shift the position of the representation of 0 so that
bit patterns whose unsigned values are less than the bias represent negative integers. An integer x is
represented by the unsigned number x + N. Thus 0 is represented by N, and -N is represented by a bit
pattern of all 0s. The value of N is usually chosen as 2ᵏ⁻¹ - 1, where k is the number of bits used for
representation.

As an example, let us represent 34 and -34 in excess-127 in 8 bits (note that the value 127 is chosen
because 2⁸⁻¹ - 1 = 127). 34 is represented by the binary equivalent of 161, which is 10100001, because
34 + 127 = 161. By a similar argument, -34 is represented by 01011101, which is the binary equivalent of
93, and 93 = -34 + 127. In general, in excess-127 representation, all bit patterns whose unsigned values are
less than 127 represent negative integers, 127 represents 0, and all values greater than 127 represent
positive integers. On the number line of excess-N, the lowest pattern represents -N and the largest
represents 2ᵏ - 1 - N (128 for excess-127). Figure 3-4 shows the number line for the excess-127 representation.
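
A minimal Python sketch of the encoding (our own illustration, assuming an 8-bit field and N = 127):

def to_excess_n(x, n=127, bits=8):
    # Biased representation: store the unsigned value x + N.
    return format(x + n, "0{}b".format(bits))

print(to_excess_n(34))     # 34 + 127 = 161 -> 10100001
print(to_excess_n(-34))    # -34 + 127 = 93 -> 01011101
print(to_excess_n(0))      # the bias itself represents 0 -> 01111111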

Excess-N integer representation is mainly used in the representation of floating point numbers. The IEEE
floating-point standard defines the exponent field of a single-precision number as an 8-bit excess-127 field
and that of a double-precision number as an 11-bit excess-1023 field. Floating point number representation
is discussed next.

(In 8 bits, a pattern whose unsigned value is v represents v - 127: 00000000 represents -127, 01111111
represents 0, and 11111111 represents 128.)
Figure 3-4: Excess-127 number line; (a) the unsigned decimal value of the binary number, (b) the
binary numbers, (c) the actual value the binary numbers represent in excess-127
representation

3.3 Floating Point Numbers


In common real number expressions the location of the radix point is indicated by placing a dot (or, in
some countries, a comma). For the decimal number system, the radix point is commonly known as the
decimal point, and in the binary number system it is named the binary point. The term radix point is used
to refer to this point irrespective of the number system.

In mathematics numbers are written in two common ways: either placing the radix point at a fixed
location or by locating the point at any location in the number and providing an extra data which
indicates the actual location of the radix point. With integers the radix point is implicitly known to be at
the right end of the number. This is an example of representation with the radix point located at a fixed
location in the number. In science very large and very small numbers are quite common. Writing them
with the radix point at its actual position is inconvenient. Therefore the common and convenient way of
writing such numbers is the scientific notation. In scientific notation, a number is written with the radix
point put after the first nonzero digit and the actual position of the radix point is indicated by the
exponent of the number. The scientific notation allows the radix point to “float” anywhere in the
number. This makes the representation of very small and very large numbers effective and efficient over a
large range of magnitudes. Manipulation of a fixed point representation is less costly than that of floating
point numbers, but the range of numbers that can be represented in fixed point is quite narrow.

In computers, representation of numbers similar to the scientific notation is available and is known as
floating point number. Unlike the scientific notation, there is no radix point character in floating point
numbers. For the representation of integers, the only required information is the magnitude and the sign
of the number. To represent floating point numbers or in scientific notation we need the following
information:

 the magnitude of the number
 the sign of the number
 the magnitude of the exponent
 the sign of the exponent
 the base of the number


Of these required pieces of information, only the first four are necessary to represent floating point
numbers in computers. The base of the number is not necessary for representation because it is always 2
as the number system of computers is binary.

Among the variety of floating number representations in computers, the IEEE 754 standard is the most
widely used. It is commonly referred as the IEEE floating point. Two formats of IEEE floating point are:

 Single precision: 32 bits wide, with a 23-bit significand, one sign bit, and 8 bits of exponent. The
accuracy of a binary number with a 23-bit significand is equivalent to the accuracy of 7 decimal digits.
 Double precision: 64 bits wide, of which 52 bits are for the significand, one bit for the sign of the
number, and 11 bits for the exponent. Double precision numbers are accurate to 16 decimal digits.

Modern high level languages have numerical data types for both single and double precision IEEE
floating point format. For example the float data type in C and C++ is for single precision representation
and the data type double is for the double precision representation. Figure 3-5 shows the layout of single
and double precision formats of the IEEE floating point representation. The two formats are exactly the
same except for the number of bits they have. Notice the layout of bits for the sign, exponent and
significand.
Figure 3-5: Single and double precision IEEE floating point formats
1 bit 8 bits 23 bits

Sign bit Exponent bits Significand bits

(a) Single precision

1 bit 11 bits 52 bits

Sign bit Exponent bits Significand bits

(b) Double precision

3.3.1 The IEEE Floating Point

It is already mentioned that to represent a floating point number, we need to have representation for the
magnitude of the number (significand), the sign of the number, the magnitude of the exponent, and the
sign of the exponent. Of these four items, only three have place in the IEEE floating point. The
magnitude of the exponent and its sign has to be combined together and placed within the exponent bits.
The magnitude of the number has its own place within the representation. The set of bits for the
magnitude is commonly referred as the significand. The word mantissa is also used as a synonym to the
word significand. The sign of the number (the sign of the significand) has a separate bit. A 0 on the sign
bit shows the number is positive while a 1 is used for negative numbers.

In line with the above discussion, sign-magnitude representation is used for the significand, while for the
exponent a representation system without a separate sign bit is used. One's complement, two's
complement, and excess-N represent signed numbers without a separate sign bit. Of these three
alternatives, excess-N is used in the IEEE floating point representation. The single precision format
uses excess-127 and the double precision format uses excess-1023 representation for the exponent.

The IEEE 754 standard requires the significand to be in normal form. A number is in normal form when
the first digit of the number is non-zero (to be exact, the first digit should be 1, since the only non-zero
digit in the binary number system is 1.) This implies that we need a special representation for 0. As the
first digit of the significand is always 1, we need not represent it explicitly. This gives us an extra bit of
data for the significand. Thus with the 23 bits for the significand in single precision, the actual number
of bits of the significand’s magnitude is 24 bits and that of the double precision is 53 bits. In other
words, there is an implied 1 at the beginning of every floating point number.

The range of numbers that can be represented with single precision is roughly from 10⁻³⁸ to 10³⁸, as the
range of the exponent of single precision is [-127, 127]. 2⁻¹²⁷ is roughly 10⁻³⁸ since -127 × log₁₀2 ≈ -38,
and 2¹²⁷ is approximately 10³⁸. By a similar argument, the range of values of double precision is roughly
from 10⁻³⁰⁸ to 10³⁰⁸. However, this range does not include all the possible values of numbers because there
are some numbers that cannot be represented exactly. A good example is the real number 2/3.
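
The three fields of a single precision number can be inspected with Python's standard struct module. This sketch (our own illustration) packs a value as a 32-bit float and slices out the sign, exponent, and significand bits:

import struct

def float32_fields(x):
    # Pack x as a big-endian 32-bit float, then slice the bit fields.
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31
    exponent = ((word >> 23) & 0xFF) - 127         # remove the excess-127 bias
    significand = format(word & 0x7FFFFF, "023b")  # implied leading 1 omitted
    return sign, exponent, significand

print(float32_fields(-6.5))    # (1, 2, '10100000000000000000000'): -1.101 x 2^2
print(float32_fields(2 / 3))   # inexact: the stored significand is rounded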

3.4 Other Binary Codes


3.4.1 Binary Coded Decimal (BCD)

For systems with much I/O and little computation on numbers, it is efficient to use a mechanism in
which each digit of a number is represented by its binary equivalent. The BCD (Binary Coded
Decimal) representation, also called packed decimal, is based on this idea. In order to have
representations for the ten digits of the decimal number system, we need a four-bit string. Thus 0 = 0000,
1 = 0001, 2 = 0010, …, and 9 = 1001. In BCD, multiples of 8 bits, in which the bits are grouped in 4, are
used to represent decimal numbers. Thus the decimal number 461 is represented as 0000 0100 0110
0001.
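
A Python sketch of BCD encoding (our own illustration; the padding mirrors the grouping into multiples of 8 bits used above):

def to_bcd(decimal_string):
    # Each decimal digit becomes its own 4-bit group; a leading 0 digit
    # pads the result to a whole number of 8-bit groups.
    if len(decimal_string) % 2:
        decimal_string = "0" + decimal_string
    return " ".join(format(int(d), "04b") for d in decimal_string)

print(to_bcd("461"))    # -> 0000 0100 0110 0001, as in the text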

Although BCD avoids complex conversions of binary numbers to decimal and vice versa, it is inefficient
in memory space usage. Of the 16 unique combinations of a 4-bit string, only 10 are used to represent the
digits. Two more bit strings can be used to represent positive and negative signs (although the bit strings
for 0 and 9 are sometimes used as positive and negative signs respectively). The remaining bit strings are
wasted. Performing arithmetic in BCD is similar to that of other representations, but the required circuitry
is much more complex. Sign-magnitude as well as complement systems can be used in BCD; the
complement system is either 9's complement or 10's complement. Since complements have the advantage
of simplicity in arithmetic, they are preferred.


3.4.2 Characters

Text documents contain strings of characters. Characters refer to letters of the alphabet, the ten digits (0
through 9), punctuation marks, characters that are used to format the layout of text on pages such as the
newline, space, and tab characters, and other characters that are useful for communication. The most
widely used character code is the International Reference Alphabet (IRA). The American version of IRA
is called American Standard Code for Information Interchange (ASCII). Even though IRA used 8 bits,
each character is represented by 7 bits; hence a total of 128 characters are represented. The eighth bit is
used for parity (error detection). The ASCII character set is given in the Appendix.

Another character encoding system, EBCDIC (Extended Binary Coded Decimal Interchange Code), is
used on IBM mainframes. It uses 8 bits per character (and a ninth parity bit) and thus represents 256
characters. As with IRA, EBCDIC is compatible with BCD. In the case of EBCDIC, the codes
11110000 through 11111001 represent the digits 0 through 9.

ASCII is a standard for use in the United States, and many countries have adapted their own versions of it.
There are also 8-bit versions of ASCII which provide an additional 128 places for more characters,
especially for languages based on the Latin alphabet. To allow encoding of the characters of all the
languages in the world, a character set known as Unicode was devised. Unicode has encoding forms
known as UTF-8, UTF-16, and UTF-32. UTF-8 encodes the first 128 characters exactly as ASCII does.
UTF-16 and UTF-32 use 16-bit and 32-bit units per character respectively and thus accommodate many
more characters. For backward compatibility with ASCII, the first 128 characters of UTF-16 and UTF-32
are the same as those of ASCII.
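
These encodings can be observed directly in Python, whose ord built-in returns a character's code point and whose encode method produces the bytes of a chosen encoding form:

for ch in "Mm":
    print(ch, ord(ch), format(ord(ch), "07b"))   # 7-bit character codes

print("M".encode("utf-8"))       # b'M'             -- one byte, as in ASCII
print("M".encode("utf-16-be"))   # b'\x00M'         -- 16 bits per character
print("M".encode("utf-32-be"))   # b'\x00\x00\x00M' -- 32 bits per character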

3.5 Review Questions


1. Convert each of the following decimal numbers to binary, octal, and hexadecimal.
a) 25 b) 93 c) 266 d) 4

2. Convert each of the following binary numbers to decimal, octal, and hexadecimal.
a) 111000 b) 1010101 c) 1110111

3. Convert each of the following octal numbers to binary, decimal, and hexadecimal.
a) 567 b) 23 c) 400

4. Convert each of the following hexadecimal numbers to binary, octal, and decimal.
a) BAC b) 24 c) 7 d) D0

5. Convert the following decimal numbers into binary and write the sign-magnitude, one's complement,
and two's complement representations. Use 16 bits for the representation.
a) 78 b) -19 c) -134 d) 62

6. Write the single precision and double precision IEEE floating point representations for the following
binary real numbers.
a) 100001000111.11110001
b) 0.00000001111
c) - 0.0010010001100000001
d) - 10000100000001111.0011
7. Write the BCD representation of the following decimal numbers.
a) 346 b) 2879

8. Write the ASCII code for the following words. Notice the differences in capitalization of the words.
▪ MARKOS
▪ Markos
▪ markos


Chapter 4
Basic Computer Organization and Design
Objective:
After completing this chapter, you will be able to understand:
 Instruction codes
 Computer registers and their application
 Computer instructions
 Timing and control
 The instruction cycle
 Memory-reference instructions
 Input and output interrupt
 Design of basic computer

4.1 Instruction Codes


The internal organization of a digital system is defined by the sequence of micro-operations it performs
on data stored in its registers. The general purpose digital computer is capable of executing various
micro-operations and, in addition, can be instructed as to what specific sequence of operations it must
perform. The user of the computer can control the process by means of a program. A program is a set of
instructions that specify the operations, operands, and the sequence by which processing has to occur.

Depending on the complexity and the functions that the computer performs, a computer may be
classified as:

 A Special Purpose Computer, constructed to perform one or very few functions.
 A General Purpose Computer, constructed to perform many different functions.

A computer instruction is a binary code that specifies a sequence of micro-operations for the
computer.

Instruction codes together with the data are stored in memory. The computer reads each instruction from
memory and places it in a control register. The control then interprets the binary code of the instruction
and proceeds to execute it by issuing a sequence of micro-operations.

An Instruction Code is a group of bits that instructs the computer to perform a specific operation. It is
usually divided into parts, each having its own particular interpretation. The most basic part of an
instruction code is its operation part.

The operation code of an instruction is a group of bits that defines such operations as Add, Subtract,
Multiply, Shift and Complement.


The operation part of an instruction code specifies the operation to be performed. This operation must be
performed on some data stored in processor registers or in memory. An instruction must specify not
only the operation but also the registers or the memory words where the operands are to be found.

Note: The number of bits required for the operation code of an instruction depends on the total number
of operations available in the computer.

4.1.1 Stored Program Organization


The simplest way to organize a computer is to have one processor register and an instruction code
format with two parts. The first specifies the operation to be performed and the second specifies an
address. The memory address tells the control where to find an operand in memory. This operand is read
from memory and used as the data to be operated on together with the data stored in the processor register.
Instructions are stored in one section of memory and data in another. The operation is performed with
memory operands and the content of AC. Figure 4-1 below depicts this type of organization.

Figure 4-1:Stored program organization

Note: Computers that have a single processor register usually assign to it the name accumulator (AC).

Example: For a memory unit with 4096 words, we need 12 bits to specify an address (since 2¹² = 4096).
If we store each instruction in one 16-bit memory word, we have 4 bits available for the operation code
(opcode), which can specify one out of 16 possible operations, and 12 bits to specify the address of an
operand. The control reads a 16-bit instruction from the program portion of memory. It uses the 12-bit
address part of the instruction to read a 16-bit operand from the data portion of memory. It then executes
the operation specified by the operation code.


4.1.2 Direct and Indirect Addressing Modes

It is sometimes convenient to use the address bits of an instruction code not as an address but as an
actual operand. When the second part of an instruction code specifies an operand, the instruction is said
to have an immediate operand. When the second part specifies the address of an operand, the instruction
is said to have direct address. This is in contrast to a third possibility called indirect address, where
the bits in the second part of the instruction designate an address of memory word in which the address
of the operand is found.
Figure 4-2 illustrates the direct and indirect addressing modes. Note that one bit of the instruction code
is used to distinguish between a direct and an indirect address.

15      14-12       11-0
 I      Opcode     Address

(a) Instruction format

(a) Direct address: the instruction at memory location 22 is "0 ADD 457"; location 457 holds the
operand 1350, which is added to AC.

(b) Indirect address: the instruction at memory location 35 is "1 ADD 300"; location 300 holds 1350,
and location 1350 holds the operand that is added to AC.

Figure 4-2: Direct and indirect addressing modes
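
A Python sketch of the two effective-address calculations of Figure 4-2 (our own illustration; the memory contents, including the final operand value 25 at location 1350, are invented for the example):

memory = {457: 1350, 300: 1350, 1350: 25}   # toy memory contents

def fetch_operand(address_field, indirect):
    if indirect:
        effective = memory[address_field]   # the word holds the operand's address
    else:
        effective = address_field           # the field is the operand's address
    return memory[effective]

print(fetch_operand(457, indirect=False))   # direct: operand 1350 at 457
print(fetch_operand(300, indirect=True))    # indirect: 300 -> 1350 -> operand 25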

4.2 Computer Registers


Computer instructions are usually stored in consecutive memory locations and executed sequentially one
at a time. The control reads an instruction from specific address in memory and executes it. It then
continues by reading the next instruction in sequence and executes it and so on. This type of instruction
sequencing needs a counter to calculate the address of the next instruction after execution of the current
instruction is completed. It is also necessary to provide a register in a control unit for storing the
instruction code after it is read from the memory. The computer needs processor registers for
manipulating data and a register for holding a memory address. These requirements dictate the register
configuration shown in Figure 4-3. The registers are also listed in Table 4-1.

The memory unit has a capacity of 4096 words and each word contains 16 bits. 12 bits of an instruction
word are needed to specify the address of the operand. This leaves three bits for the operation part of the
instruction and a bit to specify a direct or indirect address. The data register (DR) holds the operand read
from memory. The accumulator (AC) register is a general purpose processing register. The instructions
read from memory are placed in instruction register (IR). The temporary register (TR) is used for
holding temporary data during the processing.

Table 4-1: List of registers for the basic computer


Register Symbol   Number of Bits   Register Name          Function
DR                16               Data Register          Holds memory operand
AR                12               Address Register       Holds address for memory
AC                16               Accumulator            Processor register
IR                16               Instruction Register   Holds instruction code
PC                12               Program Counter        Holds address of instruction
TR                16               Temporary Register     Holds temporary data
INPR              8                Input Register         Holds input character
OUTR              8                Output Register        Holds output character

The memory address register (AR) has 12 bits since this is the width of a memory address. The program
counter (PC) also has 12 bits and it holds the address of the next instruction to read from memory after
the current instruction is executed. The PC goes through a counting sequence and causes the computer to
read sequential instructions previously stored in memory. Instruction words are read and executed in
sequence unless a branch instruction is encountered. A branch instruction calls for a transfer to a
nonconsecutive instruction in the program. The address part of a branch instruction is transferred to PC
to become the address of the next instruction. To read an instruction, the content of PC is taken as the
address for memory and memory read cycle is initiated. PC is then incremented by one, so it holds the
address of the next instruction in sequence.


Figure 4-3: Basic Computer Registers and Memory


(The figure shows PC and AR as 12-bit registers; IR, TR, DR, and AC as 16-bit registers; OUTR and
INPR as 8-bit registers; and a memory of 4096 words with 16 bits per word.)

Common Bus Systems

The basic computer has eight registers, a memory unit, and a control unit. Paths must be provided to
transfer information from one register to another and between memory and registers. Path between each
and every register would cause too many wires running around the registers. A better solution is the use
of a common bus. A bus is a set of common lines, one for each bit of a register, through which binary
information is transferred.

Control signals determine which register is connected to the bus at a given time. The connection of the
registers and memory of the basic computer to a common bus system is illustrated in Figure 4-4.

In Figure 4-4, the output of seven registers and memory are connected to a common bus. The specific
output that is selected for the bus lines at any given time is determined from the binary value of the
selection variables S2, S1, and S0. The number along each output shows the decimal equivalent of the
required binary selection. For example, the number along the output of DR is 3. The 16-bit output of DR
is placed on the bus lines when S2S1S0 = 011, since this is the binary value of the decimal number 3. The
lines from the common bus are connected to the inputs of each register and the data inputs of the memory.
The particular register whose LD (load) input is enabled receives the data from the bus during the next
clock pulse transition. The memory receives the content of the bus when its write input is activated. The
memory places its 16-bit output onto the bus when the read input is activated and S2S1S0 = 111.
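
The selection logic can be sketched in Python as a simple lookup keyed by the value of S2S1S0, following the numbering of Figure 4-4 (our own illustration):

# Bus source selected by the decimal value of S2 S1 S0 (Figure 4-4);
# selection 0 leaves the bus undriven in this model.
BUS_SOURCES = {1: "AR", 2: "PC", 3: "DR", 4: "AC", 5: "IR", 6: "TR",
               7: "Memory"}

def bus_source(s2, s1, s0):
    return BUS_SOURCES.get((s2 << 2) | (s1 << 1) | s0, "none")

print(bus_source(0, 1, 1))    # S2S1S0 = 011 -> DR drives the bus
print(bus_source(1, 1, 1))    # S2S1S0 = 111 -> the memory output drives it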


(The figure shows the memory unit (4096 × 16, bus selection 7) and the registers AR (1), PC (2), DR (3),
AC (4), IR (5), TR (6), INPR, and OUTR connected to a 16-bit common bus. The select inputs S2, S1,
and S0 determine which source drives the bus; the registers have LD (load), INR (increment), and CLR
(clear) control inputs, and an adder and logic circuit feeds AC and the E bit.)

Figure 4-4:Basic computers connected to a common bus


4.3 Computer Instructions


The basic computer has three instruction code formats, as shown in Figure 4-5 below. Each format has
16 bits. The operation code (opcode) part of the instruction contains three bits, and the meaning of the
remaining bits depends on the operation code encountered. A memory-reference instruction uses 12 bits to
specify an address and one bit to specify the addressing mode I (I = 0 for direct address and I = 1 for
indirect address). The register-reference instructions are recognized by the operation code 111 with a 0 in
the leftmost bit (bit 15) of the instruction. A register-reference instruction specifies an operation on, or a
test of, the AC register. Input-output instructions do not need to refer to memory and are recognized by
the operation code 111 with a 1 in the leftmost bit of the instruction. The remaining 12 bits are used to
specify the type of input-output operation or test performed.
Figure 4-5: Instruction formats

15    14-12     11-0
I     Opcode    Address               (Opcode = 000 through 110)
(a) Memory-reference instruction

15-12           11-0
0 1 1 1         Register operation    (Opcode = 111, I = 0)
(b) Register-reference instruction

15-12           11-0
1 1 1 1         I/O operation         (Opcode = 111, I = 1)
(c) Input-output instruction
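The bit layout above can be checked mechanically. The following Python sketch (our own illustration; the helper name classify is hypothetical) separates a 16-bit instruction word into the three formats of Figure 4-5.

# Sketch: classifying a basic-computer instruction word (bit 15 = I,
# bits 14-12 = opcode, bits 11-0 = address), as described in the text.
def classify(word: int) -> str:
    i = (word >> 15) & 1              # addressing-mode bit I
    opcode = (word >> 12) & 0b111     # bits 14-12
    if opcode != 0b111:
        return f"memory-reference (I={i}, opcode={opcode:03b}, address={word & 0xFFF:03X})"
    return "register-reference" if i == 0 else "input-output"

assert classify(0x7800).startswith("register")   # CLA (see Table 4-2)
assert classify(0xF800).startswith("input")      # INP
assert classify(0x9123).startswith("memory")     # ADD with I = 1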

The instructions for the basic computer are listed in Table 4-2. The symbol designation is a three-letter
word and represents an abbreviation intended for programmers.

Instruction Set Completeness

Before investigating the operations performed by the instructions, let us discuss the types of instructions
that must be included in a computer. A computer should have a set of instructions so that the user can
construct machine language programs to evaluate any function that is known to be computable. The set of
instructions is said to be complete if the computer includes a sufficient number of instructions in each
of the following categories:

1. Arithmetic, logical, and shift instructions
2. Instructions for moving information to and from memory and processor registers
3. Program control instructions together with instructions that check status conditions
4. Input and output instructions

Table 4-2: Basic computer instructions

          Hexadecimal Code
Symbol    I = 0    I = 1    Description
AND 0xxx 8xxx AND memory word to AC
ADD 1xxx 9xxx Add memory word to AC
LDA 2xxx Axxx Load memory word to AC
STA 3xxx Bxxx Store content of AC in memory
BUN 4xxx Cxxx Branch unconditionally
BSA 5xxx Dxxx Branch and save return address
ISZ 6xxx Exxx Increment and skip if zero
CLA 7800 Clear AC
CLE 7400 Clear E
CMA 7200 Complement AC
CME 7100 Complement E
CIR 7080 Circulate right AC and E
CIL 7040 Circulate left AC and E
INC 7020 Increment AC
SPA 7010 Skip next instruction if AC positive
SNA 7008 Skip next instruction if AC negative
SZA 7004 Skip next instruction if AC zero
SZE 7002 Skip next instruction if E is 0
HLT 7001 Halt computer
INP F800 Input character to AC
OUT F400 Output character from AC
SKI F200 Skip on input flag
SKO F100 Skip on output flag
ION F080 Interrupt on
IOF F040 Interrupt off


Arithmetic, logical, and shift instructions provide computational capabilities for processing the type of
data that the user may wish to employ. The bulk of the binary information in a digital computer is stored
in memory, but all computations are done in processor registers. Therefore, the user must have the
capability of moving information between these two units.

Remark: Although the set of instructions for the basic computer is complete, it is not efficient, because
frequently used operations are not performed rapidly. An efficient set of instructions would include such
instructions as subtraction, OR, and XOR.

Note: These operations must be programmed in the basic computer.

4.4. Timing and Control


In order to control the steps of the instruction cycle, it is necessary to introduce a counter whose output
is used as input to the control logic. We will utilize a sequence counter register (SC). This is a register
that holds a count value, can be reset (cleared) to zero, and can be incremented (or decremented) by one.

[Figure: the sequence counter SC with its INC and CLR control inputs.]

Each instruction requires a specified number of time steps to complete its sequence of micro-operations.
Each step of the sequence is marked by a count value in SC. For an L-bit counter, SC outputs a bit string
whose value ranges from 0 to 2^L - 1 (e.g., for L = 3, from 0 to 7). We need a way of converting this
bit-string value to single-bit outputs labeled T0, T1, T2, T3, and so on, up to Tx (where x = 2^L - 1). A
decoder serves this purpose, recalling that the output of a decoder is 1 on exactly one line (the rest
are 0s).

[Figure: SC feeding an L-to-2^L decoder whose outputs are the one-hot timing signals T0, T1, T2, ..., Tx.]
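As an illustrative sketch (not a hardware description; the function name is ours), the decoding of SC into one-hot timing signals can be modeled in Python as follows.

# Sketch: a sequence counter SC feeding an L-to-2^L decoder. Exactly one
# timing signal T0..T(2^L - 1) is 1 at any time.
L = 3                                # a 3-bit SC gives timing signals T0..T7

def timing_signals(sc: int) -> list:
    """Decode the SC value into 2^L one-hot timing outputs."""
    return [1 if t == sc else 0 for t in range(2 ** L)]

sc = 0
for _ in range(4):                   # four clock pulses: T0, T1, T2, T3
    print("SC =", sc, "->", timing_signals(sc))
    sc = (sc + 1) % (2 ** L)         # the INC input; CLR would reset sc to 0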

The timing and control unit is the component that determines what the ALU should do at a given instant.
There are two kinds of control organization:

1. Hardwired control


2. Microprogrammed control

Hardwired control: The control logic is implemented with digital circuits (decoders, flip-flops, etc.). It
has the advantage that it can be optimized to produce a fast mode of operation, but it requires changes in
the wiring among the various components if the design has to be modified or changed.

Microprogrammed control: Control information is stored in a control memory. Required modifications can be
done by updating the micro-program in control memory. Suppose we need to implement the following:

XY: R1 ← R2
X′Y: R1 ← R1 + 1
XY′: R2 ← 0

X and Y are control variables. The load, clear, and increment lines of the registers are a function of these
variables:

LD(R1) = XY (can also be further reduced), INR(R1) = X′Y, CLR(R2) = XY′, bus-select = XY
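To make the Boolean equations concrete, here is a minimal Python sketch (our own; NOT of a variable v is written as 1 - v) that evaluates the three control signals from the control variables X and Y.

# Sketch: hardwired control for the three register transfers above.
def control_signals(x: int, y: int) -> dict:
    return {
        "LD(R1)":  x & y,            # XY:  R1 <- R2
        "INR(R1)": (1 - x) & y,      # X'Y: R1 <- R1 + 1
        "CLR(R2)": x & (1 - y),      # XY': R2 <- 0
    }

print(control_signals(1, 1))  # {'LD(R1)': 1, 'INR(R1)': 0, 'CLR(R2)': 0}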

Figure 4-6: Timing and control

Similarly, for the basic computer:

a. The control inputs of the registers (LD, INR, CLR) must be known


b. The bus system must have control inputs to determine which component to select
c. The current instruction must be decoded to determine what is to be done.

4.5. Instruction Cycle


A program residing in the memory unit of the computer consists of a sequence of instructions. The
program is executed by going through a cycle for each instruction; the so-called fetch-decode-execute
cycle. Each instruction cycle in turn is subdivided into a sequence of phases. In the basic computer, each
instruction cycle consists of the following phases:

1. Fetch an instruction from memory


2. Decode the instruction
3. Read the effective address from memory if the instruction has an indirect address
4. Execute the instruction

Upon the completion of step 4, the control goes back to step 1 to fetch, decode and execute the next
instruction. This process continues indefinitely unless a HALT instruction is encountered.

I. Fetch the instruction

Initially, the program counter PC is loaded with the address of the first instruction in the program and
SC is cleared to 0, providing a decoded timing signal T0. After each clock pulse, SC is incremented by
one, so that the timing signals go through a sequence T0, T1, T2 and so on. Since the address lines of the
memory unit are hardwired to AR, address of the instruction to be fetched must first be placed in AR.
Thus the first register transfer in an instruction cycle should be transferring the content of PC (address of
the instruction to be brought in for execution) to AR
T0: AR ← PC

Bus selects PC, LD(AR) = 1. Next, the current instruction is brought from memory into IR and PC is
incremented:

T1: IR ← M[AR], PC ← PC + 1 (bus selects memory, the memory read input is set, LD(IR) = 1, INR(PC) = 1)

II. Decode the instruction

T2: D0, ... , D7 ← Decode IR(12-14), AR ←IR(0-11),I ←IR(15)

Therefore, the micro-operations for the fetch and decode phases can be specified by the following register
transfer statements:
T0: AR ← PC


T1: IR ← M[AR], PC ← PC + 1
T2: D0, …, D7 ← Decode IR(12-14), AR ← IR(0-11), I ← IR(15)
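The three statements can be read as ordinary assignments. The following Python sketch (an illustration under our own encoding: registers in a dictionary, memory as a list) mirrors the fetch and decode phases.

# Sketch: the fetch and decode phases of the basic computer.
def fetch_decode(regs: dict, memory: list) -> None:
    regs["AR"] = regs["PC"]                    # T0: AR <- PC
    regs["IR"] = memory[regs["AR"]]            # T1: IR <- M[AR]
    regs["PC"] = (regs["PC"] + 1) & 0xFFF      #     PC <- PC + 1
    ir = regs["IR"]                            # T2: decode IR
    regs["I"]  = (ir >> 15) & 1                #     I <- IR(15)
    regs["D"]  = (ir >> 12) & 0b111            #     opcode bits IR(12-14)
    regs["AR"] = ir & 0xFFF                    #     AR <- IR(0-11)

regs, memory = {"PC": 0x010}, [0] * 4096
memory[0x010] = 0x1123                         # ADD, direct, address 123
fetch_decode(regs, memory)
print(hex(regs["AR"]), regs["D"], regs["I"])   # 0x123 1 0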

4.6. Memory-Reference Instructions


In order to specify the micro-operations needed for the execution of each instruction, it is necessary that
the function they are intended to perform be defined precisely. Table 4-3 lists the seven memory-reference
instructions. The decoded output Di (for i = 0, 1, 2, 3, 4, 5, and 6) from the operation decoder that
belongs to each instruction is included in the table. The effective address of the instruction is in the
address register AR and was placed there during timing signal T2 when I = 0, or during timing signal T3
when I = 1. The execution of a memory-reference instruction starts with timing signal T4. The symbolic
description of each instruction is specified in the table in terms of register transfer notation. The actual
execution of the instruction in the bus system requires a sequence of micro-operations, because data
stored in memory cannot be processed directly: the data must be read from memory into a register where
they can be operated on with logic circuits.

Table 4-3: Memory-reference instructions

Symbol   Operation Decoder   Symbolic Description
AND      D0                  AC ← AC ∧ M[AR]
ADD      D1                  AC ← AC + M[AR], E ← Cout
LDA      D2                  AC ← M[AR]
STA      D3                  M[AR] ← AC
BUN      D4                  PC ← AR
BSA      D5                  M[AR] ← PC, PC ← AR + 1
ISZ      D6                  M[AR] ← M[AR] + 1, if (M[AR] + 1 = 0) then PC ← PC + 1

AND to AC: This is an instruction that performs the AND logic operation on pairs of bits in AC and the
memory word specified by the effective address. The result of the operation is transferred to AC. The
micro-operations that execute this instruction are:

D0T4: DR ← M[AR]


D0T5: AC ← AC ∧ DR, SC ← 0

The control function for this instruction uses the operation decoder output D0, since this output is active
when the instruction has an AND operation, whose binary opcode value is 000.

ADD to AC: This instruction adds the content of the memory word specified by the effective address to the
value of AC. The sum is transferred into AC and the output carry Cout is transferred into the E (extended
accumulator) flip-flop.

D1T4: DR ← M[AR]

D1T5: AC ← AC + DR, E ← Cout, SC ← 0

LDA: Load to AC: This instruction transfers the memory word specified by the effective address to AC.
D2T4: DR ← M[AR]
D2T5: AC ← DR, SC ← 0

STA: Store AC: This instruction stores the content of AC into the memory word specified by the
effective address. Since the output of AC is applied to the bus and the data input of memory is connected
to the bus, we can execute this instruction with one micro-operation:

D3T4: M[AR] ← AC, SC ← 0

BUN: Branch Unconditionally: This instruction transfers control to the instruction specified by the
effective address. Remember that PC holds the address of the instruction to be read from memory in the
next instruction cycle and is incremented at T1 to prepare it for the address of the next instruction in the
program sequence. The BUN instruction allows the programmer to specify an instruction out of sequence,
and we say that the program branches (jumps) unconditionally. The instruction is executed with one
micro-operation:

D4T4: PC ← AR, SC ← 0

The effective address from AR is transferred through the common bus to PC. Resetting SC to 0 transfers
control to T0. The next instruction is then fetched and executed from the memory address given by the
new value in PC.

BSA: Branch and Save Return Address: This instruction is useful for branching to a portion of the
program called a subroutine or procedure. When executed, the BSA instruction stores the address of the
next instruction in sequence (which is available in PC) into the memory location specified by the
effective address. The effective address plus one is then transferred to PC to serve as the address of the
first instruction in the subroutine.

D5T4: M[AR] ← PC, AR ← AR + 1

D5T5: PC ← AR, SC ← 0


Timing signal T4 initiates a memory write operation, places the content of PC onto the bus, and enables
the INR input of AR. The memory write operation is completed and AR is incremented by the time the
next clock transition occurs. The bus is used at T5 to transfer the content of AR to PC.

ISZ: Increment and Skip if Zero: This instruction increments the word specified by the effective
address, and if the incremented value is equal to 0, PC is incremented by 1. Since it is not possible to
increment a word inside the memory, it is necessary to read the word into DR, increment DR, and store
the word back into memory. This is done with the following sequence of micro-operations:

D6T4: DR ← M[AR]


D6T5: DR ← DR + 1
D6T6: M[AR] ← DR, if (DR = 0) then (PC ← PC + 1), SC ← 0
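Condensing the micro-operation sequences above, the execute phase of all seven memory-reference instructions can be sketched as follows (our own Python illustration; 16-bit word arithmetic is modeled with masking, and the DR steps are folded into single statements).

# Sketch: execute phase of the memory-reference instructions.
def execute_memref(d: int, r: dict, mem: list) -> None:
    ar = r["AR"]
    if d == 0:                                  # AND: AC <- AC ^ M[AR]
        r["AC"] &= mem[ar]
    elif d == 1:                                # ADD: AC <- AC + M[AR], E <- Cout
        s = r["AC"] + mem[ar]
        r["E"], r["AC"] = (s >> 16) & 1, s & 0xFFFF
    elif d == 2:                                # LDA: AC <- M[AR]
        r["AC"] = mem[ar]
    elif d == 3:                                # STA: M[AR] <- AC
        mem[ar] = r["AC"]
    elif d == 4:                                # BUN: PC <- AR
        r["PC"] = ar
    elif d == 5:                                # BSA: M[AR] <- PC, PC <- AR + 1
        mem[ar] = r["PC"]
        r["PC"] = (ar + 1) & 0xFFF
    elif d == 6:                                # ISZ: increment and skip if zero
        mem[ar] = (mem[ar] + 1) & 0xFFFF
        if mem[ar] == 0:
            r["PC"] = (r["PC"] + 1) & 0xFFF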

4.7 Input-Output and Interrupt


A computer can serve no useful purpose unless it communicates with the external environment.
Instructions and data stored in memory must come from some input device, and computational results
must be transmitted to the user through some output device.

Input-Output Configuration
The terminal sends and receives serial information. Each quantity of information has eight bits of an
alphanumeric code. The serial information from the keyboard is shifted into the input register INPR. The
serial information for the printer is stored in the output register OUTR. These two registers communicate
with a communication interface serially and with the AC in parallel. The input-output configuration is
shown in Figure 4-7.

[Figure: a serial input-output terminal (keyboard and printer) connected through its transmitter and receiver interfaces to the computer registers INPR and OUTR, each with a control flip-flop (FGI and FGO); both registers communicate with AC in parallel.]

Figure 4-7: Input-output configuration

The input register INPR consists of eight bits and holds alphanumeric input information. The 1-bit input
flag FGI is a control flip-flop. The flag bit is set to 1 when new information is available in the input
device and is cleared to 0 when the information is accepted by the computer. The flag is needed to
synchronize the timing-rate difference between the input device and the computer. The process of
information transfer is as follows. Initially, the input flag (FGI) is cleared to 0. When a key is struck on


the keyboard, an 8-bit alphanumeric code is shifted into INPR and FGI is set to 1. As long as the flag is
set, the information in INPR cannot be changed by striking another key. The computer checks the flag
bit; if it is 1, the information from INPR is transferred in parallel into AC and FGI is cleared to 0. Once
the flag is cleared, new information can be shifted into INPR by striking another key. The output register
OUTR works similarly but the direction of information flow is reversed.

Input-output Instructions
Input and output instructions are needed for transferring information to and from AC register, for
checking the flag bits, and for controlling the interrupt facility. Input-output instructions have an
operation code 1111 and are recognized by the control when D 7 = 1and I = 1. The remaining bits of the
instruction specify the particular operation. The control functions and micro-operations for the input-
output instructions are listed in Table 4-4. These signals are executed with the clock transition associated
with timing signal T3. Each control function needs a Boolean relation D7IT3, which we designate for
convenience by the symbol p. The control function is distinguished by one of the bits in IR(6-11). By
assigning the symbol Bi to bit i of IR, all control functions can be denoted by pB i where Bi represents
bits in IR, for i = 6 through 11.

Table 4-4: Input-output instructions

D7IT3 = p (common to all input-output instructions)
IR(i) = Bi [bit in IR(6-11) that specifies the instruction]

      p:    SC ← 0                            Clear sequence counter (SC)
INP   pB11: AC(0-7) ← INPR, FGI ← 0           Transfer input from INPR into the eight least significant bits of AC and clear FGI
OUT   pB10: OUTR ← AC(0-7), FGO ← 0           Transfer the eight least significant bits of AC to OUTR and clear FGO
SKI   pB9:  If (FGI = 1) then (PC ← PC + 1)   If the input flag (FGI) is set, skip the next instruction
SKO   pB8:  If (FGO = 1) then (PC ← PC + 1)   If the output flag (FGO) is set, skip the next instruction
ION   pB7:  IEN ← 1                           Turn the interrupt enable flag (IEN) on
IOF   pB6:  IEN ← 0                           Turn the interrupt enable flag (IEN) off

The input-output communication described above is known as programmed control transfer. Because of
the difference in the rate of information flow between I/O devices and the computer, this type of transfer
is inefficient. An alternative to programmed control is to use the interrupt facility. With this alternative,
while the computer is running a program it does not check the status of the flags. Instead, when a flag is set,


the computer is momentarily interrupted from proceeding with the current program and is informed of
the fact that a flag has been set. The computer deviates momentarily from what it is doing to take care of
the input or output transfer. After serving the interrupt, the computer returns to the task it was doing
before the interrupt arrived.

When the interrupt enable flip-flop IEN is cleared to 0 by the IOF instruction, the flags cannot interrupt
the computer. When IEN is set to 1 by the ION instruction, the computer can be interrupted. The
interrupt operation is done through the branch and save return address operation. The return address
available in PC is stored in a specific location where it can be found later when the program returns to
the instruction at which it was interrupted. This location can be a processor register, a memory stack, or
a specific memory address.

The interrupt cycle is initiated if the interrupt flip-flop R is equal to 1 after the last execute phase. This
flip-flop is set to 1 if IEN = 1 and either FGI or FGO is equal to 1. This can happen with any clock
transition except when timing signals T0, T1, or T2 are active. The condition for setting flip-flop R to 1
can be expressed with the following register transfer statement:

T0′T1′T2′(IEN)(FGI + FGO): R ← 1
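Written out as a truth function (an illustrative sketch of ours, with NOT of a signal v written as 1 - v), the condition reads:

# Sketch: the condition T0'T1'T2'(IEN)(FGI + FGO) that sets flip-flop R.
def set_r(t0: int, t1: int, t2: int, ien: int, fgi: int, fgo: int) -> int:
    return (1 - t0) & (1 - t1) & (1 - t2) & ien & (fgi | fgo)

assert set_r(0, 0, 0, ien=1, fgi=1, fgo=0) == 1   # interrupt will be taken
assert set_r(1, 0, 0, ien=1, fgi=1, fgo=0) == 0   # blocked while T0 is active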

4.8 Design of Basic Computer


The basic computer consists of the following hardware components;
 A memory unit with 4096 words of 16-bits each
 Nine registers: AR, PC, DR, AC, IR, TR, OUTR, INPR, and SC
 Seven flip-flops: I, S, E, R, IEN, FGI, and FGO
 Two decoders: a 3×8 operation decoder and a 4×16 timing decoder
 A 16-bit common bus
 Control logic gates
 Adder and logic circuit connected to the input of AC.

The memory unit is a standard component that can be obtained readily from a commercial source. The
registers are of the types shown in the previous lesson.

4.9 Review Questions


A computer uses a memory unit with 256K words of 32 bits each. A binary instruction code is stored in
one word of memory. The instruction has four parts: an indirect bit, an operation code, a register code
part to specify one of 64 registers, and an address part.
A. How many bits are there in the operation code, the register code part, and the address part?
B. Draw the instruction word format and indicate the number of bits in each part.


C. How many bits are there in the data and address inputs of the memory?


Chapter 5
Central Processing unit
Objective:
After completing this chapter, you will be able to understand:
 General register organization
 Stack organization
 Instruction format
 Addressing modes
 Data transfer and manipulations
 Program control
 Reduced instruction set computers

5.1 Introduction
The part of the computer that performs the bulk of the data processing operations is called the central
processing unit (CPU). The CPU is made up of three major parts, as shown in Figure 5-1. The register set
stores intermediate data used during the execution of the instructions. The arithmetic and logic unit
(ALU) performs the required micro-operations for executing the instructions. The control unit supervises
the transfer of information among the registers and instructs the ALU as to which operation to perform.

The CPU performs a variety of functions dictated by the type of instructions that are incorporated in the
computer. Computer architecture is sometimes defined as the computer structure and behavior as seen
by the programmer that uses machine language instructions. This includes the instruction formats,
addressing modes, the instruction set, and the general organization of the CPU registers.

One boundary where the computer designer and the computer programmer see the same machine is the
part of the CPU associated with the instruction set. From the designer's point of view, the computer
instruction set provides the specifications for the design of the CPU.

[Figure: the three major components of the CPU: the register set, the control unit, and the arithmetic logic unit (ALU).]

Figure 5-1: Major components of the CPU


The design of a CPU is a task that in large part involves choosing the hardware for implementing the
machine instructions. The user who programs the computer in machine or assembly language must be
aware of the register set, the memory structure, the type of data supported by the instructions, and the
function that each instruction performs.

5.2 General Register Organizations


In the programming examples of the previous chapter, we have shown that memory locations are needed
for storing pointers, counters, return addresses, temporary results, and partial products during
multiplication. Having to refer to memory locations for such applications is time consuming because
memory access is the most time-consuming operation in a computer. It is more convenient and more
efficient to store these intermediate values in processor registers. When a large number of registers are
included in the CPU, it is most efficient to connect them through a common bus system. The registers
communicate with each other not only for direct data transfers, but also while performing various micro-
operations. Hence it is necessary to provide a common unit that can perform all the arithmetic, logic, and
shift micro-operations in the processor.

A bus organization for seven CPU registers is shown in Figure 5-2. The output of each register is
connected to two multiplexers (MUX) to form the two buses A and B. The selection lines in each
multiplexer select one register or the input data for the particular bus. The A and B buses form the inputs
to a common arithmetic logic unit.

The operation selected in the ALU determines the arithmetic or logic micro-operation that is to be
performed. The result of the micro-operation is available for output data and also goes into the inputs of
all the registers. The register that receives the information from the output bus is selected by a decoder.
The decoder activates one of the register load inputs, thus providing a transfer path between the data on
the output bus and the inputs of the selected destination register.

The control unit that operates the CPU bus system directs the information flow through the register and
ALU by selecting the various components in the system. For example, to perform the operation
R1  R2 + R3,
the control must provide binary selection variables to the following selector inputs:

1. MUX A selector (SELA): to place the content of R2 into bus A.


2. MUX selector (SELB): to place the content of R3 into bus B.
3. ALU operation selector (OPR): to provide the arithmetic addition A +B.
4. Decoder destination selector (SELD): to transfer the content of output bus in to R1.

The four control selection variables are generated in the control unit and must be available at the
beginning of a clock cycle. The data from the two source registers propagates through the gates in the
multiplexers and the ALU, to the output bus, and into the inputs of the destination register, all during the
clock cycle interval. Then, when the next clock transition occurs, the binary information from the output
bus is transferred into R1. To achieve a fast response time, the ALU is constructed with high-speed
circuits. The buses are implemented with multiplexers or three-state gates.


Figure 5-2: Register set with common ALU

Control Word
There are 14 binary selection inputs in the unit, and their combined value specifies a control word. The
14-bit control word is illustrated in Figure 5-2(b). It consists of four fields: three fields contain three
bits each, and one field has five bits. The three bits of SELA select a source register for the A input of the
ALU. The three bits of SELB select a register for the B input of the ALU. The three bits of SELD select
a destination register using the decoder and its seven load outputs. The five bits of OPR select one of the
operations in the ALU. The 14-bit control word, when applied to the selection inputs, specifies a
particular micro-operation. The encoding of the register selection fields is specified in Table 5-1.
Table 5-1: Encoding of register selection fields

Binary Code   SELA    SELB    SELD
000           Input   Input   None
001           R1      R1      R1
010           R2      R2      R2
011           R3      R3      R3
100           R4      R4      R4
101           R5      R5      R5
110           R6      R6      R6
111           R7      R7      R7

The 3-bit binary code listed in the first column of Table 5-1 specifies the encoding for each of the
three fields. The register selected by the fields SELA, SELB, and SELD is the one whose decimal number
is equivalent to the binary number in the code. When SELA or SELB is 000, the corresponding
multiplexer selects the external input data. When SELD = 000, no destination register is selected, but the
contents of the output bus are available at the external output.

The ALU provides arithmetic and logic operations. In addition, the CPU must provide shift operations.
The shifter may be placed at the input of the ALU to provide a pre-shift capability, or at the output of
the ALU to provide a post-shift capability. In some cases, the shift operations are included within the
ALU. The encoding of the ALU operations for the CPU is specified in Table 5-2.

A control word of 14 bits is needed to specify a micro-operation in the CPU. The control word for a
given micro-operation can be derived from the selection variables. For example, the subtract
micro-operation given by the statement R1 ← R2 - R3 specifies R2 for the A input of the ALU, R3 for the

B input of the ALU, R1 for the destination register, and an ALU operation to subtract A - B. Thus, the
control word is specified by the four fields, and the corresponding binary value for each field is obtained
from the encoding in Table 5-1 and Table 5-2. The binary control word for the subtract micro-operation
is 010 011 001 00101 and is obtained as follows:

Table 5-2: Encoding of ALU operations

OPR Select   Operation         Symbol
00000        Transfer A        TSFA
00001        Increment A       INCA
00010        Add A + B         ADD
00101        Subtract A - B    SUB
00110        Decrement A       DECA
01000        AND A and B       AND
01010        OR A and B        OR
01100        XOR A and B       XOR
01110        Complement A      COMA
10000        Shift right A     SHRA
11000        Shift left A      SHLA

Field: SELA SELB SELD OPR

Symbol: R2 R3 R1 SUB

Control word: 010 011 001 00101

The control words for this micro-operation and a few others are listed in Table 5-3. The increment and
transfer micro-operations do not use the B input of the ALU. For these cases, the B field is marked with
a dash.


Table 5-3: Examples of micro-operations for the CPU

                                Symbolic Designation
Micro-operation    SELA    SELB    SELD    OPR     Control Word
R1 ← R2 - R3       R2      R3      R1      SUB     010 011 001 00101
R4 ← R4 ∨ R5       R4      R5      R4      OR      100 101 100 01010
R6 ← R6 + 1        R6      -       R6      INCA    110 000 110 00001
R7 ← R1            R1      -       R7      TSFA    001 000 111 00000
Output ← R2        R2      -       None    TSFA    010 000 000 00000
Output ← Input     Input   -       None    TSFA    000 000 000 00000
R4 ← shl R4        R4      -       R4      SHLA    100 000 100 11000
R5 ← 0             R5      R5      R5      XOR     101 101 101 01100

5.3 Stack Organization


A useful feature that is included in the CPU of most computers is a stack or last- in, first-out (LIFO) list.
A stack is a storage device that stores information in such a manner that the item stored last is the first
item retrieved. The operation of a stack can be compared to a stack of trays. The last tray placed on top
of the stack is the first to be taken off.

The stack in digital computers is essentially a memory unit with an address register that can only count
(after an initial value is loaded into it). The register that holds the address for the stack is called a stack
pointer (SP) because its value always points at the item on top of the stack.

The two operations on a stack are the insertion and deletion of items. The operation of insertion is
called PUSH (or push-down) because it can be thought of as the result of pushing a new item onto the top.
The operation of deletion is called POP (or pop-up) because it can be thought of as the result of removing
one item so that the stack pops up. However, nothing is physically pushed or popped in a computer stack;
these operations are simulated by incrementing or decrementing the stack pointer register.

Register Stack


A stack can be placed in a portion of a large memory or it can be organized as a collection of a finite
number of memory words or registers. Figure 5-3 shows the organization of a 64-word register stack.
The stack pointer register SP contains a binary number whose value is equal to the address of the word
that is currently on top of the stack. Three items are placed in the stack: A, B and C in that order. Item C
is on top of the stack so that the content of SP is now 3. To remove the top item, the stack is popped by
reading the memory word at address 3 and decrementing the content of SP. Item B is now on top of the
stack since SP holds address 2. To insert a new item, the stack is pushed by incrementing SP and writing
a word in the next higher location in the stack. Note that item C has been read out but not physically
removed. This does not matter because when the stack is pushed, a new item is written in its place.

[Figure: a 64-word register stack (addresses 0 through 63) with SP pointing at item C at address 3, items B and A at addresses 2 and 1, the data register DR, and the FULL and EMTY flip-flops.]
Figure 5-3: Block diagram of a 64-word stack

In a 64-word stack, the stack pointer contains 6 bits because 2^6 = 64. Since SP has only 6 bits, it cannot
hold a number greater than 63 (111111 in binary). When 63 is incremented by 1, the result is 0, since
111111 + 1 = 1000000 in binary, but SP can accommodate only the 6 least significant bits. Similarly,
when 000000 is decremented by 1, the result is 111111. The 1-bit register FULL is set to 1 when the
stack is full, and the 1-bit register EMTY is set to 1 when the stack is empty of items. DR is the data
register that holds the binary data to be written into or read out of the stack.

Initially, SP is cleared to 0, EMTY is set to 1, and FULL is cleared to 0, so that SP points to the word at
address 0 and the stack is marked empty and not full. If the stack is not full (FULL = 0), a new item is
inserted with the PUSH operation. The PUSH operation is implemented with the following sequence of
micro-operations:

SP ← SP + 1                       Increment stack pointer
M[SP] ← DR                        Write item on top of the stack
If (SP = 0) then (FULL ← 1)       Check if stack is full
EMTY ← 0                          Mark the stack not empty

The stack pointer is incremented so that it points to the address of the next higher word. A memory write
operation inserts the word from DR into the top of the stack. Note that SP holds the address of the top
of the stack and that M[SP] denotes the memory word specified by the address presently available in SP.
The first item stored in the stack is at address 1. The last item is stored at address 0. If SP reaches 0, the
stack is full of items, so FULL is set to 1. This condition is reached if the top item prior to the last push
was in location 63, and, after incrementing SP, the last item is stored in location 0. Once an item is stored
in location 0, there are no more empty registers in the stack. If an item is written in the stack, obviously
the stack cannot be empty, so EMTY is cleared to 0.

A new item is deleted from the stack if the stack is not empty (EMTY = 0). The pop operation consists
of the following sequence of micro-operations:

DR ← M[SP]                        Read item from the top of the stack
SP ← SP - 1                       Decrement stack pointer
If (SP = 0) then (EMTY ← 1)       Check if stack is empty
FULL ← 0                          Mark the stack not full

The top item is read from the stack into DR. The stack pointer is then decremented. If its value reaches
0, the stack is empty, so EMTY is set to 1. This condition is reached if the item read was in location 1.
Once this item is read out, SP is decremented and reaches the value of 0, which is the initial value of SP.
Note that if a pop operation reads the item from location 0 and SP is then decremented, SP changes to
111111, which is equivalent to decimal 63. In this configuration, the word at address 0 receives the last
item in the stack. Note also that an erroneous operation results if the stack is pushed when FULL = 1
or popped when EMTY = 1.
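The push and pop sequences, together with the FULL and EMTY flip-flops, can be modeled directly (a Python sketch of ours; the 6-bit wrap-around of SP is expressed with modulo-64 arithmetic).

# Sketch: a 64-word register stack with FULL and EMTY flags.
class RegisterStack:
    def __init__(self):
        self.mem = [0] * 64
        self.sp, self.full, self.emty = 0, 0, 1   # initial state

    def push(self, dr: int) -> None:
        if self.full:
            raise OverflowError("stack full")     # erroneous push
        self.sp = (self.sp + 1) % 64              # SP <- SP + 1
        self.mem[self.sp] = dr                    # M[SP] <- DR
        if self.sp == 0:                          # last word went to address 0
            self.full = 1
        self.emty = 0

    def pop(self) -> int:
        if self.emty:
            raise IndexError("stack empty")       # erroneous pop
        dr = self.mem[self.sp]                    # DR <- M[SP]
        self.sp = (self.sp - 1) % 64              # SP <- SP - 1
        if self.sp == 0:                          # item read was at address 1
            self.emty = 1
        self.full = 0
        return dr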

Memory Stack
A stack can exist as a stand-alone unit, as in Figure 5-3, or it can be implemented in a random-access
memory attached to a CPU. The implementation of a stack in the CPU is done by assigning a portion of
memory to the stack operation and using a processor register as a stack pointer. Figure 5-4 shows a
portion of computer memory partitioned into three segments: program, data, and stack. The program
counter PC points at the address of the next instruction in the program. The address register AR points at
an array of data. The stack pointer SP points at the top of the stack.

The three registers are connected to a common address bus, and any one of them can provide an address
for memory. PC is used during the fetch phase to read an instruction. AR is used during the execute phase
to read an operand. SP is used to push or pop items into or from the stack.


[Figure: a memory unit with the program segment (instructions) beginning at address 1000 and addressed by PC, the data segment (operands) beginning at address 2000 and addressed by AR, and the stack segment occupying addresses 3000 through 4001 and addressed by SP (shown pointing into the region 3997-4001), with the data register DR attached.]
Figure 5-4: Computer memory with program, data, and stack segments

As shown in Figure 5-4, the initial value of SP is 4001 and the stack grows with decreasing addresses.
Thus the first item stored in the stack is at address 4000, the second item is stored at address 3999, and
the last address that can be used for the stack is 3000. No provisions are available for stack limit checks.
We assume that the items in the stack communicate with a data register DR. A new item is inserted with
the push operation as follows:

SP ← SP - 1
M[SP] ← DR

The stack pointer is decremented so that it points at the address of the next word. A memory write
operation inserts the word from DR into the top of the stack. A new item is deleted with the pop
operation as follows:

DR ← M[SP]
SP ← SP + 1


The top item is read from the stack into DR. The stack pointer is then incremented to point at the next
item in the stack.

Most computers do not provide hardware to check for stack overflow (full stack) or underflow (empty
stack). The stack limits can be checked by using two processor registers: one to hold the upper limit (3000
in our example), and the other to hold the lower limit (4001 in the example). After a push operation, SP
is compared with the upper-limit register, and after a pop operation SP is compared with the lower-limit
register.

The two micro-operations needed for either the push or the pop are:
1. An access to memory through SP, and
2. Updating SP.

5.4 Instruction Formats


The physical and logical structure of computers is normally described in reference manuals provided
with the system. Such manuals explain the internal construction of the CPU, including the processor
registers available and their logical capabilities. They list all hardware implemented instructions, specify
their binary code format and provide a precise definition of each instruction. A computer will usually
have a variety of instruction code formats.
It is the function of the control unit within the CPU to interpret each instruction code and provide the
necessary control functions needed to process the instruction. The format of an instruction is usually
depicted in a rectangular box symbolizing the bits of the instruction as they appear in a memory word or
in a control register. The bits of the instruction are divided into groups called fields. The most common
fields found in instruction formats are:
1. An operation code field that specifies the operation to be performed
2. An address field that designates a memory address or a processor register
3. A mode field that specifies the way the operand or the effective address is determined
Other special fields are sometimes employed under certain circumstances, for example a field that gives
the number of shifts in a shift-type instruction. The operation code field of an instruction is a group of
bits that define various processor operations, such as add, subtract, complement, and shift.
common operations available in computer instructions are enumerated and discussed in Section 5.6. The
bits that define the mode field of an instruction code specify the variety of alternatives for choosing the
operand from the given address. The various addressing modes that have been formulated for digital
computers are presented in Section 5.5. In this section we are concerned with the address field of an
instruction format and consider the effect of including multiple address fields in an instruction.
Operations specified by computer instructions are executed on some data stored in memory or processor
registers. Operands residing in memory are specified by their memory address. Operands residing in
processor registers are specified with a register address. A register address is a binary number of k bits
that defines one of 2^k registers in the CPU. Thus, a CPU with 16 processor registers, R0 through R15,
will have a register address field of 4 bits. The binary number 0101, for example, designates register R5.


Computers may have instructions of several lengths containing varying numbers of addresses. The number
of address fields in the instruction format of a computer depends on the internal organization of its
registers. Most computers fall into one of three types of CPU organizations:
1. Single accumulator organization
2. General register organization
3. Stack organization
In an accumulator-type organization, all operations are performed with an implied accumulator register.
The instruction format in this type of computer uses one address field. For example, the instruction that
specifies an arithmetic addition is defined by an assembly language instruction such as ADD X, where X
is the address of the operand. The ADD instruction in this case results in the operation AC ← AC + M[X],
where AC is the accumulator register and M[X] symbolizes the memory word located at address X.
The instruction format in a computer with a general register organization needs three register address
fields. Thus, the instruction for an arithmetic addition may be written in an assembly language as
ADD R1, R2, R3 to denote the operation R1 ← R2 + R3. The number of address fields in the instruction
can be reduced from three to two if the destination register is the same as one of the source registers.
Thus, the instruction ADD R1, R2 would denote the operation R1 ← R1 + R2; only the register addresses
of R1 and R2 need be specified in this instruction.
Computers with multiple processor registers use the move instruction, with the mnemonic MOV, to
symbolize a transfer instruction. Thus, the instruction MOV R1, R2 denotes the transfer R1 ← R2 (or
R2 ← R1, depending on the particular computer). Transfer-type instructions need two address fields to
specify the source and the destination.
General register-type computers employ two or three address fields in their instruction format. Each
address field may specify a processor register or a memory word. An instruction symbolized by
ADD R1, X would specify the operation R1 ← R1 + M[X]. It has two address fields, one for register R1
and the other for the memory address X.
Computers with stack organization have PUSH and POP instructions, which require an address field.
Thus, the instruction PUSH X will push the word at address X onto the top of the stack. The stack
pointer is updated automatically. Operation-type instructions do not need an address field in stack-
organized computers, because the operation is performed on the two items that are on top of the stack.
The instruction ADD in a stack-organized computer consists of an operation code only, with no address
field. This operation has the effect of popping the two top numbers from the stack, adding the numbers,
and pushing the sum onto the stack. There is no need to specify operands with an address field since all
operands are implied to be on the stack.
Most computers fall into one of the three types of organizations that have just been described. Some
computers combine features from more than one organizational structure. For example, the Intel 8080
microprocessor has seven CPU registers, one of which is an accumulator register. As a consequence, the
processor has some of the characteristics of a general register type and some of the characteristics of an
accumulator type. All arithmetic and logic instructions, as well as the load and store instructions, use the
accumulator register, so these instructions have only one address field. On the other hand, instructions
that transfer data among the seven processor registers have a format that contains two register address
fields.


To illustrate the influence of the number of addresses on computer programs, we will evaluate the
arithmetic statement X = (A + B) × (C + D) using zero-, one-, two-, and three-address instructions. We
will use the symbols ADD, SUB, MUL, and DIV for the four arithmetic operations, and LOAD and
STORE for transfers to and from memory and the AC register. We will assume that the operands are in
memory addresses A, B, C, and D, and that the result must be stored in memory at address X.

Three-Address Instructions
Computers with three-address instruction formats can use each address field to specify either a processor
register or a memory operand. The program in assembly language that evaluates X = (A + B) × (C + D) is
shown below, together with comments that explain the register transfer operation of each instruction.

ADD R1, A, B     R1 ← M[A] + M[B]
ADD R2, C, D     R2 ← M[C] + M[D]
MUL X, R1, R2    M[X] ← R1 × R2

It is assumed that the computer has two processor registers, R1 and R2. The symbol M[A] denotes the
operand at the memory address symbolized by A. The advantage of the three-address format is that it
results in short programs when evaluating arithmetic expressions. The disadvantage is that the binary-
coded instructions require too many bits to specify three addresses.

Two-Address Instructions

Two-address instructions are the most common in commercial computers. Here again, each address field
can specify either a processor register or a memory word. The program to evaluate X = (A + B) × (C + D)
is as follows:

MOV R1, A        R1 ← M[A]
ADD R1, B        R1 ← R1 + M[B]
MOV R2, C        R2 ← M[C]
ADD R2, D        R2 ← R2 + M[D]
MUL R1, R2       R1 ← R1 × R2
MOV X, R1        M[X] ← R1

The MOV instruction moves or transfers the operands to and from memory and processor registers. The
first symbol listed in an instruction is assumed to be both a source and the destination where the result of
the operation is transferred.


One-Address Instructions
One-address instructions use an implied accumulator (AC) register for all data manipulation. For
multiplication and division there is a need for a second register; however, here we will neglect the
second register and assume that the AC contains the result of all operations. The program to evaluate
X = (A + B) × (C + D) is:

LOAD A           AC ← M[A]
ADD B            AC ← AC + M[B]
STORE T          M[T] ← AC
LOAD C           AC ← M[C]
ADD D            AC ← AC + M[D]
MUL T            AC ← AC × M[T]
STORE X          M[X] ← AC

All operations are done between the AC register and a memory operand. T is the address of a temporary
memory location required for storing the intermediate result.

Zero-Address Instructions
A stack-organized computer does not use an address field for the instructions ADD and MUL. The PUSH
and POP instructions, however, need an address field to specify the operand that communicates with the
stack. The following program shows how X = (A + B) × (C + D) is written for a stack-organized
computer (TOS stands for top of stack):

PUSH A           TOS ← A
PUSH B           TOS ← B
ADD              TOS ← (A + B)
PUSH C           TOS ← C
PUSH D           TOS ← D
ADD              TOS ← (C + D)
MUL              TOS ← (C + D) × (A + B)
POP X            M[X] ← TOS
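Tracing the program with made-up operand values shows how the stack evolves (a Python sketch of ours; a list stands in for the stack, with TOS as its last element).

# Sketch: the zero-address program, with A=2, B=3, C=4, D=5.
mem = {"A": 2, "B": 3, "C": 4, "D": 5}
stack = []
stack.append(mem["A"])                      # PUSH A
stack.append(mem["B"])                      # PUSH B
stack.append(stack.pop() + stack.pop())     # ADD : TOS <- (A + B)
stack.append(mem["C"])                      # PUSH C
stack.append(mem["D"])                      # PUSH D
stack.append(stack.pop() + stack.pop())     # ADD : TOS <- (C + D)
stack.append(stack.pop() * stack.pop())     # MUL : TOS <- (C + D) x (A + B)
mem["X"] = stack.pop()                      # POP X
print(mem["X"])                             # (2 + 3) x (4 + 5) = 45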

RISC Instructions
RISC stands for reduced instruction set computer. The instruction set of a typical RISC processor is
restricted to the use of load and store instructions when communicating between memory and CPU. All


other instructions are executed within the registers of the CPU without referring to memory. A program
for a RISC-type CPU consists of LOAD and STORE instructions that have one memory and one register
address, and of computational-type instructions that have three addresses, with all three specifying
processor registers. The following is a program to evaluate X = (A + B) × (C + D):

LOAD R1, A       R1 ← M[A]
LOAD R2, B       R2 ← M[B]
LOAD R3, C       R3 ← M[C]
LOAD R4, D       R4 ← M[D]
ADD R1, R1, R2   R1 ← R1 + R2
ADD R3, R3, R4   R3 ← R3 + R4
MUL R1, R1, R3   R1 ← R1 × R3
STORE X, R1      M[X] ← R1

The load instructions transfer the operands from memory to CPU registers. The add and multiply
operations are executed with the data in the registers, without accessing memory. The result of the
computation is stored in memory with the store instruction.

5.5 Addressing Modes


The operation field of an instruction specifies the operation to be performed. This operation must be
executed on some data stored in computer registers or memory words. The way the operands are chosen
during program execution is dependent on the addressing mode of the instruction. The addressing mode
specifies a rule for interpreting or modifying the address field of the instruction before the operand is
actually referenced. Computers use addressing mode techniques for the purpose of accommodating one
or both of the following provisions.

1. To give programming versatility to the user by providing such facilities as pointers to memory,
counters for loop control, indexing of data and program relocation.
2. To reduce the number of bits in the addressing field of the instruction

The availability of addressing modes gives the experienced assembly language programmer flexibility
for writing programs that are more efficient with respect to the number of instructions and execution
time. To understand the various addressing modes to be presented in this section, it is important that we
understand the basic operation cycle of the computer. The control unit of a computer is designed to go
through an instruction cycle that is divided into three major phases:

1. Fetch the instruction from memory


2. Decode the instruction
3. Execute the instruction

There is one register in the computer, called the program counter (PC), that keeps track of the instructions
in the program stored in memory. PC holds the address of the instruction to be executed next and is
incremented each time an instruction is fetched from memory. The decoding done in step 2 determines
the operation to be performed, the addressing mode of the instruction, and the location of the operands.
The computer then executes the instruction and returns to step 1 to fetch the next instruction in
sequence.

In some computers the addressing mode of the instruction is specified with a distinct binary code, just as
the operation code is specified. Other computers use a single binary code that designates both the
operation and the mode of the instruction. Instructions may be defined with a variety of addressing
modes, and sometimes two or more addressing modes are combined in one instruction.

An example of an instruction format with a distinct addressing mode field is shown in Figure 5-6. The
operation code specifies the operation to be performed. The mode field is used to locate the operands
needed for the operation. There may or may not be an address field in the instruction. If there is an
address field, it may designate a memory address or a processor register. Moreover, the instruction may
have more than one address field, and each address field may be associated with its own particular
addressing mode.

Opcode Mode Address

Figure 5-6: Instruction format with mode field

Although most addressing modes make use of the address field of the instruction, there are two modes
that need no address field at all: the implied and immediate modes.

Implied Mode: In this mode the operands are specified implicitly in the definition of the instruction. For
example, the instruction "complement accumulator" is an implied-mode instruction because the operand
in the accumulator register is implied in the definition of the instruction. In fact, all register-reference
instructions that use an accumulator are implied-mode instructions. Zero-address instructions in stack-
organized computers are also implied-mode instructions, since the operands are implied to be on top of
the stack.

Immediate Mode: In this mode the operand is specified in the instruction itself. In other words, an
immediate mode instruction has an operand field rather than an address field. The operand field contains
the actual operand to be used in conjunction with the operation specified in the instruction. Immediate
mode instructions are useful for initializing registers to a constant value. When the address field
specifies a processor register, the instruction is said to be in the register mode.

Register Mode: In this mode the operands are in registers that reside within the CPU. The particular
register is selected from a register field in the instruction. A k-bit field can specify any one of 2^k
registers.

Register Indirect Mode: In this mode the instruction specifies a register in the CPU whose contents
give the address of the operand in memory. In other words, the selected register contains the address of
the operand rather than the operand itself. Before using a register indirect mode instruction, the
programmer must ensure that the memory address of the operand is placed in the processor register with
a previous instruction. A reference to the register is then equivalent to specifying a memory address. The


advantage of the register indirect mode instruction is that the address field of the instruction uses fewer
bits to select a register than would have been required to specify a memory address directly.

Autoincrement and Autodecrement Mode: This is similar to the register indirect mode except that
the register is incremented or decremented after (or before) its value is used to access memory. When
the address stored in the register refers to a table of a data in memory, it is necessary to increment or
decrement the register after every access to the table.

The address field of an instruction is used by the control unit in the CPU to obtain the operand from
memory. Sometimes the value given in the address field is the address of the operand, but sometimes it is
just an address from which the address of the operand is calculated. To differentiate among the various
addressing modes, it is necessary to distinguish between the address part of the instruction and the
effective address used by the control when executing the instruction. The effective address is defined to
be the memory address obtained from the computation dictated by the given addressing mode. It is the
address of the operand in a computational-type instruction, and the address where control branches in
response to a branch-type instruction.

Direct Addressing Mode: In this mode the effective address is equal to the address part of the
instruction. The operand resides in memory and its address is given directly by the address field of the
instruction. In a branch type of instruction the address field specifies the actual branch address.

Indirect Addressing Mode: In this mode the address field of the instruction gives the address where
the effective address is stored in memory. Control fetches the instruction from memory and uses its
address part to access memory again to read the effective address.

Relative Addressing mode: In this mode the content of program counter is added to the address part of
the instruction in order to obtain the effective address. The address part of the instruction is usually a
signed number (in 2’s complement representation) which can be either negative or positive. When this
number is added to the content of the program counter, the result produces an effective address whose
position in memory is relative to the address of the next instruction.
Indexed Addressing Mode: In this mode the content of an index register is added to the address part of
the instruction to obtain the effective address. The address field of the instruction defines the beginning
address of a data array in memory. Each operand in the array is stored in memory relative to the
beginning address. The distance (offset) between the beginning address and the address of the operand is
the index value stored in the index register. In computers with a dedicated index register, the index
register is involved implicitly in index mode instructions.
Base Register Addressing Mode: In this mode the content of a base register is added to the address part
of the instruction to obtain the effective address. It is similar to the indexed addressing. The difference is
that the base register contains the base address and the address field of the instruction contains the
displacement (offset) relative to this base address. In contrast, in indexing mode, the index register
contains the offset relative to the address part of the instruction.


5.6 Data Transfer and Manipulation


Computers provide an extensive set of instructions to give the user the flexibility to carry out various
computational tasks. The instruction sets of different computers differ from each other mostly in the way
the operands are determined from the address and mode fields. The actual operations available in the
instruction set are not very different from one computer to another. It so happens that the binary code
assignments in the operation code field are different in different computers, even for the same operation.
It may also happen that the symbolic name given to instructions in the assembly language notation is
different in different computers, even for the same instruction.
Most computer instructions can be classified into three categories:

1. Data transfer instructions

2. Data manipulation instructions
3. Program control instructions

Data Transfer Instructions

Data transfer instructions move data from one place in the computer to another without changing the
data content. The most common transfers are between memory and processor registers, between
processor registers and input or output, and between the processor registers themselves. Table 5-5 lists
eight data transfer instructions used in many computers; accompanying each instruction is its mnemonic
symbol.

Table 5-5: Typical Data transfer Instructions


Name Mnemonic
Load LD
Store ST
Move MOV
Exchange XCH
Input IN
Output OUT
Push PUSH
Pop POP

The load instruction has been used mostly to designate a transfer from memory to a processor register,
usually an accumulator. The store instruction designates a transfer from a processor register into
memory. The move instruction has been used in computers with multiple CPU registers to designate a
transfer from one register into another. It has also been used for data transfers between CPU registers and
memory or between two memory words. The exchange instruction swaps information between two
registers or between a register and a memory word. The input and output instructions transfer data
between processor registers and input or output terminals. The push and pop instructions transfer data
between processor registers and the memory stack. Some assembly language conventions modify the
mnemonic symbol to differentiate between the different addressing modes. For example, the mnemonic
for the load immediate becomes LDI. Other assembly language conventions use a special character to
designate the addressing mode. For example, the immediate mode is recognized from a pound sign (#)
placed before the operand. In any case, the important thing to recognize is that each instruction can
occur with a variety of addressing modes. Table 5-6 lists eight addressing modes for the load instruction.

Table 5-6: Eight addressing modes for the load operation


Mode                 Assembly Convention    Register Transfer
Direct Address       LD ADR                 AC ← M[ADR]
Indirect Address     LD @ADR                AC ← M[M[ADR]]
Relative Address     LD $ADR                AC ← M[PC + ADR]
Immediate Operand    LD #NBR                AC ← NBR
Index Addressing     LD ADR(X)              AC ← M[ADR + XR]
Register             LD R1                  AC ← R1
Register Indirect    LD (R1)                AC ← M[R1]
Autoincrement        LD (R1)+               AC ← M[R1], R1 ← R1 + 1
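
To make the differences concrete, the following short C sketch computes the effective address (EA) for four of the modes in Table 5-6. It is only an illustration: the memory contents, program counter value, and index register value are assumptions chosen to make the arithmetic visible, not part of any particular machine.

#include <stdio.h>

int mem[256];              /* hypothetical main memory */
int pc = 10;               /* hypothetical program counter */
int xr = 4;                /* hypothetical index register */

int ea_direct(int adr)   { return adr; }        /* AC <- M[ADR] */
int ea_indirect(int adr) { return mem[adr]; }   /* AC <- M[M[ADR]] */
int ea_relative(int adr) { return pc + adr; }   /* AC <- M[PC + ADR] */
int ea_indexed(int adr)  { return adr + xr; }   /* AC <- M[ADR + XR] */

int main(void) {
    mem[20] = 100;                                 /* a pointer stored at address 20 */
    printf("direct   EA = %d\n", ea_direct(20));   /* 20 */
    printf("indirect EA = %d\n", ea_indirect(20)); /* 100 */
    printf("relative EA = %d\n", ea_relative(20)); /* 30 */
    printf("indexed  EA = %d\n", ea_indexed(20));  /* 24 */
    return 0;
}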

Data Manipulation

Data manipulation instructions perform operations on data and provide the computational capabilities
for the computer. The data manipulation instructions in a typical computer are usually divided into three
basic types.

1. Arithmetic instructions
2. Logical and bit manipulation instructions
3. Shift instructions

Arithmetic Instructions
The four basic arithmetic operations are addition, subtraction, multiplication and division. Most
computers provide instructions for all four operations. Some small computers have only addition and
possibly subtraction instructions. The multiplication and division must then be generated by means of
software subroutines. The four basic arithmetic operations are sufficient for formulating solution to
scientific problems when expressed in terms of numerical analysis methods.

A list of typical arithmetic instructions is given in Table 5-7. The increment instruction adds 1 to the
value stored in a register or memory word. One common characteristic of the increment operations when
executed in processor registers is that a binary number of all 1’s when incremented produces a result of
all 0’s. The decrement instruction subtracts 1 from a value stored in a register or memory word. A
number with all 0’s, when decremented, produces a number with all 1’s. The add, subtract, multiply and
divide instructions may be available for different types of data. The data type assumed to be in processor
registers during the execution of these arithmetic operations is included in the definition of the operation
code.

Table 5-7: Typical arithmetic operations

Name Mnemonic
Increment INC
Decrement DEC
Add ADD
Subtract SUB
Multiply MUL
Divide DIV
Add with Carry ADDC
Subtract with borrow SUBB
Negate(2’s complement) NEG
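
The wraparound behavior of increment and decrement described above can be confirmed with a minimal C check on an 8-bit value:

#include <stdio.h>

int main(void) {
    unsigned char all_ones  = 0xFF;   /* 1111 1111 */
    unsigned char all_zeros = 0x00;   /* 0000 0000 */
    printf("INC 0xFF -> 0x%02X\n", (unsigned char)(all_ones + 1));  /* 0x00 */
    printf("DEC 0x00 -> 0x%02X\n", (unsigned char)(all_zeros - 1)); /* 0xFF */
    return 0;
}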

Logical and Bit Manipulation Instructions


Logical instructions perform binary operations on strings of bits stored in registers. They are useful for
manipulating individual bits or a group of bits that represent binary–coded information. The logical
instructions consider each bit of the operands separately and treat it as a Boolean variable. By proper
application of the logical instructions it is possible to change bit values, to clear a group of bits or to
insert new bit values into the operands stored in registers or memory words. Some logical and bit
manipulation instructions are listed in Table 5-8 as follows.


Table 5-8: Typical logical and bit manipulation instructions


Name Mnemonic
Clear CLR
Complement COM
AND AND
OR OR
Exclusive-OR XOR
Clear Carry CLRC
Set Carry SETC
Complement Carry COMC
Enable Interrupt EI
Disable Interrupt DI

The clear instruction causes the specified operand to be replaced by 0’s. The complement instruction
produces the 1’s complement by inverting all the bits of the operand. The AND, OR and XOR
instructions perform the corresponding logical operations on individual bits of the operands. Although
they perform Boolean operations, when used in computer instructions the logical instructions should be
considered as performing bit manipulation operations. There are three bit manipulations: a selected bit
can be cleared to 0, set to 1, or complemented.

The AND instruction is used to clear a bit or a selected group of bits of an operand. For any Boolean
variable x, the relationships x ∙ 0 = 0 and x ∙ 1 = x dictate that a binary variable ANDed with 0 produces a
0, but the variable does not change in value when ANDed with a 1. Therefore, the AND instruction
can be used to clear bits of an operand selectively by ANDing the operand with another operand that
has 0’s in the bit positions that must be cleared. The AND instruction is also called a mask because it
masks or inserts 0’s in a selected portion of the operand.

The OR instruction is used to set a bit or a selected group of bits of an operand. For any Boolean
variable x, the relationships x + 1 = 1 and x + 0 = x dictate that a binary variable ORed with a 1
produces a 1, but the variable does not change in value when ORed with a 0. Therefore, the OR
instruction can be used to selectively set the bits of an operand by ORing it with another operand with
1’s in the bit positions that must be set to 1.


Similarly, the XOR instruction is used to selectively complement bits of an operand. This is because of
the Boolean relationships x ⊕ 1 = x′ and x ⊕ 0 = x. Thus a binary variable is complemented when
XORed with a 1 but does not change in value when XORed with a 0.
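
A short C illustration of these three bit-manipulation uses on an 8-bit operand (the operand and mask values are arbitrary examples):

#include <stdio.h>

int main(void) {
    unsigned char op = 0xB5;            /* 1011 0101 */
    unsigned char cleared = op & 0xF0;  /* AND mask clears the low 4 bits:  1011 0000 */
    unsigned char set     = op | 0x0F;  /* OR sets the low 4 bits:          1011 1111 */
    unsigned char flipped = op ^ 0xFF;  /* XOR complements all the bits:    0100 1010 */
    printf("%02X %02X %02X\n", cleared, set, flipped);  /* B0 BF 4A */
    return 0;
}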

Shift Instructions
Instructions to shift the content of an operand are quite useful and are often provided in several
variations. Shifts are operations in which the bits of a word are moved to the left or right. The bit shifted
in at the end of the word determines the type of shift used. Shift instructions may specify either logical
shifts, arithmetic shifts, or rotate type operations. In either case the shift may be to the right or to the left.

Table 5-9 lists four types of shift instructions. The logical shift inserts 0 to the end bit position. The end
position is the leftmost bit for shift right and the rightmost bit position for the shift left. Arithmetic shifts
usually conform with the rules for signed 2’s complement. The arithmetic shift right instruction must
preserve the sign bit in the left most position. The sign bit is shifted to the right together with the rest of
the number, but the sign bit itself remains unchanged. This is a shift right operation with the end bit
remaining the same. The arithmetic shift-left instruction inserts 0 to the end position and is identical to
the logical shift-left instruction. For this reason many computers do not provide a distinct arithmetic
shift-left instruction when the logical shift-left instruction is already available.

The rotate instructions produce a circular shift. Bits shifted out at one end of the word are not lost as in
logical shift but are circulated back into the other end. The rotate through carry instruction treats a carry
bit as an extension of the register whose word is being rotated. Thus a rotate–left through carry
instruction transfers the carry bit into the right most bit position of the register, transfers the left most bit
position into the carry, and at the same time shifts the entire register to the left.
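
The following C sketch contrasts the shift types on an 8-bit value. It is a simulation, since C itself has no rotate operator, and the test value is an arbitrary example:

#include <stdio.h>

unsigned char shr(unsigned char x)  { return x >> 1; }                 /* logical: 0 enters at the left    */
unsigned char shra(unsigned char x) { return (x >> 1) | (x & 0x80); }  /* arithmetic: sign bit preserved   */
unsigned char ror(unsigned char x)  { return (x >> 1) | (x << 7); }    /* rotate: bit 0 wraps to the left  */

int main(void) {
    unsigned char v = 0x95;              /* 1001 0101, negative as a signed number */
    printf("SHR  -> 0x%02X\n", shr(v));  /* 0x4A = 0100 1010 */
    printf("SHRA -> 0x%02X\n", shra(v)); /* 0xCA = 1100 1010 */
    printf("ROR  -> 0x%02X\n", ror(v));  /* 0xCA here too, since bit 0 and the sign bit are both 1 */
    return 0;
}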


Some computers have a multiple field format for the shift instructions. One field contains the operation
code and the others specify a type of shift and the number of times that an operand is to be shifted. A
possible instruction code format of a shift instruction may include five fields as follows:

OP REG TYPE RL COUNT

Here OP is the operation code field; REG is a register address that specifies the location of an operand;
TYPE is a 2-bit field specifying the four different types of shifts; RL is a 1-bit field specifying a shift-
right or shift-left; and COUNT is a k-bit field specifying up to 2^k - 1 shifts. With such a format, it is
possible to specify the type of shift, the direction, and the number of shifts all in one instruction.
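
As a hedged illustration, assume a 16-bit shift instruction with field widths OP:6, REG:4, TYPE:2, RL:1, COUNT:3 (so up to 2^3 - 1 = 7 shifts). The widths are hypothetical, chosen only to show how the five fields would be decoded:

#include <stdio.h>

int main(void) {
    unsigned short instr = 0x1A2B;          /* an arbitrary example instruction word */
    unsigned op    = (instr >> 10) & 0x3F;  /* bits 15..10: operation code    */
    unsigned reg   = (instr >> 6)  & 0x0F;  /* bits 9..6:   register address  */
    unsigned type  = (instr >> 4)  & 0x03;  /* bits 5..4:   shift type        */
    unsigned rl    = (instr >> 3)  & 0x01;  /* bit 3:       right/left        */
    unsigned count = instr & 0x07;          /* bits 2..0:   shift count       */
    printf("OP=%u REG=%u TYPE=%u RL=%u COUNT=%u\n", op, reg, type, rl, count);
    return 0;
}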

Table 5-9: Typical shift instructions


Name Mnemonic

Logical Shift-right SHR


Logical Shift-left SHL
Arithmetic Shift-right SHRA
Arithmetic Shift-Left SHLA
Rotate right ROR
Rotate left ROL
Rotate Right through Carry RORC
Rotate Left Through carry ROLC

5.7 Reduced Instruction Set Computer (RISC)


An important aspect of computer architecture is the design of the instruction set for the processor. The
instruction set chosen for a particular computer determines the way that machine language programs are
constructed. Early computers had small and simple instruction sets, forced mainly by the need to
minimize the hardware to implement them. As digital hardware became cheaper with the advent of
integrated circuits, computer instructions tended to increase both in number and complexity. Many
computers have instruction sets that include more than 100 and sometimes even more than 200
instructions. These computers also employ a variety of data types and a large number of addressing
modes. The trend toward computer hardware complexity was influenced by various factors, such as
upgrading existing models to provide more customer applications, adding instructions that facilitate the
translation from high-level languages into machine language programs, and striving to develop machines
that move functions from software implementation into hardware implementation. A computer with a
large number of instructions is classified as a complex instruction set computer, abbreviated as CISC.
In the early 1980’s, a number of computer designers recommended that computers use fewer
instructions with simple constructs so they can be executed much faster within the CPU without having
to use memory as often. This type of computer is classified as a Reduced Instruction Set Computer
(RISC).

CISC characteristics
 large number of instructions – typically from 100 to 250 instructions
 some instructions that perform specialized tasks and are used infrequently
 a large variety of addressing modes – typically from 5 to 20 different modes
 variable length instruction formats
 instructions that manipulate operands in memory

RISC characteristics
 relatively few instructions
 relatively few addressing modes
 memory access limited to load and store instructions
 all operations done within the register of the CPU
 fixed length, easily decoded instruction format
 single cycle instruction execution
 hard-wired rather than micro-programmed control
 a relatively large number of registers in the processor
 use overlapped register windows to speed up procedure call and return
 efficient instruction pipeline
 compiler support for efficient translation of high-level language programs into machine language
programs

5.8 Review Questions


1. A bus organized CPU similar to Figure 5-2 has 16 registers with 32-bits in each, an ALU, and a
destination decoder.


a) How many multiplexers are there in the A bus, and what is the size of each
multiplexer?
b) How many selection inputs are needed for a MUX-A and MUX-B?
c) How many inputs and outputs are there in the decoder?
d) How many inputs and outputs are there in the ALU for data, including inputs and
outputs carries?
e) Formulate your control word for the system assuming that the ALU has 35
operations.
2. A computer has 32-bit instructions and 12-bit addresses. If there are 250 2-address instructions,
how many one-address instructions can be formulated?
3. Give an example of RISC instructions that will perform the following operations
A. Decrement a register C. Negate a register E. Divide a signed number by 4
B. Complement a register D. Clear a register to 0 F. No operation


Chapter 6
Parallel Processing
Objective:
After completing this chapter you will be familiar with:
 The concept of parallel processing
 Pipelining
 Variety of ways through which pipelining processing can be accomplished
 Vector processing
 Array processor

6.1 Introduction to Parallel Processing


A traditional way to increase computer system performance is to use multiple processors that can
execute in parallel to support a given workload. The most common multiple-processor organization is
the symmetric multiprocessor (SMP).

An SMP consists of multiple similar processors within the same computer, interconnected by a bus or some
sort of switching arrangement. Each processor has its own cache, and so it is possible for a given line of
data to be present in more than one cache.

When more than one processor is implemented on a single chip, the configuration is referred to as a
multiprocessor system. This design scheme is used to replicate some of the components of a single
processor so that the processor can execute multiple threads/processes/tasks concurrently.

When this view of the computer is taken down to the micro-operation level, multiple control signals are
generated at the same time. Instruction pipelining, at least to the extent of overlapping fetch and
execute operations, has been around for a long time. Both of these are examples of performing
functions in parallel. This approach is taken further with superscalar organization, which exploits
instruction-level parallelism. With a superscalar machine, there are multiple execution units within a
single processor, and these may execute multiple instructions from the same program in parallel. As
computer technology has evolved, and as the cost of computer hardware has dropped, computer
designers have sought more and more opportunities for parallelism.


Parallel processing/computing uses multiple processing elements simultaneously to solve a problem.


This is accomplished by breaking the problem/task into independent parts so that each processing
element can execute its part of the algorithm simultaneously with the others. The processing elements
can be diverse and include resources such as a single computer with multiple processors, several
networked computers, specialized hardware, or any combination of the above.

Parallel processing or parallel computing is a form of computation in which many calculations are
carried out simultaneously, operating on the principle that large problems can often be divided into
smaller ones, which are then solved concurrently ("in parallel").

The following are important points about parallel processing:


 A parallel processing system is able to perform concurrent data processing to achieve faster
execution time
 The system may have two or more ALUs and be able to execute two or more instructions at the
same time
 Also, the system may have two or more processors operating concurrently
 Goal is to increase the throughput – the amount of processing that can be accomplished during a
given interval of time
 A multifunctional organization is usually associated with a complex control unit to coordinate all
the activities among the various components.

Figure 6-1 shows one possible way of separating the execution unit into eight functional units operating
in parallel. The operands in the registers are applied to one of the units depending on the operation
specified by the instruction associated with the operands. The operation performed in each functional
unit is indicated in each block of the diagram. The adder and integer multiplier perform the arithmetic
operations with integer numbers. The floating point operations are separated into three circuits operating
in parallel.

 Parallel processing can be classified according to:


o The internal organization of the processors
o The interconnection structure between processors
o The flow of information through the system
o The number of instructions and data items that are manipulated simultaneously
 The sequence of instructions read from memory is the instruction stream
 The operations performed on the data in the processor is the data stream
 Parallel processing may occur in the instruction stream, the data stream, or both


[Figure: the processor registers connect to memory and feed eight functional units operating in
parallel: an adder-subtractor, integer multiply, logic unit, shift unit, incrementer, floating point
add-subtract, floating point multiply, and floating point divide.]

Figure 6-1: Processor with multiple functional units

 Computer can be classified as:


o Single instruction stream, single data stream – SISD
o Single instruction stream, multiple data stream – SIMD
o Multiple instruction stream, single data stream – MISD
o Multiple instruction stream, multiple data stream – MIMD
 SISD – Instructions are executed sequentially. Parallel processing may be achieved by means of
multiple functional units or by pipeline processing
 SIMD – Includes multiple processing units with a single control unit. All processors receive the
same instruction, but operate on different data.
 MISD – The same data stream flows through a linear array of processors executing different
instruction streams.
 MIMD – A computer system capable of processing several programs at the same time.
 We will consider parallel processing under the following main topics:

o Pipeline processing
o Instruction Pipeline
o Arithmetic pipeline
o RISC pipeline
o Vector processing
o Array processors

6.2 Pipelining
Pipelining is an implementation technique where multiple instructions are overlapped in execution. The
computer pipeline is divided into stages. Each stage completes a part of an instruction in parallel. The
stages are connected one to the next to form a pipe: instructions enter at one end, progress through the
stages, and exit at the other end.

Pipelining refers to the technique in which a given task is divided into a number of subtasks that need to
be performed in sequence. Each subtask is performed by a given functional unit. The units are connected
in a serial fashion and all of them operate simultaneously. The use of pipelining improves the
performance compared to the traditional sequential execution of tasks.

In general, pipelining is a technique of decomposing a sequential process into suboperations, with each
subprocess being executed in a special dedicated segment that operates concurrently with all other
segments. Each segment performs partial processing dictated by the way the task is partitioned. The result
obtained from the computation in each segment is transferred to the next segment in the pipeline. The
final result is obtained after the data have passed through all segments.

Example: Suppose that we want to perform the combined multiply and add operations with a stream of
numbers. In Figure 6-2, R1 through R5 are registers that receive new data with every clock pulse.
Ai * Bi + Ci , for i = 1, 2, 3, ..., 7
The suboperations performed in each segment are:
R1 ← Ai , R2 ← Bi      Input Ai and Bi
R3 ← R1 * R2, R4 ← Ci      Multiply and input Ci
R5 ← R3 + R4      Add Ci to product


[Figure: Ai and Bi are loaded into R1 and R2 and feed a multiplier; the product goes to R3 while Ci is
loaded into R4; R3 and R4 feed an adder whose sum is stored in R5.]

Figure 6-2: Example of pipeline processing

Table 6-1 shows the effect of each clock pulse. The first clock pulse transfers A1 and B1 into R1 and R2. The
second clock pulse transfers the product of R1 and R2 into R3 and C1 into R4. The same clock pulse
transfers A2 and B2 into R1 and R2. The third clock pulse operates on all three segments simultaneously:
it places A3 and B3 into R1 and R2, transfers the product of R1 and R2 into R3, transfers C2 into R4, and
places the sum of R3 and R4 into R5.

Table 6-1: Content of registers in pipeline example


Clock Pulse   Segment 1       Segment 2         Segment 3
Number        R1    R2        R3       R4       R5
1             A1    B1        -        -        -
2             A2    B2        A1*B1    C1       -
3             A3    B3        A2*B2    C2       A1*B1 + C1
4             A4    B4        A3*B3    C3       A2*B2 + C2
5             A5    B5        A4*B4    C4       A3*B3 + C3
6             A6    B6        A5*B5    C5       A4*B4 + C4
7             A7    B7        A6*B6    C6       A5*B5 + C5
8             -     -         A7*B7    C7       A6*B6 + C6
9             -     -         -        -        A7*B7 + C7
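
The clock-by-clock behavior of Table 6-1 can be reproduced with a small C simulation. Updating R5 before R3/R4, and R3/R4 before R1/R2, mimics all the registers loading simultaneously on each pulse; the array values are arbitrary examples:

#include <stdio.h>

int main(void) {
    int A[7] = {1, 2, 3, 4, 5, 6, 7};
    int B[7] = {2, 2, 2, 2, 2, 2, 2};
    int C[7] = {5, 5, 5, 5, 5, 5, 5};
    int R1 = 0, R2 = 0, R3 = 0, R4 = 0, R5 = 0;
    for (int clock = 1; clock <= 9; clock++) {
        R5 = R3 + R4;                  /* segment 3: add Ci to the product */
        if (clock >= 2 && clock <= 8) {
            R3 = R1 * R2;              /* segment 2: multiply */
            R4 = C[clock - 2];         /* segment 2: input Ci */
        }
        if (clock <= 7) {
            R1 = A[clock - 1];         /* segment 1: input Ai */
            R2 = B[clock - 1];         /* segment 1: input Bi */
        }
        if (clock >= 3)                /* the first sum appears at pulse 3 */
            printf("pulse %d: R5 = %d\n", clock, R5);
    }
    return 0;
}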


Any operation that can be decomposed into a sequence of suboperations of about the same complexity
can be implemented by a pipeline processor. The technique is efficient for those applications that need to
repeat the same task many times with different sets of data. A task is the total operation performed going
through all segments of a pipeline.

6.2.1 Arithmetic Pipeline

Pipeline organization is applicable for arithmetic operations and fetching instructions. The principles
used in instruction pipelining can be used in order to improve the performance of computers in
performing arithmetic operations such as add, subtract, and multiply. In this case, these principles will
be used to realize the arithmetic circuits inside the ALU. In this section, we will elaborate on the use of
arithmetic pipeline as a means to speed up arithmetic operations. We will start with fixed-point
arithmetic operations and then discuss floating-point operations.

Pipeline arithmetic units are usually found in very high speed computers. They are used to implement
floating-point operations, multiplication of fixed-point numbers, and similar computations encountered
in scientific problems. An example for floating-point addition and subtraction is shown below. The inputs
are two normalized floating-point binary numbers X and Y, where X = A × 2^a and Y = B × 2^b.

 A and B are two fractions that represent the mantissas


 a and b are the exponents
 Four segments are used to perform the following suboperations:
1. Compare the exponents
2. Align the mantissas
3. Add or subtract the mantissas
4. Normalize the result


[Figure: the exponents a, b and mantissas A, B pass through registers (R) between segments.
Segment 1: compare exponents by subtraction; Segment 2: choose exponent and align mantissas;
Segment 3: add or subtract mantissas; Segment 4: adjust exponent and normalize result.]

Figure 6-3: Pipeline for floating point addition and subtraction


The following numerical example may clarify the above suboperations that add two floating point
numbers X and Y.

 X = 0.9504 × 10^3 and Y = 0.8200 × 10^2
 The two exponents are subtracted in the first segment to obtain 3 - 2 = 1
 The larger exponent 3 is chosen as the exponent of the result.
 Segment 2 shifts the mantissa of Y to the right to obtain Y = 0.0820 × 10^3
 The mantissas are now aligned
 Segment 3 produces the sum Z = 1.0324 × 10^3


 Segment 4 normalizes the result by shifting the mantissa once to the right and incrementing the
exponent by one to obtain Z = 0.10324 × 10^4
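
A hedged C sketch of the four suboperations on this example, representing each number as a (mantissa, exponent) pair in decimal. It assumes the first operand has the larger exponent, which holds here:

#include <stdio.h>

int main(void) {
    double ma = 0.9504; int ea = 3;   /* X = 0.9504 x 10^3 */
    double mb = 0.8200; int eb = 2;   /* Y = 0.8200 x 10^2 */

    int diff = ea - eb;               /* segment 1: compare exponents -> 1 */
    for (int i = 0; i < diff; i++)    /* segment 2: align mantissas */
        mb /= 10.0;                   /* Y becomes 0.0820 x 10^3 */
    int ez = ea;                      /* the larger exponent is chosen */
    double mz = ma + mb;              /* segment 3: add mantissas -> 1.0324 */
    if (mz >= 1.0) {                  /* segment 4: normalize */
        mz /= 10.0;
        ez += 1;                      /* Z = 0.10324 x 10^4 */
    }
    printf("Z = %.5f x 10^%d\n", mz, ez);
    return 0;
}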

6.2.2 Instruction Pipeline

Pipeline processing can occur not only in the data stream but also in the instruction stream. An
instruction pipeline reads consecutive instructions from memory while previous instructions are being
executed in other segments. This causes the instruction fetch and execute phases to overlap and perform
simultaneous operations.

Consider a computer with an instruction fetch unit and an instruction execution unit forming a two-
segment pipeline. A FIFO (First In First Out) buffer can be used for the fetch segment. Thus, an
instruction stream can be placed in a queue, waiting for decoding and processing by the execution
segment. This reduces the average access time to memory for reading instructions. Whenever there is
space in the buffer, the control unit initiates the next instruction fetch phase. The following steps are
needed to process each instruction:
1. Fetch the instruction from memory
2. Decode the instruction
3. Calculate the effective address
4. Fetch the operands from memory
5. Execute the instruction
6. Store the result in the proper place

Example: Four-segment instruction pipeline

Assume that most of the instructions store the result in a register so that the execution and storing of the
result can be combined in one segment. This reduces the instruction pipeline into four segments. Figure
6-4 shows how the instruction cycle in the CPU can be processed with a four segment pipeline. While an
instruction is being executed in segment 4, the next instruction in sequence is busy fetching an operand
from memory in segment 3. Up to four suboperations in the instruction cycle can overlap and up to four
different instructions can be in progress of being processed at the same time.

Figure 6-4: Four-segment CPU pipeline
[Flowchart: Segment 1 fetches the instruction from memory; Segment 2 decodes the instruction and
calculates the effective address, then tests for a branch; Segment 3 fetches the operand from memory;
Segment 4 executes the instruction. If an interrupt is pending it is handled, the PC is updated, and the
pipe is emptied before fetching continues.]

There are three major difficulties that cause an instruction pipeline to deviate from its normal operation:
1. Resource conflicts caused by access to memory by two segments at the same time.
2. Data dependency conflicts arise when an instruction depends on the result of a previous
instruction, but this result is not yet available.
3. Branch difficulties arise from program control instructions that may change the value of the PC.

Methods to handle data dependency:
 Hardware interlocks are circuits that detect instructions whose source operands are destinations of
prior instructions. Detection causes the hardware to insert the required delays without altering the
program sequence.
 Operand forwarding uses special hardware to detect a conflict and then avoid it by routing the
data through special paths between pipeline segments. This requires additional hardware paths
through multiplexers as well as the circuit to detect the conflict.
 Delayed load is a procedure that gives the responsibility for solving data conflicts to the compiler.
The compiler is designed to detect a data conflict and reorder the instructions as necessary to delay
the loading of the conflicting data by inserting no-operation instructions.
Methods to handle branch instructions:
 Prefetching the target instruction in addition to the next instruction allows either instruction to
be available.
 A branch target buffer is an associative memory included in the fetch segment of the pipeline
that stores the target instruction for a previously executed branch. It also stores the
next few instructions after the branch target instruction. This way, branch instructions that
have occurred previously are readily available in the pipeline without interruption.
 Branch prediction uses some additional logic to guess the outcome of a conditional branch
instruction before it is executed. The pipeline then begins prefetching instructions from the
predicted path.
 Delayed branch is used in most RISC processors so that the compiler rearranges the instructions
to delay the branch.


6.2.3 RISC Pipelines

A RISC (Reduced Instruction Set Computer) processor pipeline operates in much the same way,
although the stages in the pipeline are different. While different processors have different numbers of
steps, they basically use variations of these five steps:
1. fetch instructions from memory
2. read registers and decode the instruction
3. execute the instruction or calculate an address
4. access an operand in data memory
5. write the result into a register

The length of the pipeline is dependent on the length of the longest step. Because RISC instructions are
simpler than those used in pre-RISC processors (now called CISC, or Complex Instruction Set
Computer), they are more conducive to pipelining. While CISC instructions varied in length, RISC
instructions are all the same length and can be fetched in a single operation. Ideally, each of the stages in
a RISC processor pipeline should take 1 clock cycle so that the processor finishes an instruction each
clock cycle and averages one cycle per instruction (CPI).

Among the characteristics attributed to RISC is its ability to use an efficient instruction pipeline. The
simplicity of the instruction set can be utilized to implement an instruction pipeline using a small number
of suboperations, with each being executed in one clock cycle.

6.3 Vector Processing
The part of a computer that allows it to function, carrying out the instructions of various programs, is the
central processing unit (CPU). The CPU, also called a processor, receives a program's instructions;
decodes those instructions, breaking them into individual parts; executes those instructions; and reports
the results, writing them back into memory. The format for that processor comes in one of two primary
types: vector and scalar. The difference between the two is that scalar processors operate on only one
data point at a time, while vector processors operate on an array of data.

Many scientific problems require arithmetic operations on large arrays of numbers. These numbers are
usually formulated as vectors and matrices of floating point numbers. A vector is an ordered, one-
dimensional array of data items. A vector V of length n is represented as a row vector by V = [V1, V2,
V3, ..., Vn]. It may be represented as a column vector if the data items are listed in a column.

A computer capable of vector processing eliminates the overhead associated with the time it takes to
fetch and execute the instructions in the program. Vector processing allows operations to be specified
with a single vector instruction of the form:
C (1:100) = A (1:100) + B (1:100)
This vector instruction includes the initial address of the operands, the length of the vectors, and the
operation to be performed, all in one composite instruction. The addition is done with a pipelined
floating point adder. Matrix multiplication is one of the most computationally intensive operations
performed in computers with vector processors.

In general, vector processors operate on an array of data points. This means that rather than handling
each item individually, multiple items that all have the same instruction can be handled at once. This can
save time over scalar processing, but also adds complexity to a system, which can slow other functions.
Vector processing works best when there is a large amount of data to be processed, groups of which can
be handled by one instruction.
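
For comparison, this is the scalar loop a conventional processor would execute for the single vector instruction shown above; a vector processor replaces the whole loop with one composite instruction:

#include <stddef.h>

/* C(1:100) = A(1:100) + B(1:100), element by element, when n = 100 */
void vector_add(const double *a, const double *b, double *c, size_t n) {
    for (size_t i = 0; i < n; i++)   /* one scalar add (plus loop overhead) per element */
        c[i] = a[i] + b[i];
}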

6.4 Array Processor
An array processor is a CPU design that is able to run mathematical operations on multiple data elements
simultaneously. Array processors were common in the scientific computing area, where they formed the
basis of most supercomputers. Today almost all commodity CPU designs include some array processing
instructions, typically known as SIMD.

An array processor is a processor that performs computations on large arrays of data. The term is used to
refer to two types of processors. An attached array processor is an auxiliary processor attached to a general
purpose computer. It is intended to improve the performance of the host computer in specific numerical
computation tasks. An SIMD array processor is a processor that has a Single Instruction Multiple Data
organization. It manipulates vector instructions by means of multiple functional units responding to a
common instruction.

6.5 Review Questions


1. What is the main goal of parallel processing?
2. What is pipelining? Discuss the various pipelining techniques.
3. What are the reasons that prevent pipelining from achieving the maximum point of parallelism?


4. Describe vector processing and array processor.


Chapter 7
Input/Output Organization
Objectives:
By the end of this chapter, you should be able to:
 Discuss the classification of external devices
 Discuss I/O problems and functions of I/O modules
 Explain the concept of programmed I/O.
 Explain the various I/O instructions
 Explain the concept of Interrupt driven I/O.
 Explain the DMA function, operation and configurations

7.1 External Devices
The input/output subsystem of a computer, referred to as I/O, provides an efficient mode of
communication between the central system and the outside environment. Programs and data must be
entered into computer memory for processing and results obtained from computations must be recorded
or displayed for users. A computer serves no useful purpose without the ability to receive information
from an outside source and to transmit results in a meaningful form.

I/O operations are accomplished through a wide assortment of external devices that provide a means of
exchanging data between the external environment and the computer. An external device attaches to the
computer by a link to an I/O module, as shown in Figure 7-1. The link is used to exchange control,
status, and data between the I/O module and the external device. An external device connected to an I/O
module (also called an interface) is often referred to as a peripheral device or simply a peripheral.

Classification of External devices

External devices broadly can be classified into three categories:


1. Human readable: suitable for communicating with the computer user. Examples: screen,
keyboard, video display terminals (VDT) and printers.
2. Machine readable: suitable for communicating with equipment. Examples: magnetic disk and
tape systems, monitoring and control equipment, and the sensors and actuators used in robotics.
3. Communication: these devices allow a computer to exchange data with remote devices, which
may be machine readable or human readable. Examples: modem, Network Interface Card (NIC)


[Figure: the processor drives an I/O bus carrying data, address, and control lines; interface units (I/O
modules) on the bus connect peripherals such as a keyboard and display terminal, a printer, a magnetic
disk, and a magnetic tape.]

Figure 7-1: Connection of CPU, I/O module and peripheral devices

Input/output Problems

 Wide variety of peripherals


 Delivering different amounts of data per second
 Work at different speeds
 Send/receive data in different formats
 All slower than CPU and RAM

Hence I/O modules are used as a solution.

7.2 Input/output Module


It is the entity within a computer that is responsible for the control of one or more external devices and
for the exchange of data between those devices and main memory and/or the CPU. Thus an I/O module is an:
 Interface to CPU and memory
 Interface to one or more peripherals

I/O Module Function

The major functions or requirements for an I/O module fall into the following five categories:
 Control & Timing
 CPU Communication
 Device Communication
 Data Buffering
 Error Detection


During any period of time, the CPU may communicate with one or more external devices in
unpredictable patterns, depending on the program’s need for I/O. The internal resources, main memory
and the CPU, must be shared among a number of activities, including handling data I/O. Thus the I/O
module includes a control and timing requirement to coordinate the flow of traffic between internal
resources and external devices. The CPU might be involved in a sequence of operations like:
 CPU checks I/O module device status
 I/O module returns device status
 If ready, CPU requests data transfer
 I/O module gets data from device
 I/O module transfers data to CPU
 Variations for output, DMA, etc.

The I/O module must have the capability to engage in communication with the CPU and the external device.
CPU communication involves:
 Command decoding: The I/O module accepts commands from the CPU carried on the control
bus.
 Data: data are exchanged between the CPU and the I/O module over data bus
 Status reporting: Because peripherals are slow it is important to know the status of I/O device.
I/O module can report with the status signals common used status signals are BUSY or READY.
Various other status signals may be used to report various error conditions.
 Address recognition: just as each memory word has an address, there is an address associated with
every I/O device. Thus the I/O module must recognize a unique address for each
peripheral it controls.

The I/O module must also be able to perform device communication. This communication involves
commands, status information, and data. Some of the essential tasks are listed below:
 Error detection: I/O module is often responsible for error detection and subsequently reporting
errors to the CPU.
 Data buffering: the transfer rate into and out of main memory or CPU is quite high, and the rate
is much lower for most of the peripherals. The data is buffered in the I/O module and then sent to
the peripheral device at its rate. In the opposite direction data are buffered so as not to tie up the
memory in a slow transfer operation. Thus I/O module must be able to operate at both device and
memory speeds.


7.3 Input/Output Techniques (Data Transfer Modes)


Three techniques are possible for I/O operations or data transfer mode. They are:
 Programmed I/O

 Interrupt driven
 Direct Memory Access (DMA)

7.3.1 Programmed I/O

With programmed I/O, data are exchanged between the CPU and the I/O module. The CPU executes a
program that gives it direct control of the I/O operation, including sensing device status, sending a read
or write command, and transferring data. When the CPU issues a command to the I/O module, it must wait
until the I/O operation is complete. If the CPU is faster than the I/O module, this wastes CPU time.

The I/O module does not take any further action to alert the CPU; that is, it does not interrupt the CPU.
Hence it is the responsibility of the CPU to periodically check the status of the I/O module until it finds
that the operation is complete.

The sequences of actions that take place with programmed I/O are:
 CPU requests I/O operation

 I/O module performs operation


 I/O module sets status bits
 CPU checks status bits periodically
 I/O module does not inform CPU directly
 I/O module does not interrupt CPU
 CPU may wait or come back later

I/O commands

To execute an I/O related instruction, the CPU issues an address, specifying the particular I/O module
and external device and an I/O command. Four types of I/O commands can be received by the I/O
module when it is addressed by the CPU. They are
 A control command: is used to activate a peripheral and tell it what to do. Example: a magnetic
tape may be directed to rewind or move forward a record.


 A test command: is used to test various status conditions associated with an I/O module and its
peripherals. The CPU wants to know whether the peripheral of interest is ready for use, whether the
most recent I/O operation has completed, and if any errors have occurred.
 A read command: it causes the I/O module to obtain an item of data from the peripheral and
place it in an internal buffer. The CPU then gets the data items by requesting I/O module to place
it on the data bus.
 A write command: it causes the I/O module to take an item of data from the data bus and
subsequently transmit the data item to the peripheral.

I/O Mapping

When the CPU, main memory, and I/O module share a common bus, two modes of addressing are
possible.

1. Memory mapped I/O


 Devices and memory share an address space
 I/O looks just like memory read/write (see the sketch after this list)
 No special commands for I/O
 Large selection of memory access commands available

2. Isolated I/O
 Separate address spaces
 Need I/O or memory select lines
 Special commands for I/O
 Limited set
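
A minimal C sketch of the memory-mapped case: a device’s status and data registers simply occupy addresses in the one address space, so ordinary load/store code reaches them. The address 0x4000 and the register layout are hypothetical assumptions, and the polling loop matches the programmed-I/O behavior described above:

#include <stdint.h>

#define DEV_STATUS (*(volatile uint8_t *)0x4000u)  /* hypothetical status register */
#define DEV_DATA   (*(volatile uint8_t *)0x4001u)  /* hypothetical data register   */
#define READY      0x01u                           /* assumed "device ready" bit   */

uint8_t read_byte(void) {
    while (!(DEV_STATUS & READY)) { }  /* poll the status, as in programmed I/O */
    return DEV_DATA;                   /* an ordinary memory read reaches the device */
}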

7.3.2 Interrupt Driven I/O

Using program-controlled I/O requires continuous involvement of the processor in the I/O activities. It
is desirable to avoid wasting processor execution time. An alternative is for the CPU to issue an I/O
command to a module and then go on with other work. The I/O module will then interrupt the CPU,
requesting service, when it is ready to exchange data with the CPU. The CPU will then execute the data
transfer and resume its former processing. Based on the use of interrupts, this technique improves
the utilization of the processor.


With Interrupt driven I/O, the CPU issues a command to I/O module and it does not wait until I/O
operation is complete but instead continues to execute other instructions. When I/O module has
completed its work it interrupts the CPU.

An interrupt is more than a simple mechanism for coordinating I/O transfers. In a general sense,
interrupts enable transfer of control from one program to another to be initiated by an event that is
external to a computer. Execution of the interrupted program resumes after completion of execution of
the interrupt service routine. The concept of interrupts is useful in operating systems and in many
control applications where processing of certain routines has to be accurately timed relative to the
external events.

Using the interrupt-driven I/O technique, the CPU issues a read command and the I/O module gets data
from the peripheral while the CPU does other work. When ready, the I/O module interrupts the CPU; the
CPU checks the status, and if there is no error (that is, the device is ready), the CPU requests the data and
the I/O module transfers it. The CPU then reads the data and stores it in main memory.

Basic concepts of an Interrupt

An interrupt is an exception condition in a computer system caused by an event external to the CPU.
Interrupts are commonly used in I/O operations by a device interface (or controller) to notify the CPU
that it has completed an I/O operation.

An interrupt is indicated by a signal sent by the device interface to the CPU via an interrupt request line
(on an external bus). This signal notifies the CPU that the signaling interface needs to be serviced. The
signal is held until the CPU acknowledges or otherwise services the interface from which the interrupt
originated.

Response of CPU to an Interrupt

The CPU checks periodically to determine if an interrupt signal is pending. This check is usually done at
the end of each instruction, although some modern machines allow interrupts to be checked
several times during the execution of very long instructions.

When the CPU detects an interrupt, it then saves its current state (at least the PC and the Processor
Status Register containing condition codes); this state information is usually saved in memory. After the
interrupt has been serviced, this state information is restored in the CPU and the previously executing
software resumes execution as if nothing had happened.


Figure 7-2: Connections between devices and the interrupt controller actually use the bus lines instead of
dedicated wires

7.3.3 Direct Memory Access (DMA)


 Interrupt driven and programmed I/O require active CPU intervention
 Transfer rate is limited
 CPU is tied up
 DMA is the solution for these problems

Direct Memory Access is a capability provided by some computer bus architectures that allows data to be
sent directly from an attached device (such as a disk drive) to the memory on the computer’s
motherboard. The microprocessor (CPU) is freed from involvement with the data transfer, thus speeding
up overall computer operation.

Consider an example of input of a block of data using DMA access. The flow chart is given in Figure 7-3.


Figure 7-3: Input block of data using DMA access

When the CPU wishes to read or write a block of data, it issues a command to the DMA module and
gives following information:

CPU tells DMA controller:


 Whether to read or write
 Device address
 Starting address of memory block for data
 Amount of data to be transferred
The CPU carries on with other work.
 Thus the DMA controller takes over ("steals") the I/O work from the CPU.
 The DMA module transfers the entire block of data,
 One word at a time, directly to or from memory, without going through CPU.
When the transfer is complete
 DMA controller sends interrupt when finished
Thus CPU is involved only at the beginning and at the end of the transfer.

DMA Configurations

The DMA mechanism can be configured in variety of ways. Some of the common configurations are
discussed here.


1. Single Bus Detached DMA

In this configuration all modules share the same system bus. The block diagram of single bus detached
DMA is shown in Figure 7-4. The DMA module, mimicking the CPU, uses programmed I/O
to exchange data between memory and the I/O module through the DMA module. This scheme
may be inexpensive but is clearly inefficient. The features of this configuration are:
 Single Bus, Detached DMA controller

 Each transfer uses bus twice


 I/O to DMA then DMA to memory
 CPU is suspended twice

Figure 7-4: Single bus detached DMA
[Figure: the CPU, DMA controller, I/O devices, and main memory all attach to a single system bus.]

2. Single Bus, integrated DMA

Here, there is a path between the DMA module and one or more I/O modules that does not include the
system bus. The block diagram of single bus integrated DMA is shown in Figure 7-5. The DMA logic can
actually be considered as a part of an I/O module, or there may be a separate module that controls one or
more I/O modules.
Figure 7-5: Block diagram of single bus, integrated DMA
[Figure: the CPU, two DMA controllers, and main memory share the system bus; each DMA controller
connects directly to one or more I/O devices off the bus.]

The features of this configuration can be considered as:


 Single Bus, Integrated DMA controller
 Controller may support >1 device
 Each transfer uses the bus once

 DMA to memory
 CPU is suspended once

3. DMA using an I/O bus

One step further in the concept of integrated DMA is to connect the I/O modules to the DMA controller
using a separate bus called the I/O bus. This reduces the number of I/O interfaces in the DMA module to
one and provides for an easily expandable configuration. The block diagram of DMA using an I/O bus is
shown in Figure 7-6. Here the system bus that the DMA shares with the CPU and main memory is used by
the DMA module only to exchange data with memory. The exchange of data between the DMA module
and the I/O modules takes place off the system bus, through the I/O bus.
Figure 7-6: Diagram of DMA using I/O bus
[Figure: the CPU, DMA controller, and main memory share the system bus; the DMA controller also
drives a separate I/O bus to which all the I/O devices attach.]

The features of this configuration are:


 Separate I/O Bus
 Bus supports all DMA enabled devices
 Each transfer uses bus once
 DMA to memory
 CPU is suspended once

With both programmed I/O and interrupt-driven I/O the CPU is responsible for extracting data from
main memory for output and storing data in main memory for input. Table 7-1 indicates the relationship
among the three techniques.

Table 7-1: I/O techniques


                                          No interrupts       Using interrupts
I/O-to-memory transfer through processor  Programmed I/O      Interrupt-driven I/O
Direct I/O-to-memory transfer             -                   Direct Memory Access (DMA)


Advantages of DMA

DMA has several advantages over polling and interrupts. DMA is fast because a dedicated piece of
hardware transfers data from one computer location to another and only one or two bus read/write cycles
are required per piece of data transferred. In addition, DMA is usually required to achieve maximum
data transfer speed, and thus is useful for high speed data acquisition devices. DMA also minimizes
latency in servicing a data acquisition device because the dedicated hardware responds more quickly
than interrupts and transfer time is short. Minimizing latency reduces the amount of temporary storage
(memory) required on an I/O device. DMA also off-loads the processor, which means the processor does
not have to execute any instructions to transfer data. Therefore, the processor is not used for handling
the data transfer activity and is available for other processing activity. Also, in systems where the
processor primarily operates out of its cache, data transfer is actually occurring in parallel, thus
increasing overall system utilization.

7.4 Review Questions


1. Discuss the different sequences of operations the CPU gets involved in
2. List the typical data rates of Telephone channel, mouse, PCI Bus and USB
3. Discuss the working of I/O technique that do not require interrupts
4. Write a note on Memory mapped I/O and Isolated I/O
5. Discuss the working principle of an interrupt
6. Explain how CPU responses to an interrupt
7. Explain the different configuration of DMA


Chapter 8
Memory Organization
Objectives:
By the end of this chapter, you should be able to:
 Define the different types of memory/storage.
 Explain the various accessing methods
 Explain memory hierarchy.
 With a neat diagram, explain the cache memory. Also explain its interaction with the processor.
 Explain the cache read operation

8.1 Characteristics of Memory Systems


Memory is the unit used for the storage and retrieval of data and instructions. A typical computer system is
equipped with a hierarchy of memory subsystems, some internal to the system and some external.
Internal memory systems are accessible by the CPU directly and external memory systems are
accessible by the CPU through an I/O module.

Memory systems are classified according to their key characteristics. The most important are listed
below:

Location

The classification of memory is done according to the location of the memory as:
 Registers: The CPU requires its own local memory in the form of registers, and the control unit
also requires local memory that is fast to access.
 Internal (main): is often equated with the main memory (RAM)
 External (secondary): consists of peripheral storage devices like hard disks, magnetic tapes, etc.

Capacity

Storage capacity is one of the important aspects of memory. It is measured in bytes. Since the
capacity of a typical memory is very large, the prefixes kilo (K), mega (M), and giga (G) are used. A
kilobyte is 2^10 bytes, a megabyte is 2^20 bytes, and a gigabyte is 2^30 bytes.


Unit of Transfer
Unit of transfer for internal memory is equal to the number of data lines into and out of memory module.
 Word: For internal memory, unit of transfer is equal to the number of data lines into and out of
the memory module.
 Block: For external memory, data are often transferred in much larger units than a word, and
these are referred to as blocks.

Access Method
 Sequential: Tape units have sequential access. Data are generally stored in units called
“records”. Data are accessed sequentially; records may be passed (or rejected) until the record
being searched for is found.
 Random: Each addressable location in memory has a unique addressing mechanism. The time to
access a given location is independent of the sequence of prior accesses and constant. Any
location can be selected at random and directly addressed and accessed. Main memory and cache
systems are random access.

Performance
 Access time: For random-access memory, this is the time it takes to perform a read or write
operation: that is, the time from the instant that an address is presented to the memory to the
instant that data have been stored or made available for use. For non-random-access memory,
access time is the time it takes to position the read-write mechanism at the desired location.
 Transfer rate: This is the rate at which data can be transferred into or out of a memory unit.

Physical Type
 Semiconductor: Main memory, cache. RAM, ROM.
 Magnetic: Magnetic disks (hard disks), magnetic tape units.
 Optical: CD, DVD.

Physical Characteristics
 Volatile/nonvolatile: In a volatile memory, information decays naturally or is lost when
electrical power is switched off.
 Erasable/nonerasable: Nonerasable memory cannot be altered (except by destroying the storage
unit). ROMs are nonerasable.


8.2 Memory Hierarchy


A computer system is equipped with a hierarchy of memory subsystems. There are several memory
types with very different physical properties. The important characteristics of memory devices are cost
per bit, access time, data transfer rate, alterability, and compatibility with processor technologies. Figure
8-1 shows the memory hierarchy of a typical system, with the trends in access time, amount of storage,
and cost per byte.

Figure 8-1: Memory hierarchy

Design constraints: How much? How fast? How expensive?


 Faster access time, greater cost per bit
 Greater capacity, smaller cost per bit,
 Greater capacity, slower access time.

8.3 Types of Storage devices


8.3.1 Main Memory

The main memory (RAM) stores data and instructions. Main memories are usually built from dynamic
ICs known as dynamic RAMs. Semiconductor ICs can also implement static memories, referred to as
static RAMs (SRAMs). SRAMs are faster, but their cost per bit is higher. They are often used to build
caches.

Types of Random-Access Semiconductor Memory

Semiconductor memories in which the storage cells are small transistor circuits are used for high speed
CPU registers. Single chip RAMs can be manufactured in sizes ranging from a few hundred bits to 1 GB
or more. Semiconductor memories fall into two categories, SRAMs (static RAMs) and DRAMs
(dynamic RAMs). Both bipolar transistors and MOS (Metal Oxide Semiconductor) transistors are used
for designing RAMs, but MOS is the dominant technology for designing large RAMs.

A semiconductor memory constructed using bipolar or MOS transistors stores information in the form
of flip-flop voltage levels. These voltage levels are not likely to be discharged. Such memories are
called static memories. In these memories information remains constant for longer periods of time. A
semiconductor memory designed using a MOS transistor with a capacitor stores the information in the
form of charge on a capacitor. The stored charge has a tendency to leak away. A stored ‘1’ will become
‘0’ if no precautions are taken. These types of memory are called dynamic memories.

Dynamic versus Static RAM


 Dynamic RAM: stores each bit as charge on a capacitor. It requires periodic refreshing.

 Static RAM: stores each bit in flip-flop logic gates. Applying power is enough (no need for
refreshing).

 Dynamic RAM is simpler and hence smaller than static RAM. Therefore it is denser and less
expensive. But it requires supporting refresh circuitry.

 Static RAMs are faster than dynamic RAMs.

8.3.2 Types of ROM


ROM: The data is actually wired in at the factory during fabrication. The data can never be altered.
PROM: Programmable ROM. It can only be programmed once after its fabrication. It requires a special
device to program.
EPROM: Erasable Programmable ROM. It can be programmed multiple times. The whole capacity needs
to be erased by ultraviolet radiation before a new programming activity. It cannot be partially programmed.
EEPROM: Electrically Erasable Programmable ROM. Erased and programmed electrically. It can be
partially programmed. A write operation takes considerably longer than a read operation.

Connection of RAM and CPU


The basic element of semiconductor memory is the memory cell. All semiconductor memory cells have
certain properties:
 Have two stable states that represent binary 0 and 1.
 Capable of being written into (at least once), to set the state.
 Capable of being read to sense the state.


From the system standpoint, the memory unit can be viewed as a “black box”. Data transfer between the
main memory and the CPU registers takes place through two registers, namely the MAR (memory
address register) and the MDR (memory data register). If the MAR is k bits long and the MDR is n bits
long, the main memory unit can contain up to 2^k addressable locations. During a ‘memory cycle’, n bits
of data are transferred between the main memory and the CPU. This transfer takes place over the
processor bus, which has k address lines and n data lines.
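As a worked illustration of these sizes (a sketch with assumed register widths, not values from the
module), the following Python fragment computes the address space and capacity implied by a k-bit
MAR and an n-bit MDR:

# Hypothetical widths, chosen only for illustration.
k = 16    # MAR width in bits (number of address lines)
n = 8     # MDR width in bits (number of data lines)

locations = 2 ** k             # up to 2^k addressable locations
capacity_bits = locations * n  # total capacity in bits

print(f"{locations} locations of {n} bits each = {capacity_bits // 8} bytes")
# 65536 locations of 8 bits each = 65536 bytes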

Figure 8-2: Connection of main memory to CPU

The CPU initiates a memory operation by loading the appropriate data into registers MDR and MAR,
and setting either the Read or Write memory control line to 1. When the required operation is completed,
the memory control circuitry sends a Memory Function Completed (MFC) signal to the CPU.

The time that elapses between the initiation of an operation and the completion of that operation is called
the memory access time. The minimum time delay between two successive memory operations is called
the memory cycle time. The cycle time is usually slightly longer than the access time, because the
memory may need additional time to recover before the next operation can begin.

8.3.3 Cache Memory


During every instruction cycle, the CPU accesses the memory at least once to fetch the instruction, and
sometimes accesses memory again to fetch the operands. The rate at which the CPU can execute
instructions is therefore limited by the memory cycle time. This limitation is due to the mismatch between
the memory cycle time and the processor cycle time. Ideally the main memory should be built with the
same technology as the CPU registers, giving memory cycle times comparable to processor cycle times,
but this is too expensive a strategy.


The solution is to exploit the principle of locality by providing a small, fast memory between the CPU
and the main memory. This memory is known as cache memory. Thus the intention of using cache
memory is to give a memory speed approaching that of the fastest memories available, and at the same
time to provide a large memory size at the price of less expensive types of semiconductor memory.

Principles of Cache Memory

Figure 8-3: Cache and main memory

The concept is illustrated in Figure 8-3. The cache memory contains copies of portions of main
memory. When the CPU attempts to read a word from main memory, a check is made to find out whether
the word is in the cache. If so, the word is delivered to the CPU from the cache; if not, the block of main
memory containing the word is read into the cache and the word is delivered to the CPU. The purpose of
reading a whole block from main memory is that future references are likely to be to other words in the
same block.

Structure of Cache and Main Memory

The structure of the cache and main memory is shown in Figure 8-4. The figure shows a main
memory consisting of 2^n addressable words, each word having a unique n-bit address. This memory is
considered to consist of a number of fixed-length blocks of K words each; that is, M = 2^n/K blocks.
The cache consists of C lines of K words each. The number of lines (C) of the cache is much less than
the number of main memory blocks (M).


Figure 8-4: Cache and main memory. (a) Cache: C slots (lines), each holding a tag and a block of K
words. (b) Main memory: 2^n addressable words grouped into blocks of K words each.

Cache Read Operation

Now let us see how a cache read operation is executed. The flow chart illustrating the steps of a cache
read operation is given in Figure 8-5. The processor generates the address RA of a word to be read. If
the word is contained in the cache, it is delivered to the processor. Otherwise, the block containing that
word is loaded into the cache and, in parallel, the word is delivered to the processor directly from main
memory.

Transferring data in blocks between the main memory and the cache enables an interleaved memory to
operate at its maximum possible speed.
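The read flow can be summarized in a short Python sketch. The cache is modelled as an (unbounded)
dictionary from block numbers to blocks; capacity limits and replacement are ignored here, and all
names and the block size K are illustrative assumptions, not part of the module:

K = 16                        # words per block (assumed)
main_memory = [0] * 2 ** 16   # toy main memory
cache = {}                    # block number -> list of K words

def read(ra):
    """Deliver the word at address RA, loading its block on a miss."""
    block, offset = divmod(ra, K)
    if block not in cache:                           # miss
        start = block * K
        cache[block] = main_memory[start:start + K]  # load the whole block
    return cache[block][offset]                      # deliver the word to the CPU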


Figure 8-5: Cache read operation

Elements of Cache Design


 Cache Size: Small cache => performance problems; large cache => too expensive.
 Mapping Function: Discussed below.
 Write Policy: Before a block that is resident in the cache can be replaced, it is necessary to
consider whether it has been altered in the cache but not in the main memory.
 Write back: Writes are done only to the cache. An UPDATE bit is set whenever a line is written.
Before a cache block (line) is replaced, if its UPDATE bit is set, its contents are written back to
main memory. Access to main memory by I/O modules can then only be allowed through the
cache. (A sketch of this bookkeeping follows the list.)


 Line Size: Greater line size => more hits (+) but also more line replacements (-). Too small a line
size => less chance of a hit for some parts of a block.
 Number of caches: Two levels of cache are common: the cache internal to the processor is called
the Level-1 (L1) cache; the external cache is called the Level-2 (L2) cache.
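As mentioned in the write-back bullet above, here is a minimal Python sketch of the UPDATE-bit
bookkeeping. The class and function names, and the 16-word line size, are illustrative assumptions:

from dataclasses import dataclass, field

@dataclass
class CacheLine:
    tag: int = -1                                         # block identifier
    data: list = field(default_factory=lambda: [0] * 16)  # one block of words (assumed K = 16)
    update: bool = False    # set when the line is modified while in the cache

def write_word(line, offset, value):
    line.data[offset] = value
    line.update = True      # the cache now holds newer data than main memory

def replace_line(line, new_tag, new_data, write_back):
    if line.update:                        # flush the modified contents before reuse
        write_back(line.tag, line.data)    # write_back is a caller-supplied function
    line.tag, line.data, line.update = new_tag, new_data, False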

Mapping Functions

The correspondence between main memory blocks and cache blocks is specified by a mapping function.
There are three standard mapping functions, namely:
 Direct mapping
 Associative mapping
 Block set associative mapping

In order to discuss these methods, consider a cache consisting of 128 blocks of 16 words each. Assume
that main memory is addressable by a 16-bit address. For mapping purposes, main memory is viewed as
composed of 4K blocks.

1. Direct Mapping Technique

This is the simplest mapping technique. In this case, block K of the main memory maps onto block K
modulo 128 of the cache. Since more than one main memory block is mapped onto a given cache block
position, contention may arise for that position even when the cache is not full. This is resolved by
allowing the new block to overwrite the currently resident block.

A main memory address can be divided into three fields, TAG, BLOCK and WORD, as shown in Figure
8-6. The TAG field (the high-order 5 bits here) identifies a main memory block when it is resident in the
cache. When a new block enters the cache, the 7-bit BLOCK field determines the cache position in which
this block must be stored. On an access, the tag field of the resident block is compared with the tag field
of the address. If they match, then the desired word is present in that block of the cache. If there is no
match, then the block containing the required word must first be read from the main memory and then
loaded into the cache.
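For the running example (16-bit addresses, 128 cache blocks, 16 words per block), the field widths are
WORD = 4 bits, BLOCK = 7 bits and TAG = 5 bits. A small sketch of the field extraction, with an
arbitrary example address:

def split_direct(addr):
    """Split a 16-bit address into TAG (5), BLOCK (7) and WORD (4) fields."""
    word  = addr & 0xF           # low 4 bits: word within the block
    block = (addr >> 4) & 0x7F   # next 7 bits: cache block position
    tag   = addr >> 11           # high 5 bits: tag
    return tag, block, word

print(split_direct(0xA7C3))   # (20, 124, 3)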


Figure 8-6: Direct mapping cache

2. Associative Mapping Technique

This is a much more flexible mapping technique. Here any main memory block can be loaded into any
cache block position. Associative mapping is illustrated in Figure 8-7. In this case 12 tag bits are
required to identify a main memory block when it is resident in the cache.


Figure 8-7: Associative mapping cache

The tag bits of an address received from the CPU are compared with the tag bits of each cache block to
see whether the desired block is present in the cache. We need to search all 128 tag patterns to determine
whether a given block is in the cache; this type of search is called an associative search, and the cost of
implementation is therefore higher. Because of the complete freedom in positioning, a wide range of
replacement algorithms can be used.
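In software terms, the associative search looks like the loop below, where the cache is assumed to be a
list of (tag, block) pairs. This is only a sketch: in hardware all 128 tag comparisons are performed in
parallel rather than serially.

def find_block(cache, tag):
    """Search every cache line for a matching 12-bit tag."""
    for line_no, (line_tag, block) in enumerate(cache):
        if line_tag == tag:
            return line_no, block   # hit
    return None                     # miss: the block must be fetched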

3. Block Set Associative Mapping

This is a combination of the two techniques discussed above. In this case the blocks of the cache are
grouped into sets, and the mapping allows a block of main memory to reside in any block of a particular
set. Block set associative mapping is illustrated in Figure 8-8.


Figure 8-8: Block set associative mapping cache with two blocks per set

Consider, for example, a cache with two blocks per set. The 6-bit set field of the address determines
which set of the cache might contain the addressed block. The tag field of the address must be
associatively compared to the tags of the two blocks of the set to check if the desired block is present
(see the sketch after the list below).
 The number of blocks per set can be selected according to the requirements of a particular
computer system. Four blocks per set can be accommodated by a 5-bit set field, eight blocks per
set by a 4-bit set field, and so on. The extreme cases are one block per set (the direct mapping
method) and 128 blocks per set (the fully associative technique).
 Each block must be provided with a valid bit, which indicates whether the block contains valid
data. The valid bit of a cache block is set to 1 the first time the block is loaded into the cache. It
stays at 1 unless main memory is updated by a source that bypasses the cache, for example when
new programs and data are loaded from mass storage devices by DMA. In that case, a check is
made to determine whether the affected block is currently in the cache; if it is, its valid bit is
cleared to 0.
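Continuing the two-blocks-per-set example (4-bit word field, 6-bit set field, leaving a 6-bit tag in a
16-bit address), here is a sketch of the address split and the bounded search within one set; all names
are illustrative assumptions:

def split_set_assoc(addr):
    """Split a 16-bit address into TAG (6), SET (6) and WORD (4) fields."""
    word = addr & 0xF
    set_no = (addr >> 4) & 0x3F   # 64 sets of 2 blocks each
    tag = addr >> 10
    return tag, set_no, word

def lookup(sets, addr):
    """sets[i] is a list of (tag, block) pairs; only 2 comparisons per access."""
    tag, set_no, word = split_set_assoc(addr)
    for line_tag, block in sets[set_no]:
        if line_tag == tag:
            return block[word]    # hit
    return None                   # miss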


Replacement Algorithms of Cache Memory


For associative and set-associative mapping, a replacement algorithm is needed. The most common
algorithms are discussed here. When a new block is to be brought into the cache and all the positions that
it may occupy are full, the cache controller must decide which of the old blocks to overwrite. Because
programs usually stay in localized areas for a reasonable period of time, there is a high probability that
blocks that have been referenced recently will be referenced again soon. Therefore, when a block is to be
overwritten, the block that has not been referenced for the longest time is overwritten. This block is
called the least-recently-used (LRU) block, and the technique is called the LRU replacement algorithm.
In order to use the LRU algorithm, the cache controller must track the LRU block as computation
proceeds. There are several replacement algorithms that require less overhead than the LRU method.
One method is to remove the oldest block from a full set when a new block must be brought in. This
method is referred to as FIFO. In this technique no updating is needed when a hit occurs. However,
because the algorithm does not consider the recent pattern of access to blocks in the cache, it is not as
effective as the LRU approach in choosing the best block to remove. Another method, called least
frequently used (LFU), replaces the block in the set that has experienced the fewest references; it is
implemented by associating a counter with each slot. The simplest algorithm, called random
replacement, chooses the block to be overwritten at random.
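A minimal sketch of LRU bookkeeping using Python's OrderedDict: move_to_end marks a block as most
recently used on a hit, and popitem(last=False) evicts the least recently used block on a miss when the
set is full. The class shape and the load_block callback are illustrative assumptions:

from collections import OrderedDict

class LRUSet:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()   # block number -> data, LRU first

    def access(self, block_no, load_block):
        if block_no in self.blocks:               # hit
            self.blocks.move_to_end(block_no)     # mark as most recently used
        else:                                     # miss
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)   # evict the LRU block
            self.blocks[block_no] = load_block(block_no)
        return self.blocks[block_no]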

8.3.4 External Memory

Magnetic Disk

A disk is a circular platter constructed of metal or of plastic coated with a magnetic material. Data are
recorded on and later retrieved from the disk via a conducting coil called the head. During a read or
write operation, the head is stationary while the platter rotates beneath it. Writing is achieved by
producing a magnetic field that records a magnetic pattern on the surface.

Data Organization and Formatting of Magnetic disk

Figure 8-9 depicts the data layout of a disk. The head is capable of reading or writing from a portion of
the platter rotating beneath it. This gives rise to the organization of data on the platter in a concentric set
of rings called tracks. Each track is the same width as the head. Adjacent tracks are separated by gaps
that minimize errors due to misalignment of the head. Data is transferred to and from the disk in blocks,
and a block is smaller than the capacity of a track. Data is stored in block-sized regions, each an angular
part of a track, referred to as sectors. There are typically 10 to 100 sectors per track, and these may be of
either fixed or variable length.


Figure 8-9: Disk data layout

RAID (Redundant Array of Independent Disks)


1. RAID is a set of physical disk drives viewed by the operating system as a single logical drive.
2. Data are distributed across the physical drives of an array.
3. Redundant disk capacity is used to store parity information, which guarantees data recoverability
in case of a disk failure.
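To see why parity guarantees recoverability, note that the parity disk stores the bitwise XOR of the data
disks, so XOR-ing the surviving disks with the parity reconstructs the lost one. A toy Python sketch
(three tiny data "disks"; all values are illustrative):

from functools import reduce

disks = [b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"]   # three data disks

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

parity = reduce(xor_bytes, disks)   # contents of the parity disk

# Disk 1 fails: rebuild it from the survivors plus the parity disk.
rebuilt = reduce(xor_bytes, [disks[0], disks[2], parity])
assert rebuilt == disks[1]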

Optical memory and magnetic tape are the other two types of external memory (recall your Introduction
to IT (ITec201) course).


8.3.5 Virtual Memory

Virtual (or logical) memory is a concept that, when implemented by a computer and its operating
system, allows programmers to use a very large range of memory or storage addresses for stored data.
The computing system maps the programmer’s virtual addresses to real hardware storage addresses.
Usually, the programmer is freed from having to be concerned about the availability of data storage.

In addition to managing the mapping of virtual storage addresses to real storage addresses, a computer
implementing virtual memory also manages storage swapping between active storage (RAM) and hard
disk or other high-volume storage devices. Data is read in units called “pages” ranging in size from about
a kilobyte (1,024 bytes) up to several megabytes. This reduces the amount of physical storage access that
is required and speeds up overall system performance.

The address specified by the program is called the logical address or virtual address; the memory control
circuitry translates it into an address that can be used to access the physical memory. The set of virtual
addresses constitutes the virtual address space.

The mechanism that translates virtual addresses into physical addresses is usually implemented by a
combination of hardware and software components. If a virtual address refers to a part of the program or
data space that is currently in the physical memory, then the contents of the appropriate physical
location in the main memory are accessed. Otherwise its contents must first be brought into a suitable
location in the main memory. The mapping function is implemented by a special memory control unit
called the memory management unit (MMU). The mapping function can be changed during program
execution according to system requirements.

The simplest method of translation assumes that all programs and data are composed of fixed-length
units called pages. Each page consists of a block of words that occupy contiguous locations in the main
memory or in the secondary storage. Pages normally range from 1K to 8K bytes in length. They form
the basic unit of information that is transferred between main memory and secondary storage devices.
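A sketch of fixed-length-page translation, assuming a 4K-byte page (within the 1K-8K range above) and
a toy page table; the names and table contents are illustrative, not part of the module:

PAGE_SIZE = 4096   # assumed page length

page_table = {0: 7, 1: 3, 2: None}   # virtual page -> physical frame
                                     # (None = page not in main memory)

def translate(virtual_addr):
    page, offset = divmod(virtual_addr, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        raise LookupError("page fault: bring the page in from secondary storage")
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1ABC)))   # page 1 -> frame 3, giving 0x3abc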

Virtual memory increases the effective size of the main memory. Only the active part of the virtual
address space is mapped onto locations in the physical main memory; the remaining virtual addresses
are mapped onto the bulk storage devices used. During a memory cycle the address translation
mechanism (hardware or software) determines whether the addressed information is in the physical main
memory unit. If it is, the proper information is accessed and execution proceeds. If it is not, a contiguous
block of words containing the desired information is transferred from bulk storage to main memory,
displacing some block that is currently inactive.


8.4 Memory Management in Operating Systems


In the discussion of virtual memory we assumed that only one large program is being executed. If the
program and its data do not fit into the physical main memory, then secondary storage holds the
overflow. The operating system automatically swaps portions of the program between the main memory
and secondary storage.

The memory management routines are part of the computer's operating system. The virtual address space
is divided into two parts: system space and user space. Operating system routines reside in the system
space, and user application programs reside in the user space.

In a multiuser environment each user has a separate user space with a separate page table. The memory
management unit uses a page table base register to determine the address of the table to be used in the
translation process. Hence, by changing the contents of this register, the operating system can switch
from one space to another. The physical main memory is thus shared by the active pages of the system
space and several user spaces. However, only the pages that belong to one of these spaces are accessible
at any given time. In a multiuser environment, no program should be allowed to modify either the data or
the instructions of other programs in the main memory; hence protection must be provided.

Such protection can be provided in several ways. Let us first consider the most basic form of protection.
Recall that in the simplest case the processor has two states, namely the supervisor state and the user
state. The processor is placed in the supervisor state while operating system routines are being executed,
and in the user state to execute user programs. In the user state, some machine instructions cannot be
executed. These privileged instructions, which include operations such as modifying the page table base
register, cannot be executed in the user state. Hence a user program is prevented from accessing the page
tables of other user spaces or of the system space.

It is sometimes desirable for one application program to have access to certain pages belonging to
another program. The operating system can arrange this by causing these pages to appear in both spaces.
The shared pages will therefore have entries in two different page tables. The control bits in each table
entry can be set differently to control the Read/Write access privileges granted to each program.
Common partitioning methods are fixed-size (equal or unequal) partitions and dynamic partitioning.
Paging is more efficient than partitioning: programs are divided into “logical” chunks known as “pages”,
which are assigned to available chunks of memory known as “frames” or “page frames”.


8.5 Review Questions
1. Explain cache replacement algorithms.
2. What is the reason for having so many types (hierarchy) of memory?
3. Discuss the physical characteristics of a disk.
4. What is RAID? What is its advantage over ordinary disks?
5. What is virtual memory? Why is it important?


Appendix: The ASCII Character Set


The first 32 characters of ASCII are non-printable control codes which are used for communication
control.
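The table's columns can be reproduced for any character with a few lines of Python (a convenience
sketch, not part of the original appendix):

for ch in "A9~":
    code = ord(ch)
    print(f"{code:>7} {code:03o} {code:02X} {code:08b} {ch}")
# e.g. for 'A':      65 101 41 01000001 A   (matching the table row for 65)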

DECIMAL OCTAL HEX BINARY Symbol Description


0 000 00 00000000 NUL Null char
1 001 01 00000001 SOH Start of Heading
2 002 02 00000010 STX Start of Text
3 003 03 00000011 ETX End of Text
4 004 04 00000100 EOT End of Transmission
5 005 05 00000101 ENQ Enquiry
6 006 06 00000110 ACK Acknowledgment
7 007 07 00000111 BEL Bell
8 010 08 00001000 BS Back Space
9 011 09 00001001 HT Horizontal Tab
10 012 0A 00001010 LF Line Feed
11 013 0B 00001011 VT Vertical Tab
12 014 0C 00001100 FF Form Feed
13 015 0D 00001101 CR Carriage Return
14 016 0E 00001110 SO Shift Out / X-On
15 017 0F 00001111 SI Shift In / X-Off
16 020 10 00010000 DLE Data Line Escape
17 021 11 00010001 DC1 Device Control 1 (oft. XON)
18 022 12 00010010 DC2 Device Control 2
19 023 13 00010011 DC3 Device Control 3 (oft. XOFF)
20 024 14 00010100 DC4 Device Control 4
21 025 15 00010101 NAK Negative Acknowledgement
22 026 16 00010110 SYN Synchronous Idle
23 027 17 00010111 ETB End of Transmit Block
24 030 18 00011000 CAN Cancel
25 031 19 00011001 EM End of Medium
26 032 1A 00011010 SUB Substitute
27 033 1B 00011011 ESC Escape
28 034 1C 00011100 FS File Separator
29 035 1D 00011101 GS Group Separator
30 036 1E 00011110 RS Record Separator
31 037 1F 00011111 US Unit Separator

Codes from 32 to 127 are used for printable characters.

DECIMAL OCTAL HEX BINARY Symbol Description


32 040 20 00100000   Space


33 041 21 00100001 ! Exclamation mark
34 042 22 00100010 " Double quotes
35 043 23 00100011 # Number
36 044 24 00100100 $ Dollar
37 045 25 00100101 % Percent sign
38 046 26 00100110 & Ampersand
39 047 27 00100111 ' Single quote
40 050 28 00101000 ( Open parenthesis (or open bracket)
41 051 29 00101001 ) Close parenthesis (or close bracket)
42 052 2A 00101010 * Asterisk
43 053 2B 00101011 + Plus
44 054 2C 00101100 , Comma
45 055 2D 00101101 - Hyphen
46 056 2E 00101110 . Period, dot or full stop
47 057 2F 00101111 / Slash or divide
48 060 30 00110000 0 Zero
49 061 31 00110001 1 One
50 062 32 00110010 2 Two
51 063 33 00110011 3 Three
52 064 34 00110100 4 Four
53 065 35 00110101 5 Five
54 066 36 00110110 6 Six
55 067 37 00110111 7 Seven
56 070 38 00111000 8 Eight
57 071 39 00111001 9 Nine
58 072 3A 00111010 : Colon
59 073 3B 00111011 ; Semicolon
60 074 3C 00111100 < Less than (or open angled bracket)
61 075 3D 00111101 = Equals
62 076 3E 00111110 > Greater than (or close angled bracket)
63 077 3F 00111111 ? Question mark
64 100 40 01000000 @ At symbol
65 101 41 01000001 A Uppercase A
66 102 42 01000010 B Uppercase B
67 103 43 01000011 C Uppercase C
68 104 44 01000100 D Uppercase D
69 105 45 01000101 E Uppercase E
70 106 46 01000110 F Uppercase F
71 107 47 01000111 G Uppercase G
72 110 48 01001000 H Uppercase H


73 111 49 01001001 I Uppercase I


74 112 4A 01001010 J Uppercase J
75 113 4B 01001011 K Uppercase K
76 114 4C 01001100 L Uppercase L
77 115 4D 01001101 M Uppercase M
78 116 4E 01001110 N Uppercase N
79 117 4F 01001111 O Uppercase O
80 120 50 01010000 P Uppercase P
81 121 51 01010001 Q Uppercase Q
82 122 52 01010010 R Uppercase R
83 123 53 01010011 S Uppercase S
84 124 54 01010100 T Uppercase T
85 125 55 01010101 U Uppercase U
86 126 56 01010110 V Uppercase V
87 127 57 01010111 W Uppercase W
88 130 58 01011000 X Uppercase X
89 131 59 01011001 Y Uppercase Y
90 132 5A 01011010 Z Uppercase Z
91 133 5B 01011011 [ Opening bracket
92 134 5C 01011100 \ Backslash
93 135 5D 01011101 ] Closing bracket
94 136 5E 01011110 ^ Caret - circumflex
95 137 5F 01011111 _ Underscore
96 140 60 01100000 ` Grave accent
97 141 61 01100001 a Lowercase a
98 142 62 01100010 b Lowercase b
99 143 63 01100011 c Lowercase c
100 144 64 01100100 d Lowercase d
101 145 65 01100101 e Lowercase e
102 146 66 01100110 f Lowercase f
103 147 67 01100111 g Lowercase g
104 150 68 01101000 h Lowercase h
105 151 69 01101001 i Lowercase i
106 152 6A 01101010 j Lowercase j
107 153 6B 01101011 k Lowercase k
108 154 6C 01101100 l Lowercase l
109 155 6D 01101101 m Lowercase m
110 156 6E 01101110 n Lowercase n
111 157 6F 01101111 o Lowercase o
112 160 70 01110000 p Lowercase p
113 161 71 01110001 q Lowercase q


114 162 72 01110010 r Lowercase r


115 163 73 01110011 s Lowercase s
116 164 74 01110100 t Lowercase t
117 165 75 01110101 u Lowercase u
118 166 76 01110110 v Lowercase v
119 167 77 01110111 w Lowercase w
120 170 78 01111000 x Lowercase x
121 171 79 01111001 y Lowercase y
122 172 7A 01111010 z Lowercase z
123 173 7B 01111011 { Opening brace
124 174 7C 01111100 | Vertical bar
125 175 7D 01111101 } Closing brace
126 176 7E 01111110 ~ Equivalency sign - tilde
127 177 7F 01111111 DEL Delete

Character codes 128 to 255 are known as the extended character set.

DECIMAL OCTAL HEX BINARY Symbol Description


128 200 80 10000000 € Euro sign
129 201 81 10000001    
130 202 82 10000010 ‚ Single low-9 quotation mark
131 203 83 10000011 ƒ Latin small letter f with hook
132 204 84 10000100 „ Double low-9 quotation mark
133 205 85 10000101 … Horizontal ellipsis
134 206 86 10000110 † Dagger
135 207 87 10000111 ‡ Double dagger
136 210 88 10001000 ˆ Modifier letter circumflex accent
137 211 89 10001001 ‰ Per mille sign
138 212 8A 10001010 Š Latin capital letter S with caron
139 213 8B 10001011 ‹ Single left-pointing angle quotation
140 214 8C 10001100 Œ Latin capital ligature OE
141 215 8D 10001101    
142 216 8E 10001110 Ž Latin capital letter Z with caron
143 217 8F 10001111    
144 220 90 10010000    
145 221 91 10010001 ‘ Left single quotation mark
146 222 92 10010010 ’ Right single quotation mark
147 223 93 10010011 “ Left double quotation mark
148 224 94 10010100 ” Right double quotation mark
149 225 95 10010101 • Bullet
150 226 96 10010110 – En dash


151 227 97 10010111 — Em dash


152 230 98 10011000 ˜ Small tilde
153 231 99 10011001 ™ Trade mark sign
154 232 9A 10011010 š Latin small letter S with caron
155 233 9B 10011011 › Single right-pointing angle quotation mark
156 234 9C 10011100 œ Latin small ligature oe
157 235 9D 10011101    
158 236 9E 10011110 ž Latin small letter z with caron
159 237 9F 10011111 Ÿ Latin capital letter Y with diaeresis
160 240 A0 10100000   Non-breaking space
161 241 A1 10100001 ¡ Inverted exclamation mark
162 242 A2 10100010 ¢ Cent sign
163 243 A3 10100011 £ Pound sign
164 244 A4 10100100 ¤ Currency sign
165 245 A5 10100101 ¥ Yen sign
166 246 A6 10100110 ¦ Pipe, Broken vertical bar
167 247 A7 10100111 § Section sign
168 250 A8 10101000 ¨ Spacing diaeresis - umlaut
169 251 A9 10101001 © Copyright sign
170 252 AA 10101010 ª Feminine ordinal indicator
171 253 AB 10101011 « Left double angle quotes
172 254 AC 10101100 ¬ Not sign
173 255 AD 10101101 Soft hyphen
174 256 AE 10101110 ® Registered trade mark sign
175 257 AF 10101111 ¯ Spacing macron - overline
176 260 B0 10110000 ° Degree sign
177 261 B1 10110001 ± Plus-or-minus sign
178 262 B2 10110010 ² Superscript two - squared
179 263 B3 10110011 ³ Superscript three - cubed
180 264 B4 10110100 ´ Acute accent - spacing acute
181 265 B5 10110101 µ Micro sign
182 266 B6 10110110 ¶ Pilcrow sign - paragraph sign
183 267 B7 10110111 · Middle dot - Georgian comma
184 270 B8 10111000 ¸ Spacing cedilla
185 271 B9 10111001 ¹ Superscript one
186 272 BA 10111010 º Masculine ordinal indicator
187 273 BB 10111011 » Right double angle quotes
188 274 BC 10111100 ¼ Fraction one quarter
189 275 BD 10111101 ½ Fraction one half
190 276 BE 10111110 ¾ Fraction three quarters
191 277 BF 10111111 ¿ Inverted question mark


192 300 C0 11000000 À Latin capital letter A with grave


193 301 C1 11000001 Á Latin capital letter A with acute
194 302 C2 11000010 Â Latin capital letter A with circumflex
195 303 C3 11000011 Ã Latin capital letter A with tilde
196 304 C4 11000100 Ä Latin capital letter A with diaeresis
197 305 C5 11000101 Å Latin capital letter A with ring above
198 306 C6 11000110 Æ Latin capital letter AE
199 307 C7 11000111 Ç Latin capital letter C with cedilla
200 310 C8 11001000 È Latin capital letter E with grave
201 311 C9 11001001 É Latin capital letter E with acute
202 312 CA 11001010 Ê Latin capital letter E with circumflex
203 313 CB 11001011 Ë Latin capital letter E with diaeresis
204 314 CC 11001100 Ì Latin capital letter I with grave
205 315 CD 11001101 Í Latin capital letter I with acute
206 316 CE 11001110 Î Latin capital letter I with circumflex
207 317 CF 11001111 Ï Latin capital letter I with diaeresis
208 320 D0 11010000 Ð Latin capital letter ETH
209 321 D1 11010001 Ñ Latin capital letter N with tilde
210 322 D2 11010010 Ò Latin capital letter O with grave
211 323 D3 11010011 Ó Latin capital letter O with acute
212 324 D4 11010100 Ô Latin capital letter O with circumflex
213 325 D5 11010101 Õ Latin capital letter O with tilde
214 326 D6 11010110 Ö Latin capital letter O with diaeresis
215 327 D7 11010111 × Multiplication sign
216 330 D8 11011000 Ø Latin capital letter O with slash
217 331 D9 11011001 Ù Latin capital letter U with grave
218 332 DA 11011010 Ú Latin capital letter U with acute
219 333 DB 11011011 Û Latin capital letter U with circumflex
220 334 DC 11011100 Ü Latin capital letter U with diaeresis
221 335 DD 11011101 Ý Latin capital letter Y with acute
222 336 DE 11011110 Þ Latin capital letter THORN
223 337 DF 11011111 ß Latin small letter sharp s - ess-zed
224 340 E0 11100000 à Latin small letter a with grave
225 341 E1 11100001 á Latin small letter a with acute
226 342 E2 11100010 â Latin small letter a with circumflex
227 343 E3 11100011 ã Latin small letter a with tilde
228 344 E4 11100100 ä Latin small letter a with diaeresis
229 345 E5 11100101 å Latin small letter a with ring above
230 346 E6 11100110 æ Latin small letter ae
231 347 E7 11100111 ç Latin small letter c with cedilla
232 350 E8 11101000 è Latin small letter e with grave


233 351 E9 11101001 é Latin small letter e with acute


234 352 EA 11101010 ê Latin small letter e with circumflex
235 353 EB 11101011 ë Latin small letter e with diaeresis
236 354 EC 11101100 ì Latin small letter i with grave
237 355 ED 11101101 í Latin small letter i with acute
238 356 EE 11101110 î Latin small letter i with circumflex
239 357 EF 11101111 ï Latin small letter i with diaeresis
240 360 F0 11110000 ð Latin small letter eth
241 361 F1 11110001 ñ Latin small letter n with tilde
242 362 F2 11110010 ò Latin small letter o with grave
243 363 F3 11110011 ó Latin small letter o with acute
244 364 F4 11110100 ô Latin small letter o with circumflex
245 365 F5 11110101 õ Latin small letter o with tilde
246 366 F6 11110110 ö Latin small letter o with diaeresis
247 367 F7 11110111 ÷ Division sign
248 370 F8 11111000 ø Latin small letter o with slash
249 371 F9 11111001 ù Latin small letter u with grave
250 372 FA 11111010 ú Latin small letter u with acute
251 373 FB 11111011 û Latin small letter u with circumflex
252 374 FC 11111100 ü Latin small letter u with diaeresis
253 375 FD 11111101 ý Latin small letter y with acute
254 376 FE 11111110 þ Latin small letter thorn
255 377 FF 11111111 ÿ Latin small letter y with diaeresis

