CO Solution 1-4

The document contains exercises and solutions related to computer architecture concepts, including definitions of translators, interpreters, and virtual machines. It discusses the implications of compilers generating output at different architecture levels, the design of encoders and decoders, and the performance of ALUs. Additionally, it covers Hamming codes, memory timing, and assembly code optimizations in a programming context.

Chapter 1

1. Explain each of the following terms in your own words:


a. Translator.
b. Interpreter.
c. Virtual machine.
Solution:
a. A translator converts programs written in one language into equivalent programs in another language.
b. An interpreter carries out a program instruction by instruction, without first converting it into another language.
c. A virtual machine is a conceptual machine, one that does not exist as real hardware but is simulated (by interpretation or translation) on a machine that does.
2. Is it conceivable for a compiler to generate output for the microarchitecture level instead
of for the ISA level? Discuss the pros and cons of this proposal.
Solution:
It is possible, but there are problems. One difficulty is the large amount of code produced.
Since one ISA instruction does the work of many microinstructions, the resulting program will be
much bigger. Another problem is that the compiler will have to deal with a more primitive output
language and will therefore itself become more complex. Also, on many machines, the microprogram
is in ROM. Making it user-changeable would require putting it in RAM, which is much slower
than ROM. On the positive side, the resulting program might well be much faster, since the
overhead of one level of interpretation would be eliminated.
3. Can you imagine any multilevel computer in which the device level and digital logic levels
were not the lowest levels? Explain.
Solution:
During the detailed design of a new computer, the device and digital logic levels of the new
machine may well be simulated on an old machine, which puts them around level 5 or 6.
4. What is the difference between interpretation and translation?
Solution:
An interpreter executes a program by fetching the first instruction, carrying it out, then fetching
the next one, and so on. A translator first converts the original program into an equivalent one in
another language and then runs the new program.
5. Some instructions at the operating system machine level are identical to ISA language
instructions. These instructions are carried out directly by the microprogram or hardware
rather than by the operating system. In light of your answer to the preceding problem, why
do you think this is the case?
Solution:
Each additional level of interpretation costs something in time. If it is not needed, it should be
avoided.
Chapter 2
1. Consider the operation of a machine with the data path of Fig. 2-2. Suppose that loading
the ALU input registers takes 5 nsec, running the ALU takes 10 nsec, and storing the result
back in the register scratchpad takes 5 nsec. What is the maximum number of MIPS this
machine is capable of in the absence of pipelining?
Solution:
The data path cycle is 20 nsec. The maximum number of data path cycles/sec is thus 50 million.
The best the machine could do is thus 50 MIPS.
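The same arithmetic as a minimal sketch (the class name is just illustrative):

public class DataPathMips {
    public static void main(String[] args) {
        int loadNs = 5, aluNs = 10, storeNs = 5;   // given stage times
        int cycleNs = loadNs + aluNs + storeNs;    // 20 nsec per data path cycle
        double mips = 1000.0 / cycleNs;            // 10^9 nsec/sec / 20 nsec / 10^6 = 50
        System.out.println(cycleNs + " nsec/cycle -> " + mips + " MIPS");
    }
}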
2. What is the purpose of step 2 in the list of Sec. 2.1.2? What would happen if this step were
omitted?
Solution:
The program counter must be incremented to point to the next instruction. If this step were omitted,
the computer would execute the initial instruction forever.
3. On computer 1, all instructions take 10 nsec to execute. On computer 2, they all take 5 nsec
to execute. Can you say for certain that computer 2 is faster? Discuss.
Solution:
You cannot say anything for sure. If computer 1 has a five-stage pipeline, it can complete an
instruction every 10/5 = 2 nsec and thus issue up to 500 million instructions/sec. If computer 2 is
not pipelined, it cannot do any better than 200 million instructions/sec. Thus without more
information, you cannot say which is faster.
12. Devise a 7-bit even-parity Hamming code for the digits 0 to 9.
Solution:
From 0 to 9 the codes are: 0000000, 1101001, 0101010, 1000011, 1001100, 0100101, 1100110,
0001111, 1110000, and 0011001.
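The listed codewords follow the usual Hamming layout: data bits d1-d4 in positions 3, 5, 6, and 7 (counting from the left) and even-parity bits in positions 1, 2, and 4. A small sketch that regenerates the table under that assumption:

public class HammingDigits {
    // Returns the 7-bit even-parity Hamming codeword for a decimal digit 0-9.
    static String encode(int digit) {
        int d1 = (digit >> 3) & 1, d2 = (digit >> 2) & 1,
            d3 = (digit >> 1) & 1, d4 = digit & 1;
        int p1 = d1 ^ d2 ^ d4;   // even parity over positions 1, 3, 5, 7
        int p2 = d1 ^ d3 ^ d4;   // even parity over positions 2, 3, 6, 7
        int p4 = d2 ^ d3 ^ d4;   // even parity over positions 4, 5, 6, 7
        return "" + p1 + p2 + d1 + p4 + d2 + d3 + d4;
    }

    public static void main(String[] args) {
        for (int digit = 0; digit <= 9; digit++)
            System.out.println(digit + " -> " + encode(digit));   // 0 -> 0000000, 1 -> 1101001, ...
    }
}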
13. Devise a code for the digits 0 to 9 whose Hamming distance is 2.
Solution:
There are many possible answers. The simplest is to count from 0 to 9 in 4-bit binary and append
a single parity bit to each codeword. A Hamming distance of 2 is enough to detect (but not correct)
single-bit errors, so one redundancy bit suffices, giving a 5-bit code:
00000, 00011, 00101, 00110, 01001, 01010, 01100, 01111, 10001, and 10010

14. In a Hamming code, some bits are ‘‘wasted’’ in the sense that they are used for checking
and not information. What is the percentage of wasted bits for messages whose total length
(data + check bits) is 2^n − 1? Evaluate this expression numerically for values of n from 3 to
10.
Solution:
If the total length is 2^n − 1 bits, there are n check bits. Consequently, the percentage of wasted
bits is n/(2^n − 1) × 100%. Numerically for n from 3 to 10, we get: 42.9%, 26.7%, 16.1%, 9.5%,
5.5%, 3.1%, 1.8%, and 1.0%.
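A short loop that evaluates the expression for n = 3 to 10 (a sketch of the calculation only):

public class WastedBits {
    public static void main(String[] args) {
        for (int n = 3; n <= 10; n++) {
            int total = (1 << n) - 1;                        // 2^n - 1 bits in all
            double wasted = 100.0 * n / total;               // n of them are check bits
            System.out.printf("n=%d: %.1f%%%n", n, wasted);  // 42.9%, 26.7%, 16.1%, ...
        }
    }
}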

Chapter 3
10. Draw the logic diagram of a 2-bit encoder, a circuit with four input lines, exactly one of
which is high at any instant, and two output lines whose 2-bit binary value tells which input
is high.
Solution:
The truth table for the encoder (showing only the valid, one-hot input values) is:

in3 in2 in1 in0 | out1 out0
 0   0   0   1  |  0    0
 0   0   1   0  |  0    1
 0   1   0   0  |  1    0
 1   0   0   0  |  1    1

From the truth table, we can see that


out1 = in2 OR in3
out0 = in1 OR in3
Again, this assumes only valid input values. The logic diagram is simply two 2-input OR gates:
one fed by in2 and in3 to produce out1, and one fed by in1 and in3 to produce out0.
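A small sketch that checks the two equations against the four valid (one-hot) input patterns; the signal names just mirror the ones above:

public class Encoder2Bit {
    public static void main(String[] args) {
        for (int hot = 0; hot < 4; hot++) {               // which input line is high
            int in1 = hot == 1 ? 1 : 0;
            int in2 = hot == 2 ? 1 : 0;
            int in3 = hot == 3 ? 1 : 0;
            int out1 = in2 | in3;                          // high when input 2 or 3 is high
            int out0 = in1 | in3;                          // high when input 1 or 3 is high
            System.out.println("in" + hot + " high -> out1 out0 = " + out1 + " " + out0);
        }
    }
}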

11. Draw the logic diagram of a 2-bit de-multiplexer, a circuit whose single input line is
steered to one of the four output lines depending on the state of the two control lines.
Solution:
Each output line is the AND of the single data input with the proper combination of true and
complemented control lines: out0 = in AND (NOT C1) AND (NOT C0), out1 = in AND (NOT C1) AND C0,
out2 = in AND C1 AND (NOT C0), and out3 = in AND C1 AND C0. The logic diagram therefore consists
of two inverters on the control lines feeding four 3-input AND gates.
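A sketch of the same gate equations in code (signal names are illustrative, not from the figure):

public class Demux1To4 {
    public static void main(String[] args) {
        int in = 1;                                    // the single data input
        for (int sel = 0; sel < 4; sel++) {
            int c1 = (sel >> 1) & 1, c0 = sel & 1;     // the two control lines
            int out0 = in & ~c1 & ~c0 & 1;             // selected when c1 c0 = 0 0
            int out1 = in & ~c1 &  c0 & 1;             // selected when c1 c0 = 0 1
            int out2 = in &  c1 & ~c0 & 1;             // selected when c1 c0 = 1 0
            int out3 = in &  c1 &  c0 & 1;             // selected when c1 c0 = 1 1
            System.out.println("c1 c0 = " + c1 + " " + c0 + " -> outputs 3..0 = "
                               + out3 + out2 + out1 + out0);
        }
    }
}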
16. The ALU of Fig. 3-19 is capable of doing 8-bit 2’s complement additions. Is it also capable
of doing 2’s complement subtractions? If so, explain how. If not, modify it to be able to do
subtractions.
Solution:
The ALU is already capable of subtraction. Note that in 2’s complement, B − A = B + (−A). To
get − A, we can use Ā + 1. Thus to subtract A from B we need to add B, Ā and then increment the
sum. The circuit can already do this by asserting INVA and INC and then adding the two inputs.
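A quick check of that identity on 8-bit values (a sketch; the operands are arbitrary):

public class TwosComplementSub {
    public static void main(String[] args) {
        int a = 0x2C, b = 0x75;                       // arbitrary 8-bit operands
        int direct = (b - a) & 0xFF;                  // reference result of B - A
        int viaAlu = (b + (~a & 0xFF) + 1) & 0xFF;    // invert A (INVA), add, increment (INC)
        System.out.println(Integer.toHexString(direct) + " == "
                           + Integer.toHexString(viaAlu));   // 49 == 49
    }
}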
17. A 16-bit ALU is built up of 16 1-bit ALUs, each one having an add time of 10 nsec. If
there is an additional 1-nsec delay for propagation from one ALU to the next, how long does
it take for the result of a 16-bit add to appear?
Solution:
A basic cycle is 11 nsec, including propagation. Sixteen cycles take 176 nsec. However, the last
propagation is not needed, so the answer is available 175 nsec after the start.
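The same delay as a one-line calculation (sketch only):

public class RippleAluDelay {
    public static void main(String[] args) {
        int stages = 16, addNs = 10, propNs = 1;
        // Every stage adds in 10 nsec; only 15 of the 16 carry propagations matter.
        int totalNs = stages * addNs + (stages - 1) * propNs;
        System.out.println(totalNs + " nsec");        // 175 nsec
    }
}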
18. Sometimes it is useful for an 8-bit ALU such as Fig. 3-19 to generate the constant – 1 as
output. Give two different ways this can be done. For each way, specify the values of the six
control signals.
Solution:
One way is to negate ENB, forcing B to 0, then choosing function code 01 to select B̄ (NOT B) as the
ALU function. The 1's complement of 0 is −1 in 2's complement (all 1 bits). As long as INC is negated,
the control lines for A do not matter. A second way is to negate and invert A, putting all 1s on the
A input. Then negate ENB to force B to 0. With A equal to − 1 and B equal to 0, we can either add
or OR them together.
21. The 4 × 3 memory of Fig. 3-28 uses 22 AND gates and three OR gates. If the circuit were
to be expanded to 256 × 8, how many of each would be needed?
Solution:
The design uses two AND gates for chip enable logic plus two AND gates per word select line
plus one AND gate per data bit. For a 256 × 8 memory this comes to 2 + 512 + 2048 = 2562 AND
gates. The circuit also uses one OR gate for each bit in the word; hence eight of them would be
needed.
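The count as a short calculation, following the gate breakdown above (sketch only):

public class MemoryGateCount {
    public static void main(String[] args) {
        int words = 256, bits = 8;
        // 2 AND gates for chip-enable logic, 2 per word select line, 1 per stored bit.
        int andGates = 2 + 2 * words + words * bits;  // 2 + 512 + 2048 = 2562
        int orGates = bits;                           // one OR gate per output bit
        System.out.println(andGates + " AND gates, " + orGates + " OR gates");
    }
}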
25. Referring to the timing diagram of Fig. 3-38, suppose that you slowed the clock down to
a period of 20 nsec instead of 10 nsec as shown but the timing constraints remained
unchanged. How much time would the memory have to get the data onto the bus during T3
after MREQ was asserted, in the worst case?
Solution:
With a 20-nsec clock period, MREQ might be asserted as late as 13 nsec into T1. The data are
required 2 nsec before the high-to-low transition in T3, which occurs 10 nsec after the start of the
cycle. From the midpoint of T1 to the midpoint of T3 is 40 nsec. Since the memory cannot start
until 3 nsec after the transition in the first cycle and has to be done 2 nsec before the transition in
the third cycle, in the worst case the memory has only 35 nsec in which to respond.
26. Again referring to Fig. 3-38, suppose that the clock remained at 100 MHz, but TDS was
increased to 4 nsec. Could 10-nsec memory chips be used?
Solution:
The memory would now have 20 − 3 − 4 = 13 nsec to respond after MREQ and RD are asserted.
There is not a lot of margin left, but if all the memory chips can always respond in 10 nsec, the
system will still work.
27. In Fig. 3-38(b), TML is specified to be at least 3 nsec. Can you envision a chip in which
it is negative? Specifically, could the CPU assert MREQ before the address was stable? Why
or why not?
Solution:
In principle it could be negative, but it has to be asserted after the address is stable because on a
write, the memory must not start changing the bytes addressed until the address is valid. On reads
it does not matter.
28. Assume that the block transfers of Fig. 3-42 were done on the bus of Fig. 3-38. How much
more bandwidth is obtained by using a block transfer over individual transfers for long
blocks? Now assume that the bus is 32 bits wide instead of 8 bits wide. Answer the question
again.
Solution:
In regular mode, it takes three cycles per transfer. In block mode, it takes one cycle per transfer
once the transfer has started. Therefore, block transfers have almost three times as much
bandwidth. The bus width does not matter, so it is still three times more, even for 32-bit transfers.

Chapter 4
2. In Fig. 4-6, the B bus register is encoded in a 4-bit field, but the C bus is represented as a
bit map. Why?
Solution:
Only one register may be gated out onto the B bus, so a single 4-bit field is sufficient to specify
which one. The C bus may have many registers selected to be loaded, so a full bit map is required.
3. In Fig. 4-6 there is a box labeled ‘‘High bit.’’ Give a circuit diagram for it.
Solution:
The Boolean function that is needed is
F = (JAMZ AND Z) OR (JAMN AND N) OR HIBIT
where HIBIT is the high-order bit of the NEXT ADDRESS field. This can be implemented with
two AND gates that feed an OR gate.
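The same function expressed in code (a sketch; the signal names follow the formula above):

public class HighBit {
    // F = (JAMZ AND Z) OR (JAMN AND N) OR HIBIT: two AND gates feeding one OR gate.
    static boolean f(boolean jamz, boolean z, boolean jamn, boolean n, boolean hibit) {
        return (jamz && z) || (jamn && n) || hibit;
    }

    public static void main(String[] args) {
        // With JAMZ asserted and the Z flag true, the high-order address bit is forced to 1.
        System.out.println(f(true, true, false, false, false));   // true
    }
}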
4. When the JMPC field in a microinstruction is enabled, MBR is ORed with NEXT
ADDRESS to form the address of the next microinstruction. Are there any circumstances in
which it makes sense to have NEXT ADDRESS be 0x1FF and use JMPC?
Solution:
No. No matter what is in MBR, ORing it with 0x1FF always yields 0x1FF, so the next
microinstruction will come from 0x1FF and the contents of MBR are irrelevant. The OR
accomplishes nothing.
5. Suppose that in the example of Fig. 4-14(a) the statement
k = 5;
is added after the if statement. What would the new assembly code be? Assume that the
compiler is an optimizing compiler.
Solution:
The straightforward translation just adds
BIPUSH 5
ISTORE k
at the label L2. However, an optimizing compiler will notice that i is never used after the if
statement. Therefore, the fourth and fifth instructions of the original translation (ISTORE i and
ILOAD i) are not needed. The code
becomes
ILOAD j
ILOAD k
IADD
BIPUSH 3
IF_ICMPEQ L1
ILOAD j
BIPUSH 1
ISUB
ISTORE j
GOTO L2
L1: BIPUSH 0
ISTORE k
L2: BIPUSH 5
ISTORE k
6. Give two different IJVM translations for the following Java statement:
i = k + n + 5;
Solution:
Here are two ways to do it:
First way:
ILOAD k
ILOAD n
IADD
BIPUSH 5
IADD
ISTORE i

Second way:
ILOAD k
ILOAD n
BIPUSH 5
IADD
IADD
ISTORE i

Here we have used the associative law, that is, the fact that (k + n) + 5 is the same as k + (n + 5), at
least if the possibility of overflow is ignored.
7. Give the Java statement that produced the following IJVM code:
ILOAD j
ILOAD n
ISUB
BIPUSH 7
ISUB
DUP
IADD
ISTORE i
Solution: The DUP duplicates the value j − n − 7 on the stack before the final IADD, so the
statement is something like i = (j − n − 7) + (j − n − 7);
