Co MODULE 3 - Merged

Module 3 covers various concepts related to RISC and CISC architectures, pipeline processing, and associated hazards. Key topics include the distinction between RISC and CISC, the explanation of load/store architecture, pipeline execution stages, and methods to mitigate hazards such as data and branch hazards. Additionally, it includes practical problems and solutions related to instruction execution in pipelined processors.

Uploaded by

sothuu10

Module 3

1. Distinguish between RISC and CISC

2. Explain RISC load and store architecture
3. Define Pipeline processing.
4. Why is pipelining needed?
5. What are the various stages in pipeline execution?
6. Draw the hardware organization of a two-stage pipeline.
7. Explain the different types of hazards that occur in a pipeline.
8. Explain instruction pipelining.
9. What is a branch hazard? Describe the methods for dealing with branch hazards.
10. What is a data hazard? Explain the methods for dealing with data hazards.
11. Explain arithmetic pipelining.
12. "Increasing the number of pipeline stages will decrease the execution time of the program."
True or False? Justify your answer.
13. What is operand forwarding? What is its significance?
14. Discuss the various types of data hazards in a RISC instruction pipeline with appropriate
examples.
15. Consider an instruction pipeline with four stages with the stage delays 5 nsec, 6 nsec, 11
nsec, and 8 nsec respectively. The delay of an inter-stage register stage of the pipeline is 1
nsec. What is the approximate speedup of the pipeline in the steady state under ideal
conditions as compared to the corresponding non-pipelined implementation?
16. Discuss structural hazards and control hazards with examples
17. A 5-stage pipelined processor has the stages: Instruction Fetch (IF), Instruction Decode
(ID), Operand Fetch (OF), Execute (EX) and Write Operand (WO). The IF, ID, OF, and
WO stages take 1 clock cycle each for any instruction. The EX stage takes 1 clock cycle for
ADD and SUB instructions, 3 clock cycles for MUL instruction, and 6 clock cycles for
DIV instruction. Operand forwarding is used in the pipeline (for data dependency, OF stage
of the dependent instruction can be executed only after the previous instruction completes
EX). What is the number of clock cycles needed to execute the following sequence of
instructions?

MUL R2,R10,R1

DIV R5,R3,R4

ADD R2,R5,R2

SUB R5,R2,R6
18. A 5-stage pipelined processor has Instruction Fetch (IF), Instruction Decode (ID), Operand
Fetch (OF), Perform Operation (PO) and Write Operand (WO) stages. The IF, ID, OF and
WO stages take 1 clock cycle each for any instruction. The PO stage takes 1 clock cycle for
ADD and SUB instructions, 3 clock cycles for MUL instruction, and 6 clock cycles for
DIV instruction respectively. Operand forwarding is used in the pipeline. What is the
number of clock cycles needed to execute the following sequence of instructions?

Instruction            Meaning of instruction
I0: MUL R2, R0, R1     R2 = R0 * R1
I1: DIV R5, R3, R4     R5 = R3 / R4
I2: ADD R2, R5, R2     R2 = R5 + R2
I3: SUB R5, R2, R6     R5 = R2 - R6

19. The instruction pipeline of a RISC processor has the following stages: Instruction Fetch
(IF), Instruction Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Writeback
(WB). The IF, ID, OF and WB stages take 1 clock cycle each for every instruction.
Consider a sequence of 100 instructions. In the PO stage, 40 instructions take 3 clock
cycles each, 35 instructions take 2 clock cycles each, and the remaining 25 instructions take
1 clock cycle each. Assume that there are no data hazards and no control hazards. How
many clock cycles are required for completion of execution of the sequence of instructions?
20. With a neat diagram, explain the classic five stage pipeline for a RISC Processor.
21. Explain various types of hazard mitigation techniques
22. List characteristics of RISC?
23. List different types of pipeline hazards?
24. Consider the following sequence of instructions being processed on a pipelined 5-stage
RISC processor:
Add R4, R2, R3
Store R5, #100(R4)
Load R6, #200(R4)
Subtract R7, R5, R6
Identify all the data dependencies in the above instruction sequence. For each
dependency, indicate the two instructions and the register that causes the dependency.
25. Explain data hazards with an example. Illustrate the forwarding method to minimize data
hazards.
26. Explain the three classes of instructions in RISC with examples.
27. What is pipelining? Explain the five-stage pipeline for a RISC processor with an example.
28. Explain pipelined data path and control.
29. Explain the different pipeline hazards with examples.
Problem 1
Ans: 8
Problem 2
• The instruction pipeline of a RISC processor has the
following stages: Instruction Fetch (IF), Instruction
Decode (ID), Operand Fetch (OF), Perform Operation
(PO) and Writeback (WB). The IF, ID, OF and WB stages
take 1 clock cycle each for every instruction. Consider a
sequence of 100 instructions. In the PO stage, 40
instructions take 3 clock cycles each, 35 instructions
take 2 clock cycles each, and the remaining 25
instructions take 1 clock cycle each. Assume that there
are no data hazards and no control hazards. The
number of clock cycles required for completion of
execution of the sequence of instructions is ______.
• Explanation: Given, total number of instructions (n) =
100
Number of stages (k) = 5
An instruction that takes c cycles in the PO stage holds that
stage for (c - 1) extra cycles, so it adds (c - 1) stall cycles.
• Therefore, the number of clock cycles required = Total
number of cycles required in the general case + Extra cycles
required (here, in the PO stage)
= (n + k - 1) + Extra cycles
= (100 + 5 - 1) + 40*(3-1) + 35*(2-1) + 25*(1-1)
= (100 + 4) + 40*2 + 35*1 + 25*0
= 104 + 115
= 219 cycles
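The count above can be checked with a short Python sketch. The function name total_cycles and the representation of the workload as a list of PO cycle counts are illustrative assumptions, not part of the original problem; the model assumes no data or control hazards, as the problem states.

```python
def total_cycles(num_stages, po_cycles):
    """Cycles for an in-order pipeline where only the PO stage can take
    more than one cycle and there are no data or control hazards.

    po_cycles holds the number of PO cycles of each instruction.
    """
    n = len(po_cycles)
    base = n + num_stages - 1                 # ideal pipeline: n + k - 1
    extra = sum(c - 1 for c in po_cycles)     # each instruction adds (c - 1) stalls
    return base + extra


# 40 instructions with 3 PO cycles, 35 with 2, and 25 with 1
workload = [3] * 40 + [2] * 35 + [1] * 25
print(total_cycles(5, workload))
```

With the workload of Problem 2 this reproduces the 104 + 115 = 219 cycles derived above.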
Problem 3
• A 5-stage pipelined processor has Instruction Fetch (IF), Instruction
Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Write
Operand (WO) stages. The IF, ID, OF and WO stages take 1 clock
cycle each for any instruction. The PO stage takes 1 clock cycle for
ADD and SUB instructions, 3 clock cycles for MUL instruction, and 6
clock cycles for DIV instruction respectively. Operand forwarding is
used in the pipeline. What is the number of clock cycles needed to
execute the following sequence of instructions?
Instruction            Meaning of instruction
• I0: MUL R2, R0, R1     R2 = R0 * R1
• I1: DIV R5, R3, R4     R5 = R3 / R4
• I2: ADD R2, R5, R2     R2 = R5 + R2
• I3: SUB R5, R2, R6     R5 = R2 - R6
solution
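One way to work Problem 3 is to schedule the PO stage cycle by cycle. The sketch below is only a model under stated assumptions: instructions issue in order, there is a single PO unit, and operand forwarding lets a dependent instruction start PO in the cycle right after its producer finishes PO, so the only stalls come from waiting for the PO unit. The names schedule_po and front_stages are hypothetical.

```python
def schedule_po(po_cycles, front_stages=3):
    """Return (po_start, po_end, wo_cycle) per instruction.

    Assumes in-order issue, one PO unit, and forwarding from the end of
    PO straight into the next PO (no extra data-hazard stalls).
    front_stages counts the one-cycle stages before PO (IF, ID, OF).
    """
    timeline = []
    po_free_at = 1                            # first cycle in which PO is free
    for i, cycles in enumerate(po_cycles):
        earliest = (i + 1) + front_stages     # instruction i is fetched in cycle i + 1
        start = max(earliest, po_free_at)
        end = start + cycles - 1
        po_free_at = end + 1
        timeline.append((start, end, end + 1))  # WO follows PO
    return timeline


# I0 MUL (3), I1 DIV (6), I2 ADD (1), I3 SUB (1)
timeline = schedule_po([3, 6, 1, 1])
for name, (start, end, wo) in zip(["MUL", "DIV", "ADD", "SUB"], timeline):
    print(f"{name}: PO cycles {start}-{end}, WO in cycle {wo}")
print("total cycles:", timeline[-1][2])
```

Under these assumptions MUL occupies PO in cycles 4 to 6, DIV in cycles 7 to 12, ADD in cycle 13 and SUB in cycle 14, so the last write completes in cycle 15.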
Problem 4
• A 5-stage pipelined processor has the stages: Instruction Fetch (IF),
Instruction Decode (ID), Operand Fetch (OF), Execute (EX) and Write
Operand (WO). The IF, ID, OF, and WO stages take 1 clock cycle each
for any instruction. The EX stage takes 1 clock cycle for ADD and
SUB instructions, 3 clock cycles for MUL instruction, and 6 clock
cycles for DIV instruction. Operand forwarding is used in the
pipeline (for data dependency, OF stage of the dependent
instruction can be executed only after the previous instruction
completes EX). What is the number of clock cycles needed to
execute the following sequence of instructions?
• MUL R2,R10,R1
• DIV R5,R3,R4
• ADD R2,R5,R2
• SUB R5,R2,R6
solution
Problem 5
• Consider an instruction pipeline with four
stages with the stage delays 5 nsec, 6 nsec, 11
nsec, and 8 nsec respectively. The delay of an
inter-stage register stage of the pipeline is 1
nsec. What is the approximate speedup of the
pipeline in the steady state under ideal
conditions as compared to the corresponding
non-pipelined implementation?
solution
• Consider an instruction pipeline with four
stages (S1, S2, S3 and S4) each with
combinational circuit only. The pipeline
registers are required between each stage and
at the end of the last stage. Delays for the
stages and for the pipeline registers are as
given in the figure:
•
• Explanation:
• Pipeline register overhead is not counted in the
non-pipelined execution time.
• So the non-pipelined time per instruction is
• 5 + 6 + 11 + 8 = 30 ns [without pipeline]
• With pipelining, the clock period is set by the slowest
stage, 11 ns, plus 1 ns of register overhead. In the steady
state one result is produced every pipeline cycle, so the
cycle time is 11 + 1 = 12 ns.
• Speedup = 30 / 12 = 2.5
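The same steady-state calculation can be written as a small Python sketch. The function name steady_state_speedup is an illustrative assumption; the model follows the explanation above, where register overhead counts only in the pipelined case.

```python
def steady_state_speedup(stage_delays_ns, register_delay_ns):
    """Speedup of a pipeline over the non-pipelined design in steady state.

    Non-pipelined time is the sum of the stage delays (no register overhead).
    Pipelined cycle time is the slowest stage plus the register delay.
    """
    non_pipelined = sum(stage_delays_ns)
    cycle_time = max(stage_delays_ns) + register_delay_ns
    return non_pipelined / cycle_time


print(steady_state_speedup([5, 6, 11, 8], 1))
```

For the stage delays of Problem 5 this evaluates 30 / 12.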
Practice problem
solution
Practice problem
• Solution-
•
• Given-
• Four stage pipeline is used
• Delay of stages = 60, 50, 90 and 80 ns
• Latch delay or delay due to each register = 10 ns
•
• Part-01: Pipeline Cycle Time-
•
• Cycle time
• = Maximum delay due to any stage + Delay due to its register
• = Max { 60, 50, 90, 80 } + 10 ns
• = 90 ns + 10 ns
• = 100 ns
•
• Part-02: Non-Pipeline Execution Time-
•
• Non-pipeline execution time for one instruction
• = 60 ns + 50 ns + 90 ns + 80 ns
• = 280 ns
• Part-03: Speed Up Ratio-
•
• Speed up
• = Non-pipeline execution time / Pipeline execution time
• = 280 ns / Cycle time
• = 280 ns / 100 ns
• = 2.8
•
• Part-04: Pipeline Time For 1000 Tasks-
•
• Pipeline time for 1000 tasks
• = Time taken for 1st task + Time taken for remaining 999 tasks
• = 1 x 4 clock cycles + 999 x 1 clock cycle
• = 4 x cycle time + 999 x cycle time
• = 4 x 100 ns + 999 x 100 ns
• = 400 ns + 99900 ns
• = 100300 ns
• Part-05: Sequential Time For 1000 Tasks-
•
• Non-pipeline time for 1000 tasks
• = 1000 x Time taken for one task
• = 1000 x 280 ns
• = 280000 ns
•
• Part-06: Throughput-
•
• Throughput for pipelined execution
• = Number of instructions executed per unit time
• = 1000 tasks / 100300 ns
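The six parts of this practice problem follow one pattern, which the Python sketch below collects in a single function. The name pipeline_metrics and the returned dictionary keys are illustrative assumptions; times are in nanoseconds.

```python
def pipeline_metrics(stage_delays_ns, latch_delay_ns, num_tasks):
    """Cycle time, speedup, and total times for a linear pipeline."""
    k = len(stage_delays_ns)
    cycle_time = max(stage_delays_ns) + latch_delay_ns
    non_pipelined_one = sum(stage_delays_ns)
    # First task needs k cycles, each remaining task one more cycle.
    pipelined_total = (k + num_tasks - 1) * cycle_time
    sequential_total = num_tasks * non_pipelined_one
    return {
        "cycle_time": cycle_time,
        "speedup": non_pipelined_one / cycle_time,
        "pipelined_total": pipelined_total,
        "sequential_total": sequential_total,
        "throughput_per_ns": num_tasks / pipelined_total,
    }


metrics = pipeline_metrics([60, 50, 90, 80], 10, 1000)
for key, value in metrics.items():
    print(key, value)
```

The throughput value is simply 1000 tasks divided by 100300 ns, which is a little under 0.01 tasks per nanosecond, or roughly ten million tasks per second.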
•
Practice problem
solution
• Solution-
•
• Given-
• Four stage pipeline is used
• Delay of stages = 150, 120, 160 and 140 ns
• Delay due to each register = 5 ns
• 1000 data items or instructions are processed
•
• Cycle Time-
•
• Cycle time
• = Maximum delay due to any stage + Delay due to its register
• = Max { 150, 120, 160, 140 } + 5 ns
• = 160 ns + 5 ns
• = 165 ns
• Pipeline Time To Process 1000 Data Items-
•
• Pipeline time to process 1000 data items
• = Time taken for 1st data item + Time taken for
remaining 999 data items
• = 1 x 4 clock cycles + 999 x 1 clock cycle
• = 4 x cycle time + 999 x cycle time
• = 4 x 165 ns + 999 x 165 ns
• = 660 ns + 164835 ns
• = 165495 ns
• = 165.5 μs
Pipelining
Outline
Definition of pipeline
Advantages and disadvantages
Types of pipeline (hardware and software)
Latency and throughput
Hazards
Pipeline
Pipelining is a technique of decomposing a sequential
process into suboperations, with each
suboperation completed in a dedicated
segment that operates concurrently with
all other segments.
A pipeline is commonly likened to an assembly
line operation.
Example

Each suboperation is performed in
a segment within a pipeline. Each
segment has one or two registers and a
combinational circuit.
The suboperations in each segment of the
pipeline are as follows:

Latency and throughput
Latency
Each instruction takes a certain time to
complete.
The latency of an operation is the time it
takes a single instruction to pass through the
pipeline from start to finish.
Throughput
The number of instructions that complete
per second.
Advantages
1- Pipelining is widely used in modern processors.
2- A large number of instructions executes in less
time.
3- The processor is used more efficiently.
4- The hardware is arranged so that more than one
operation can be performed at the same time.
5- The technique is efficient for applications that need
to repeat the same task many times with different
sets of data.
Disadvantages
1- Pipelined organization requires complex
compilation techniques.
2- Pipelining involves adding hardware, which
increases the cost of the system.
Idea of pipelining in computer
The processor executes a program by
fetching and executing instructions, one after
the other.

Let Fi and Ei refer to the fetch and execute
steps for instruction Ii.
Stages of pipelining

Fetch(F)- read the instruction from the memory

Decode(D)- Decode the instruction and fetch the
source operand
Execute(E)- perform the operation specified by the
instruction
Write(W)- store the result in the destination location
Use the Idea of Pipelining in a
Computer
Fetch + Execution

(a) Sequential execution
Time:  F1 E1 (I1), F2 E2 (I2), F3 E3 (I3)

(b) Hardware organization
Instruction fetch unit -> Interstage buffer B1 -> Execution unit

(c) Pipelined execution
Clock cycle   1   2   3   4
I1            F1  E1
I2                F2  E2
I3                    F3  E3

Figure 8.1. Basic idea of instruction pipelining.
Use the Idea of Pipelining in a
Computer
Fetch + Decode + Execution + Write

Clock cycle   1   2   3   4   5   6   7
I1            F1  D1  E1  W1
I2                F2  D2  E2  W2
I3                    F3  D3  E3  W3
I4                        F4  D4  E4  W4

(a) Instruction execution divided into four steps

F: Fetch instruction -> B1 -> D: Decode instruction and fetch operands -> B2 -> E: Execute operation -> B3 -> W: Write results
(B1, B2, B3 are interstage buffers)

(b) Hardware organization

Figure 8.2. A 4-stage pipeline.
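The space-time diagram of Figure 8.2 can be generated for any number of instructions. The Python sketch below assumes an ideal pipeline in which every stage takes one clock cycle and nothing stalls; the function names timing_chart and render are illustrative, not from the source.

```python
def timing_chart(num_instructions, stages=("F", "D", "E", "W")):
    """Rows of an ideal pipeline space-time diagram (one cycle per stage)."""
    total_cycles = num_instructions + len(stages) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = f"{name}{i + 1}"   # instruction i enters stage s in cycle i + s + 1
        rows.append(row)
    return rows


def render(rows):
    """Format the rows as a plain-text table with a clock-cycle header."""
    header = "Clock cycle " + " ".join(f"{c:>3}" for c in range(1, len(rows[0]) + 1))
    lines = [header]
    for i, row in enumerate(rows, start=1):
        lines.append(f"I{i:<11}" + " ".join(f"{cell:>3}" for cell in row))
    return "\n".join(lines)


print(render(timing_chart(4)))
```

For four instructions the chart spans 4 + 4 - 1 = 7 clock cycles, matching part (a) of the figure.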

Use the Idea of Pipelining in a
Computer
Consider a computer that has two separate hardware
units, one for fetching instructions and another for
executing them.
The instruction fetched by the fetch unit is
deposited in an intermediate buffer B1.

This buffer is needed so that the execution
unit can execute the instruction while the fetch
unit is fetching the next
instruction.
Role of Cache Memory
Each pipeline stage is expected to complete in one
clock cycle.
The clock period should be long enough to let the
slowest pipeline stage complete.
Faster stages can only wait for the slowest one to
complete.
Since main memory is very slow compared to the
execution, if each instruction had to be fetched
from main memory, pipelining would be almost useless
[a main memory access may take ten times longer
than a pipeline stage].
Fortunately, we have cache.
Types of pipeline
1) Software Pipelining
1) Can Handle Complex Instructions.
2) Allows programs to be reused.
2)Hardware Pipelining
1) Help designer manage complexity – a complex
task can be divided into smaller, more
manageable pieces.
2) Hardware pipelining offers higher performance.
Types of pipeline
Arithmetic Pipeline: Pipelined arithmetic units are
usually found in very high speed computers.
They are used for floating-point operations, multiplication
of fixed-point numbers, and similar computations in scientific
problems.
Instruction Pipeline: Pipeline processing can occur
also in the instruction stream. An instruction pipeline
reads consecutive instructions from memory while
previous instructions are being executed in other
segments. This causes the instruction fetch and
execute phases to overlap and perform
simultaneous operations.
Arithmetic Pipeline
Floating-point adder/subtracter
Inputs: exponents a, b and mantissas A, B, with registers (R) between segments
Segment 1: Compare exponents by subtraction
Segment 2: Choose exponent, align mantissa
Segment 3: Add or subtract mantissas
Segment 4: Adjust exponent, normalize result

Example:
X = A x 10^a = 0.9504 x 10^3
Y = B x 10^b = 0.8200 x 10^2
1) Compare exponents:
3 - 2 = 1
2) Align mantissas
X = 0.9504 x 10^3
Y = 0.08200 x 10^3
3) Add mantissas
Z = 1.0324 x 10^3
4) Normalize result
Z = 0.10324 x 10^4
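The four segments can be traced in software. The Python sketch below works on decimal (mantissa, exponent) pairs and uses the standard decimal module so the digits match the worked example; the function name fp_add and the pair representation are assumptions made for illustration, and real hardware works on binary fields.

```python
from decimal import Decimal


def fp_add(x, y):
    """Add two decimal floating-point numbers given as (mantissa, exponent)."""
    (ma, ea), (mb, eb) = x, y

    # Segment 1: compare the exponents by subtraction.
    diff = ea - eb

    # Segment 2: choose the larger exponent and align the other mantissa.
    if diff >= 0:
        exponent = ea
        mb = mb.scaleb(-diff)
    else:
        exponent = eb
        ma = ma.scaleb(diff)

    # Segment 3: add the mantissas.
    mz = ma + mb

    # Segment 4: normalize so that 0.1 <= |mantissa| < 1 (unless zero).
    while abs(mz) >= 1:
        mz = mz.scaleb(-1)
        exponent += 1
    while mz != 0 and abs(mz) < Decimal("0.1"):
        mz = mz.scaleb(1)
        exponent -= 1
    return mz, exponent


mantissa, exponent = fp_add((Decimal("0.9504"), 3), (Decimal("0.8200"), 2))
print(mantissa, exponent)
```

For X and Y above, the result has a mantissa equal in value to 0.10324 and an exponent of 4. The second normalization loop is not exercised by this example; it covers subtraction results with leading zeros.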
INSTRUCTION CYCLE
Six Phases* in an Instruction Cycle
1 Fetch an instruction from memory
2 Decode the instruction
3 Calculate the effective address of the operand
4 Fetch the operands from memory
5 Execute the operation
6 Store the result in the proper place

* Some instructions skip some phases

* Effective address calculation can be done as part of the decoding phase
* Storage of the operation result into a register is done automatically in the execution
phase

==> 4-Stage Pipeline

1 FI: Fetch an instruction from memory

2 DA: Decode the instruction and calculate the effective address of the operand
3 FO: Fetch the operand
4 EX: Execute the operation
Example: Four-Segment
Instruction Pipeline
Figure 9-8 shows the operation of the instruction pipeline.
The time axis is divided into steps of equal duration.
The four segments are represented in the diagram with an
abbreviated symbol.
1. FI is the segment that fetches an instruction.
2. DA is the segment that decodes the instruction and
calculates the effective address.
3. FO is the segment that fetches the operand.
4. EX is the segment that executes the instruction.
Pipeline Performance
The potential increase in performance resulting from pipelining is
proportional to the number of pipeline stages.
However, this increase would be achieved only if all pipeline stages
require the same time to complete, and there is no interruption
throughout program execution.
Unfortunately, this is not always the case.
A floating-point operation, for example, may take many clock cycles.
Stalling involves halting the flow of instructions until the required
result is ready to be used. However, stalling wastes processor time
by doing nothing while waiting for the result.
A pipeline stall causes degradation in pipeline
performance.
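The effect of stalls on the potential speedup can be made concrete with a small Python sketch. It assumes every stage takes one clock cycle, that a non-pipelined machine needs k cycles per instruction, and that stalls simply add cycles to the pipelined total; the function name speedup and the stall_cycles parameter are illustrative assumptions.

```python
def speedup(num_stages, num_instructions, stall_cycles=0):
    """Speedup of a k-stage pipeline over non-pipelined execution.

    Non-pipelined: k cycles per instruction.
    Pipelined: k + n - 1 cycles, plus any stall cycles.
    """
    sequential = num_stages * num_instructions
    pipelined = num_stages + num_instructions - 1 + stall_cycles
    return sequential / pipelined


for n in (1, 10, 1000):
    print(n, round(speedup(4, n), 3))

# One stall cycle per instruction roughly halves the benefit.
print(round(speedup(4, 1000, stall_cycles=1000), 3))
```

As the number of instructions grows, the ideal speedup approaches the number of stages, and each stall cycle moves it away from that limit.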
Pipeline Performance
Any condition that causes a pipeline to stall is called
a hazard.
Data hazard – A data hazard is any condition in which either the
source or the destination operands of an instruction are not
available at the time expected in the pipeline. As a result, some
operation has to be delayed, and the pipeline stalls.

Instruction (control) hazard – A delay in the availability of
an instruction causes the pipeline to stall [for example, a cache miss].

Structural hazard – The situation when two instructions
require the use of a given hardware resource at the same time.
Pipeline performance

Again, pipelining does not result in individual
instructions being executed faster; rather, it is the
throughput that increases.
Throughput is measured by the rate at which
instruction execution is completed.
Pipeline stall causes degradation in pipeline
performance.
We need to identify all hazards that may cause the
pipeline to stall and to find ways to minimize their
impact.
Data Hazards
We must ensure that the results obtained when instructions are
executed in a pipelined processor are identical to those obtained
when the same instructions are executed sequentially.
Hazard occurs
A ← 3 +A
B ← 4 ×A
No hazard
A←5×C
B ← 20 + C
When two operations depend on each other, they must be
executed sequentially in the correct order.
Another example (the destination is the last operand, so Mul writes R4 and Add then reads it):
Mul R2, R3, R4
Add R5, R4, R6
Data Hazards
Mul R2, R3, R4
Add R5, R4, R6

[Timing diagram over clock cycles 1 to 9]
I1 (Mul): F1 D1 E1 W1
I2 (Add): F2 D2 D2A E2 W2
I3: F3 D3 E3 W3
I4: F4 D4 E4 W4

Figure 8.6. Pipeline stalled by data dependency between D2 and W1.
TYPES OF DATA HAZARDS

1. Read after Write (RAW) :

It is also known as True dependency or Flow dependency.
It occurs when the value produced by an instruction is
required by a subsequent instruction. For example,
ADD R1, --, --;
SUB --, R1, --;
Stalls are required to handle these hazards.
2. Write after Read (WAR) :

It is also known as anti-dependency. These hazards occur
when an instruction writes a register that a previous
instruction reads; the write must not take place before the
earlier read. For example,
ADD --, R1, --; SUB R1, --, --;
3. Write after Write (WAW) :

It is also known as output dependency. These hazards
occur when an instruction writes a register that a previous
instruction also writes; the two writes must complete in
program order. For example,
ADD R1, --, --; SUB R1, --, --;
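The three definitions reduce to simple comparisons of destination and source registers. The Python sketch below classifies the hazards between an earlier and a later instruction; the Instr type, the classify_hazards function, and the registers used to fill the "--" placeholders are all hypothetical choices made for illustration.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Instr(NamedTuple):
    op: str
    dest: Optional[str]          # register written, or None
    srcs: FrozenSet[str]         # registers read


def classify_hazards(first: Instr, second: Instr) -> List[str]:
    """Data hazards that 'second' has with respect to the earlier 'first'."""
    hazards = []
    if first.dest is not None and first.dest in second.srcs:
        hazards.append("RAW")    # second reads what first writes
    if second.dest is not None and second.dest in first.srcs:
        hazards.append("WAR")    # second writes what first reads
    if first.dest is not None and first.dest == second.dest:
        hazards.append("WAW")    # both write the same register
    return hazards


# Placeholder registers R2..R6 stand in for the "--" fields.
raw = classify_hazards(Instr("ADD", "R1", frozenset({"R2", "R3"})),
                       Instr("SUB", "R4", frozenset({"R1", "R5"})))
war = classify_hazards(Instr("ADD", "R4", frozenset({"R1", "R2"})),
                       Instr("SUB", "R1", frozenset({"R5", "R6"})))
waw = classify_hazards(Instr("ADD", "R1", frozenset({"R2", "R3"})),
                       Instr("SUB", "R1", frozenset({"R4", "R5"})))
print(raw, war, waw)
```

A pair of instructions can show more than one hazard at once, which is why the function returns a list.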
Handling Data Hazards :

These are various methods we use to handle hazards:

1. Data Forwarding,
2. Code reordering
3. Stall insertion.
Operand Forwarding

Instead of reading from the register file, the second instruction
can get the data directly from the output of the ALU after the
previous instruction is completed.
A special arrangement needs to be made to "forward"
the output of the ALU to the input of the ALU.
Handling Data Hazards in
Software
Let the compiler detect and handle the hazard:
I1: Mul R2, R3, R4
NOP
NOP
I2: Add R5, R4, R6
The compiler can reorder the instructions to perform
some useful work during the NOP slots.
Data dependency solutions
Code reordering: A special type of software is needed to
reorder the code. This type of software is called a
hardware-dependent compiler.
Operand forwarding: uses special hardware to detect a conflict and
then avoid it by routing the data through special paths between
pipeline segments.
Stall insertion: inserts one or more stalls (no-op instructions) into
the pipeline, which delay the execution of the current instruction until
the required operand is written to the register file. This method
decreases pipeline efficiency and throughput.
Example
I1: Mul R2, R3, R4
NOP
NOP
I2: Add R5, R4, R6
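The software approach can be sketched as a pass that inserts NOPs wherever a consumer follows its producer too closely. The Python sketch below is a minimal illustration, assuming a fixed gap of two instruction slots between a register write and a dependent read, as in the example above; the tuple representation and the insert_nops name are hypothetical, and a real compiler would try to fill the slots with useful instructions instead.

```python
NOP = ("NOP", None, frozenset())


def insert_nops(program, gap=2):
    """Insert NOPs so a reader is at least `gap` slots after its producer.

    Each instruction is a tuple (text, dest_register, source_registers).
    """
    out = []
    for instr in program:
        srcs = instr[2]
        needed = 0
        # Look at the last `gap` emitted slots for a producer of our sources.
        for distance in range(1, gap + 1):
            if distance > len(out):
                break
            producer_dest = out[-distance][1]
            if producer_dest is not None and producer_dest in srcs:
                needed = max(needed, gap - distance + 1)
        out.extend([NOP] * needed)
        out.append(instr)
    return out


# Destination is the last operand: Mul writes R4, Add reads R4.
program = [
    ("Mul R2, R3, R4", "R4", frozenset({"R2", "R3"})),
    ("Add R5, R4, R6", "R6", frozenset({"R5", "R4"})),
]
for text, _, _ in insert_nops(program):
    print(text)
```

If an independent instruction already sits between the producer and the consumer, only one NOP is inserted.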
Instruction Hazards (control hazard
or branch hazard)
One of the major problems in operating an instruction
pipeline is the occurrence of branch instructions.
1- An unconditional branch always changes the sequential
program flow by loading the program counter with the
target address.
2- In a conditional branch, the control selects the target
instruction if the condition is satisfied or the next
sequential instruction if the condition is not satisfied.
1-Unconditional Branches

Clock cycle   1   2   3   4   5   6
I1            F1  E1
I2 (Branch)       F2  E2
I3                    F3  X
Ik                        Fk  Ek
Ik+1                          Fk+1 Ek+1

The execution unit is idle in cycle 4.

Figure 8.8. An idle cycle caused by a branch instruction.

Unconditional Branches
The time lost as a result of a branch
instruction is referred to as the branch
penalty.
In the previous example, instruction I3 is
wrongly fetched; once the branch to the target
instruction Ik is resolved, I3 is discarded.
Typically, the fetch unit has dedicated hardware
that identifies the branch target address
as quickly as possible after an instruction is
fetched.
Instruction Queue and Prefetching
A branch instruction stalls the pipeline.
Many processors employ a dedicated fetch unit
that fetches instructions and puts them
into a queue.
The queue can store several instructions at a time.
A separate unit, called the dispatch unit, takes
instructions from the front of the queue and
sends them to the execution unit.
Instruction Queue and
Prefetching
[Diagram: instruction fetch unit (F: Fetch instruction) -> instruction queue -> D: Dispatch/Decode unit -> E: Execute instruction -> W: Write results]

Figure 8.10. Use of an instruction queue in the hardware organization of Figure 8.2b.
2- Conditional Branches
A conditional branch instruction introduces
the added hazard caused by the dependency
of the branch condition on the result of a
previous instruction.
The decision to branch cannot be made until
the execution of that instruction has been
completed.
Delayed Branch
LOOP    Shift_left   R1
        Decrement    R2
        Branch=0     LOOP
NEXT    Add          R1,R3

(a) Original program loop

LOOP    Decrement    R2
        Branch=0     LOOP
        Shift_left   R1
NEXT    Add          R1,R3

(b) Reordered instructions

Figure 8.12. Reordering of instructions for a delayed branch.

DELAYED BRANCH
Incorrectly Predicted Branch
Branch Prediction

Better performance can be achieved if we arrange for
some branch instructions to be predicted as taken and
others as not taken.
Hardware can observe whether the target address is
lower or higher than that of the branch instruction.
Alternatively, the compiler can include a branch prediction bit.
In both cases the branch prediction decision is the same
every time a given instruction is executed; this is static
branch prediction.
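A direction-based static predictor can be sketched in a few lines of Python. The notes do not say which direction should be predicted as taken; the sketch assumes the common heuristic that backward branches (target address lower than the branch address, as at the bottom of a loop) are predicted taken and forward branches are predicted not taken. The function names and addresses are hypothetical.

```python
def predict_taken(branch_addr, target_addr):
    """Static prediction: assume backward branches are taken (an assumption)."""
    return target_addr < branch_addr


def accuracy(branch_addr, target_addr, outcomes):
    """Fraction of actual outcomes that match the fixed static prediction."""
    prediction = predict_taken(branch_addr, target_addr)
    hits = sum(1 for taken in outcomes if taken == prediction)
    return hits / len(outcomes)


# A loop-closing branch at 0x108 jumping back to 0x100:
# taken nine times, then not taken when the loop exits.
loop_outcomes = [True] * 9 + [False]
print(predict_taken(0x108, 0x100), accuracy(0x108, 0x100, loop_outcomes))
```

Because the prediction never changes for a given branch, the only misprediction in this loop is the final exit.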
RISC pipeline
• RISC (Reduced Instruction Set Computer)
• 1- RISC is designed to use an efficient instruction pipeline:
• a) The instruction pipeline is implemented using a small number of
suboperations, with each being executed in one clock cycle.
• b) Because of the fixed-length instruction format, the decoding of
the operation can occur at the same time as register selection.
• 2- Data transfer instructions in RISC are limited to load and store
instructions, using cache memory.
• 3- One of the major advantages of RISC is the ability to execute
instructions at the rate of one per clock cycle, which can be achieved
with pipeline segments requiring just one clock cycle.
• 4- RISC relies on support from the compiler that translates the
high-level language program into a machine language program.
RISC pipeline
Instruction Cycle of Three-Stage Instruction Pipeline.

I: Instruction fetch from program memory

A: Decode, read registers, ALU operation

E: Transfer the output of the ALU to a register, transfer the EA to
data memory for loading or storing, or transfer the branch address to the
program counter.

Types of instructions
- 1- Data Manipulation Instructions
- 2- Load and Store Instructions
- 3- Program Control Instructions
RISC instruction
classification
Data Manipulation Instructions − Manage the data in
processor registers.
Data Transfer Instructions − These are load and store
instructions that use an effective address that is obtained by
adding the contents of two registers or a register and a
displacement constant provided in the instruction.
Program Control Instructions − These instructions use
register values and a constant to evaluate the branch
address, which is transferred to a register or the program
counter (PC).
Datapath and Control Considerations
RISC

MODULE 3
INTRODUCTION TO RISC
INSTRUCTION SET
• Processors are broadly classified into RISC and
CISC architectures based on the design
of their instruction sets.
• Reduced Instruction Set Architecture (RISC) –
The main idea is to make the hardware
simpler by using an instruction set composed of a
few basic steps for loading, evaluating, and
storing operations; for example, a load command only
loads data and a store command only stores data.
• Complex Instruction Set Architecture (CISC) –
The main idea is that a single instruction does all the loading,
evaluating, and storing operations; for example, a multiplication
command loads the data, evaluates, and stores the result,
hence it is complex.
• Both approaches try to increase CPU performance.
• RISC: Reduces the cycles per instruction at the cost of the number of
instructions per program.
• CISC: Attempts to minimize the number of
instructions per program, but at the cost of an increase in the number of
cycles per instruction.
Characteristics of RISC –
• Simpler instructions, hence simple instruction decoding.
• Each instruction fits within one word.
• An instruction takes a single clock cycle to execute.
• More general-purpose registers.
• Simple addressing modes.
• Fewer data types.
• Pipelining can be achieved.

Characteristics of CISC –
• Complex instructions, hence complex instruction decoding.
• Instructions are larger than one word.
• An instruction may take more than a single clock cycle to
execute.
• Fewer general-purpose registers, as operations are
performed in memory itself.
• Complex addressing modes.
• More data types.

Example
• Example – Suppose we have to add two 8-bit numbers:

• CISC approach: There will be a single command or instruction for
this, such as ADD, which performs the whole task.

• RISC approach: Here the programmer first writes load commands
to bring the data into registers, then uses a suitable operator, and then
stores the result in the desired location.

• So the add operation is divided into parts, i.e. load, operate, and store.
As a result, RISC programs are longer and require more memory to
store, but the hardware requires fewer transistors because the commands are less complex.
comparison
RISC
Risc example
Load-store architecture
• MIPS is a load-store architecture. What is a
load-store architecture?
• Only load and store instructions access the
memory, all other instructions use registers as
operands. What is the motivation? Primary
motivation is speedup –registers are faster.
• Reduced Instruction Set Computers (RISC) The
instruction set has only a small number of
frequently used instructions. This lowers
processor cost, without much impact on
performance. All instructions have the same
length. Load-store architecture.
• Non-RISC machines are called CISC (Complex
Instruction Set Computer). Example: Pentium
Example
LOAD/STORE architecture
• A processor architecture that uses a small and
highly optimized set of instructions is termed a
Reduced Instruction Set Computer, or simply RISC.
It is also called a LOAD/STORE architecture.

• In the late 1970s and early 1980s, RISC projects were
primarily developed at Stanford, UC Berkeley and IBM.
John Cocke of the IBM research team developed RISC by
reducing the number of instructions required so that
computations are processed faster than on CISC. The RISC architecture is
faster, and the chips required to manufacture it
are also less expensive than those for the CISC
architecture.
RISC ARCHITECTURE
Typical Features of RISC Architecture

• Pipelining: RISC executes multiple parts, or stages, of
instructions simultaneously, so that the processing of every instruction on the CPU
is optimized. Hence, RISC processors aim for a cycles per instruction (CPI)
of one, which is called one-cycle execution.
• RISC optimizes register usage: with more registers available,
many interactions with memory can
be avoided.
• Simple addressing modes are used; even complex addressing can be done by
using arithmetic and/or logical operations.
• It simplifies compiler design by using identical general-purpose
registers, which allows any register to be used in any context.
• A reduced instruction set is required for efficient use of the
registers and optimization of the pipeline.
• The number of bits used for the opcode is reduced.
• In general, there are 32 or more registers in a RISC processor.
Advantages of RISC processor
architecture
• Because RISC has a small set of instructions, high-level language
compilers can produce more efficient code.
• The simplicity of RISC leaves more freedom in how space on the
microprocessor is used.
• Instead of using the stack, many RISC processors use registers for
passing arguments and holding local variables.
• RISC functions use only a few parameters, and RISC processors
use fixed-length
instructions, which are easy to pipeline.
• The speed of operation can be maximized and the execution
time can be minimized.
• Only a few instruction formats (fewer than four), a small
number of instructions (around 150) and a few addressing modes
(fewer than four) are needed.
