Processor Structure and Reduced Instruction Set
MODULE 5
Module 5: Processor Structure and Reduced Instruction Set
• Processor organization
• Register organization
• Instruction cycle
• Instruction pipelining
• Processor Organization for Pipelining
• Instruction Execution Characteristics
• The Use of a Large Register File
• Compiler-Based Register Optimization
• Reduced Instruction Set Architecture
Processor Functions
• A processor must perform several functions:
• Fetch instruction: Reads instructions from memory.
• Interpret instruction: Decodes the instruction to
determine the operation.
• Fetch data: Reads data from memory or I/O devices.
• Process data: Performs arithmetic or logical operations.
• Write data: Stores results in memory or sends them to an
I/O device.
• To achieve these tasks efficiently, the processor contains
registers, an ALU (Arithmetic and Logic Unit), and a Control
Unit (CU).
Register Organization
• Registers act as high-speed memory within the processor. They can be
categorized into:
1. User-Visible Registers:
• Used in assembly-level programming to minimize memory access.
• Types:
• General-purpose registers: Store operands for any operation.
• Data registers: Hold integer or floating-point values.
• Address registers: Hold memory addresses used by instructions (e.g., index or segment registers).
• Condition code registers: Store flags like zero, carry, overflow, etc.
Register Organization
2. Control and Status Registers:
• Used by the processor and OS to manage execution.
• Common registers:
• Program Counter (PC): Stores the address of the next instruction.
• Instruction Register (IR): Holds the fetched instruction.
• Memory Address Register (MAR): Holds the address of data in memory.
• Memory Buffer Register (MBR): Temporarily stores data read from or written to memory.
• Program Status Word (PSW): Holds condition codes, interrupt enable/disable bits, and execution mode.
Instruction Cycle
The instruction cycle consists of fetch, decode, execute, and interrupt handling stages:
1. Fetch: Reads the instruction from memory into the Instruction Register (IR).
2. Decode: Determines the operation and required operands.
3. Execute: Performs the operation using the ALU or registers.
4. Interrupt Handling (if needed): Saves the current state and executes the interrupt
service routine.
• If indirect addressing is used, an indirect cycle fetches the actual operand address.
• The cycle repeats for each instruction in the program.
Data Flow in the Processor
Data moves between the PC, MAR, MBR, IR, and ALU during execution:
1. Fetch Cycle:
• PC → MAR → Address Bus
• Memory → MBR → IR (Instruction is fetched)
• PC is incremented for the next instruction
2. Indirect Cycle (if needed):
• MBR (holds the indirect address) → MAR → memory read
• Memory → MBR (MBR now holds the actual operand address)
Data Flow in the Processor
Data moves between the PC, MAR, MBR, IR, and ALU during execution:
3. Execute Cycle:
• ALU operates on data from registers or memory.
• The result is stored in a register or memory location.
4. Interrupt Cycle:
• Current PC is saved to memory; PC is then loaded with the address of the interrupt service routine
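• A toy Python sketch of the fetch-cycle data flow (step 1 above) through PC, MAR, MBR, and IR; the dictionary used as memory and the string instruction encoding are invented for illustration:

    memory = {0: "LOAD R1, 100", 1: "ADD R1, R2", 2: "HALT"}  # toy program

    class DataPath:
        def __init__(self):
            self.PC, self.MAR, self.MBR, self.IR = 0, None, None, None

        def fetch(self):
            self.MAR = self.PC           # PC -> MAR -> address bus
            self.MBR = memory[self.MAR]  # memory -> MBR
            self.IR = self.MBR           # MBR -> IR (instruction fetched)
            self.PC += 1                 # PC incremented for the next instruction

    cpu = DataPath()
    while True:
        cpu.fetch()
        print(f"PC={cpu.PC}  IR={cpu.IR}")
        if cpu.IR == "HALT":
            break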
Instruction Pipelining
Pipelining Strategy
• Instruction pipelining is a technique used to improve CPU performance by overlapping
instruction execution, much like an assembly line in a factory.
• Instead of executing one instruction at a time, the processor breaks the execution into stages,
with multiple instructions being processed simultaneously.
• Basic Two-Stage Pipeline
1. Fetch Instruction (FI): Reads the instruction from memory.
2. Execute Instruction (EI): Decodes and executes the instruction.
• This approach improves speed by allowing a new instruction to be fetched while another is
being executed. However, execution time varies, and branch instructions cause delays.
Instruction Pipelining
Six-Stage Instruction Pipeline
To optimize performance, instruction processing can be broken into more stages:
1. Fetch Instruction (FI): Reads the instruction into a buffer.
2. Decode Instruction (DI): Determines the opcode and operands.
3. Calculate Operands (CO): Computes effective addresses.
4. Fetch Operands (FO): Retrieves operands from memory or registers.
5. Execute Instruction (EI): Performs the computation.
6. Write Operand (WO): Stores the result.
• With a six-stage pipeline, multiple instructions are in different stages simultaneously. If each stage takes the same amount of time, total execution time is significantly reduced.
• Challenges: Memory conflicts, branch instructions, and interrupts can stall the pipeline.
Pipeline Performance
• For a k-stage pipeline with cycle time τ, the time needed to execute n instructions is:
  T_k = [k + (n - 1)]τ
• The first instruction takes k cycles to fill the pipeline; each remaining instruction completes one cycle later.
• The speedup over a nonpipelined processor that takes k cycles per instruction is:
  S_k = T_1 / T_k = nkτ / [k + (n - 1)]τ = nk / (k + n - 1)
• As n grows large, S_k approaches k, the number of pipeline stages.
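• A minimal sketch of these formulas in Python (k, n, and τ are the only inputs; hazards and stalls are not modeled):

    def pipeline_time(k, n, tau):
        # Time for n instructions on a k-stage pipeline with cycle time tau:
        # the first instruction takes k cycles; each later one adds one cycle.
        return (k + (n - 1)) * tau

    def pipeline_speedup(k, n):
        # Speedup over a nonpipelined processor taking k cycles per instruction.
        return (n * k) / (k + n - 1)

    print(pipeline_speedup(5, 100))   # ~4.81: approaches k = 5 for large n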
Pipeline Hazards
Hazards occur when instruction dependencies prevent continuous execution.
Three types of hazards exist:
1. Resource Hazards (Structural Hazards)
• Occurs when multiple instructions require the same hardware resource.
• Example: If a memory read and an instruction fetch cannot occur simultaneously, the
pipeline must stall.
• Solution: Increase hardware resources (e.g., multiple memory ports, multiple ALUs).
Pipeline Hazards
2. Data Hazards
• Occurs when an instruction depends on the result of a previous instruction still in the pipeline
• Types:
• Read After Write (RAW) Hazard: A register read occurs before the previous instruction writes to it.
• Write After Read (WAR) Hazard: A write occurs before a previous instruction reads from the same location.
• Write After Write (WAW) Hazard: Two instructions write to the same location out of order.
• Example of RAW Hazard:
• ADD EAX, EBX ; EAX = EAX + EBX
• SUB ECX, EAX ; ECX = ECX - EAX (EAX is not ready)
• The pipeline must stall for EAX to be updated before being used.
• Solution:
• Forwarding (Bypassing): Pass data directly to dependent instructions.
• Pipeline Stalling: Delay execution until data is ready.
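• A minimal Python sketch that detects the RAW hazard in the example above (the instruction tuples and the two-instruction hazard window are illustrative assumptions, not a real pipeline model):

    # Each instruction: (mnemonic, destination register, source registers)
    program = [
        ("ADD", "EAX", ("EAX", "EBX")),   # EAX = EAX + EBX
        ("SUB", "ECX", ("ECX", "EAX")),   # reads EAX -> RAW on the ADD
        ("MOV", "EDX", ("EBX",)),
    ]

    def find_raw_hazards(instrs, window=2):
        # Report source registers read within `window` instructions of the
        # instruction that writes them: close enough to still be in the
        # pipeline, so forwarding or a stall is required.
        hazards = []
        for i, (_, dest, _) in enumerate(instrs):
            for j in range(i + 1, min(i + 1 + window, len(instrs))):
                _, _, srcs = instrs[j]
                if dest in srcs:
                    hazards.append((i, j, dest))
        return hazards

    for w, r, reg in find_raw_hazards(program):
        print(f"RAW: instruction {r} reads {reg} written by instruction {w}")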
Pipeline Hazards
3. Control Hazards (Branch Hazards)
• Occur when the pipeline fetches the wrong instruction after a branch (jump, if-else, loops,
etc.)
• Until the branch is executed, the pipeline does not know which instruction to fetch next.
• Penalty: Flushing incorrect instructions from the pipeline causes delays.
Handling Branch Hazards
1. Multiple Streams: Fetch both possible branch targets.
• Problem: Wastes resources and increases complexity.
2. Prefetch Branch Target: Fetch the next instruction and the branch target in
parallel.
• Used in IBM 360/91.
3. Loop Buffer: A small cache that stores recently executed instructions.
• If a loop repeats, it fetches instructions from the buffer instead of memory.
• Used in CDC Star-100, CRAY-1.
Handling Branch Hazards
4. Branch Prediction: The CPU predicts whether a branch will be taken.
• Static Prediction:
• Always Not Taken: Assume the branch is never taken.
• Always Taken: Assume the branch is always taken.
• Opcode-based Prediction: Certain opcodes predict branch behavior.
• Dynamic Prediction:
• Uses history to make better guesses.
• Taken/Not Taken Switch: Stores whether a branch was taken previously.
• Branch History Table (BHT): A cache that stores past branch decisions (see the sketch after this list).
5. Delayed Branching:
• The CPU reorders instructions to execute useful instructions before resolving the branch.
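• A minimal sketch of the branch history table described above, using 2-bit saturating counters (the table size and address indexing are arbitrary assumptions):

    class BranchHistoryTable:
        # 2-bit saturating counters: 0-1 predict not taken, 2-3 predict taken.
        # Indexed by the low-order bits of the branch address.
        def __init__(self, size=16):
            self.counters = [1] * size   # start weakly not-taken

        def predict(self, addr):
            return self.counters[addr % len(self.counters)] >= 2

        def update(self, addr, taken):
            i = addr % len(self.counters)
            if taken:
                self.counters[i] = min(3, self.counters[i] + 1)
            else:
                self.counters[i] = max(0, self.counters[i] - 1)

    # A loop branch at address 0x40, taken 9 times and then falling through:
    bht = BranchHistoryTable()
    outcomes = [True] * 9 + [False]
    correct = 0
    for taken in outcomes:
        correct += (bht.predict(0x40) == taken)
        bht.update(0x40, taken)
    print(f"{correct}/{len(outcomes)} branches predicted correctly")  # 8/10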
Problem
• A pipelined processor has a clock rate of 2.5 GHz and executes a program with 1.5 million instructions. The pipeline has five stages, and instructions are issued at a rate of one per clock cycle. Ignore penalties due to branch instructions and out-of-sequence executions.
• a. What is the speedup of this processor for this program compared to a nonpipelined
processor?
• b. What is throughput (in MIPS) of the pipelined processor?
• Solution:
• Given:
• clock_rate_ghz = 2.5 # GHz
• instructions = 1.5e6 # 1.5 million instructions
• pipeline_stages = 5 # 5-stage pipeline
• instruction_issue_rate = 1 # One instruction per clock cycle
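• A sketch of the computation in Python, applying the speedup formula S_k = nk / (k + n - 1) from the Pipeline Performance slide:

    clock_rate_hz = 2.5e9   # 2.5 GHz
    n = 1.5e6               # 1.5 million instructions
    k = 5                   # pipeline stages

    # a. Speedup vs. a nonpipelined processor taking k cycles per instruction
    speedup = (n * k) / (k + n - 1)
    print(f"Speedup = {speedup:.4f}")        # ~5: approaches k for large n

    # b. One instruction completes per cycle once the pipeline is full
    print(f"Throughput = {clock_rate_hz / 1e6:.0f} MIPS")   # 2500 MIPS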
Problem
• A nonpipelined processor has a clock rate of 2.5 GHz and an average CPI (cycles per instruction) of 4. An upgrade to the processor introduces a five-stage pipeline. However, due to internal pipeline delays, such as latch delay, the clock rate of the new processor has to be reduced to 2 GHz.
• a. What is the speedup achieved for a typical program?
• b. What is the MIPS rate for each processor?
• Solution:
• Given:
• clock_rate_non_pipelined = 2.5e9 # 2.5 GHz
• CPI_non_pipelined = 4
• clock_rate_pipelined = 2.0e9 # 2 GHz
• pipeline_stages = 5
• CPI_pipelined = 1 # Ideal pipeline CPI
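• A sketch of the computation in Python, comparing the time per instruction (CPI / clock rate) of the two processors:

    f_old, cpi_old = 2.5e9, 4   # nonpipelined: 2.5 GHz, CPI = 4
    f_new, cpi_new = 2.0e9, 1   # pipelined: 2 GHz, ideal CPI = 1

    # a. Speedup = (time per instruction, old) / (time per instruction, new)
    speedup = (cpi_old / f_old) / (cpi_new / f_new)
    print(f"Speedup = {speedup}")                               # 3.2

    # b. MIPS rate = clock rate / (CPI x 10^6)
    print(f"Nonpipelined: {f_old / (cpi_old * 1e6):.0f} MIPS")  # 625 MIPS
    print(f"Pipelined:    {f_new / (cpi_new * 1e6):.0f} MIPS")  # 2000 MIPS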
Instruction Execution Characteristics
• They are the patterns and behaviors observed during the execution of
high-level language (HLL) programs when compiled to machine-level code.
These characteristics help architects understand:
• What types of instructions occur most frequently
• How operands are used
• How control flows (e.g., branches, loops)
• Which parts of programs consume the most time
1. Operations Performed:
• Studies show that the most frequently executed operations in compiled HLL programs involve data movement and control flow, not complex arithmetic.
Instruction Execution Characteristics
2. Operand Usage
• Operand = the data on which operations are performed.
• Findings:
• Most operands are simple scalar variables (e.g., integers, chars)
• Around 80% are local to the procedure/function
• Arrays, structures, and pointers are used less frequently
• Since most data is local, registers are ideal for holding them.
3. Execution Sequencing
• Most instructions are simple (e.g., add, load, branch)
• Procedure calls and returns are frequent and expensive
• Branch instructions (like if, for, while) are common and affect pipeline flow
• Efficient support for procedure calls, register use, and branch prediction is critical.
Instruction Execution Characteristics
• Example insight (from studies such as those by Patterson and Hennessy): Even though CALL/RETURN instructions occur relatively infrequently, they consume a disproportionate amount of time due to saving and restoring context.
Instruction Execution Characteristics
• Instruction execution characteristics help architects design better CPUs:
• Add more registers for fast operand access
• Use simple instruction formats for fast decoding
• Design better pipelines and branch prediction
• Optimize hardware for realistic program behavior, not hypothetical workloads
Use of a Large Register File
• Large register file — a fast, on-chip storage space used to hold operands and
temporary results.
• This design choice is driven by the desire to minimize costly memory accesses
and maximize processor speed.
• Why Use a Large Register File?
1. Registers are faster than memory
• Accessing data in a register is much quicker than accessing cache or main memory.
• Keeping operands in registers significantly speeds up instruction execution.
2. High frequency of scalar and local variable access
• Studies show most high-level language (HLL) variables are:
• Scalars (e.g., integers, characters)
• Local to procedures (used within a function)
• Therefore, keeping these frequently used variables in registers is efficient.
Use of a Large Register File
• Why Use a Large Register File?
3. Reduces load/store operations
• RISC design limits memory access to only LOAD and STORE instructions.
• With enough registers, most operations can be done register-to-register, reducing memory traffic.
• Register Windows
• Each procedure needs its own set of registers.
• Calling another procedure (or returning from one) would typically require
saving/restoring registers to/from memory — slow.
• Hence use register windows — overlapping sets of registers assigned to each procedure.
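• A toy Python model of overlapping register windows (the window count, group size, and circular-buffer layout are simplifying assumptions in the spirit of SPARC):

    class RegisterWindows:
        # N overlapping windows share one physical register file. Each window
        # sees a param, local, and temp group; the temps of window i are the
        # same physical registers as the params of window i+1, so procedure
        # arguments pass between caller and callee with no memory traffic.
        def __init__(self, n_windows=8, group=6):
            self.n, self.g = n_windows, group
            self.file = [0] * (n_windows * 2 * group)   # circular buffer
            self.cwp = 0                                # current window pointer

        def reg(self, kind, i):
            # Physical index of register i in the current window's group.
            offset = {"param": 0, "local": 1, "temp": 2}[kind]
            return (self.cwp * 2 * self.g + offset * self.g + i) % len(self.file)

        def call(self):   # advance: callee's params alias the caller's temps
            self.cwp = (self.cwp + 1) % self.n

        def ret(self):
            self.cwp = (self.cwp - 1) % self.n

    rw = RegisterWindows()
    rw.file[rw.reg("temp", 0)] = 42      # caller writes an argument
    rw.call()
    print(rw.file[rw.reg("param", 0)])   # callee reads 42: same register

• In a real design, exceeding the window count causes a window overflow, spilling the oldest window to memory; this model omits that for brevity.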
Use of a Large Register File
• Global Variables
• Register windows are great for local variables, but global variables (shared across
functions) can't be held in these rotating windows.
• Solutions:
1. Assign global variables to memory (traditional)
2. Use fixed “global registers” — a small set of registers always accessible to all procedures
• Note that for frequently used local variables, a register file is faster and more efficient than a cache.
Use of a Large Register File
• Benefits of a Large Register File
• Reduced memory access = better performance
• Enables faster procedure calls with register windows
• Improves instruction pipelining efficiency
• Allows more operand storage for HLL programs
Compiler-Based Register Optimization
• In RISC architecture, the number of physical (hardware) registers is limited
(e.g., 16, 32). But high-level language (HLL) programs use many variables. So the
compiler is responsible for:
• Keeping as many frequently-used variables in registers as possible
• Minimizing load/store instructions to/from memory
• Reusing registers when possible without conflicts
• This is called register allocation and optimization.
Compiler-Based Register Optimization
• Graph Coloring Algorithm
• This is the most common algorithm used in compilers to perform register allocation.
• HLL programs refer to variables symbolically (e.g., a, b, sum)
• Compiler maps these symbolic variables to virtual registers
• Then it tries to assign virtual registers to physical registers in the most efficient way
• If registers run out, some variables must be "spilled" to memory (less efficient)
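• A minimal sketch of graph-coloring register allocation in Python (the interference graph and register count below are made-up examples):

    # Interference graph: variables live at the same time cannot share a register.
    interference = {
        "a":   {"b", "sum"},
        "b":   {"a", "sum"},
        "sum": {"a", "b", "tmp"},
        "tmp": {"sum"},
    }
    NUM_REGISTERS = 2   # physical registers available

    def color(graph, k):
        # Greedy coloring: give each node the lowest register not used by an
        # already-colored neighbor; spill to memory when none is free.
        assignment, spilled = {}, []
        for node in sorted(graph, key=lambda n: -len(graph[n])):  # hardest first
            taken = {assignment[n] for n in graph[node] if n in assignment}
            free = [r for r in range(k) if r not in taken]
            if free:
                assignment[node] = free[0]
            else:
                spilled.append(node)   # falls back to a memory slot
        return assignment, spilled

    regs, spills = color(interference, NUM_REGISTERS)
    print("register assignment:", regs)   # e.g. {'sum': 0, 'a': 1, 'tmp': 1}
    print("spilled to memory:  ", spills) # e.g. ['b']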
• Reason for Register Optimization
• Memory access is slow compared to registers
• Good register allocation = faster code, smaller binaries, less energy
• In RISC, since most operations are register-to-register, this becomes even more critical
Compiler-Based Register Optimization
• More Registers vs Compiler Optimization
• If you have many registers (like in some RISC CPUs), optimization becomes easier
• But even with few registers, a smart compiler can do a great job with optimization
• Studies show that performance gains taper off beyond 32–64 registers unless the compiler is very poor
Instruction Set Architecture
• Instruction Set Architecture (ISA) is the part of a computer architecture that
defines the interface between software and hardware. It includes:
• The set of instructions the processor can execute (e.g., arithmetic, logical, data transfer).
• Instruction formats, addressing modes, and data types.
• Registers, memory organization, and I/O mechanisms.
• Interrupts and exception handling mechanisms.
• ISA serves as the programmer’s view of the machine, defining what the
processor can do—not how it does it. It acts as a bridge between software and
the underlying hardware implementation.
RISC Architecture – Reduced Instruction Set Computer
• RISC is a computer architecture that uses a small, highly optimized set of
instructions, all designed to be executed very quickly — usually one instruction
per clock cycle.
Instruction Format – Reduced Instruction Set Computer
• RISC architectures typically use simple and fixed-length instruction formats to
facilitate fast decoding and efficient pipelining. Here's an overview of the
common formats:
• Typically, three instruction format types are used:
• R-Type (Register Type)
• I-Type (Immediate Type)
• J-Type (Jump Type)
• Simple, fixed instruction formats facilitate fast decoding
• Registers are used predominantly for operands
Instruction Format – Reduced Instruction Set Computer
• R-Type (Register Type)
• Used for arithmetic and logical operations.
Field      | Opcode | rs | rt | rd | shamt | funct
Bit Length |      6 |  5 |  5 |  5 |     5 |     6
• rs, rt: source registers
• rd: destination register
• shamt: shift amount
• funct: further specifies the operation
• Example:
• ADD R1, R2, R3 ; add the contents of R2 and R3, store the result in R1
Instruction Format – Reduced Instruction Set Computer
• I-Type (Immediate Type)
• Used for data transfer, arithmetic with constants, and branching.
Field      | Opcode | rs | rt | Immediate
Bit Length |      6 |  5 |  5 |        16
• rs: source register
• rt: destination register
• Immediate: constant value or address offset
• Example:
• ADDI R1, R2, #10 ; add immediate value 10 to R2, store the result in R1
Instruction Format – Reduced Instruction Set Computer
• J-Type (Jump Type)
• Used for unconditional jumps.
Field      | Opcode | Address
Bit Length |      6 |      26
• Address: jump target, usually combined with the upper bits of the PC
• Example:
• JMP 10000 ; jump to the instruction at address 10000
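• A minimal Python sketch that packs these fields into 32-bit words using the field widths shown above (the opcode and funct values below are illustrative, not a real ISA's encodings):

    def encode_r(opcode, rs, rt, rd, shamt, funct):
        # R-type: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
        return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) \
               | (shamt << 6) | funct

    def encode_i(opcode, rs, rt, imm):
        # I-type: opcode(6) rs(5) rt(5) immediate(16)
        return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

    def encode_j(opcode, addr):
        # J-type: opcode(6) address(26)
        return (opcode << 26) | (addr & 0x3FFFFFF)

    # ADD R1, R2, R3 with made-up opcode 0 / funct 32:
    print(f"{encode_r(0, 2, 3, 1, 0, 32):032b}")
    # ADDI R1, R2, #10 with made-up opcode 8:
    print(f"{encode_i(8, 2, 1, 10):032b}")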
Functional Elements – Reduced Instruction Set Computer
1. Instruction Fetch Unit (IFU)
• Function: Fetches the next instruction from memory.
• Works with the Program Counter (PC) to determine the address of the next instruction.
• Uses instruction cache to reduce fetch time.
• First stage in the pipeline.
2. Instruction Decode Unit (IDU)
• Function: Decodes the fetched instruction into control signals.
• Identifies the operation (opcode), source and destination registers.
• Checks for operand readiness and forwards to the execution unit.
• Since instruction formats are simple and fixed, decoding is fast and efficient.
• Second stage in the pipeline.
Functional Elements – Reduced Instruction Set Computer
3. Register File
• Function: Stores general-purpose data for computation.
• Contains a large number of registers (e.g., 32 or more).
• Most operands are read/written directly to/from registers, not memory.
• Registers are used for storing:
• Operands for ALU
• Intermediate values
• Function parameters
• Reduces memory access, speeds up computation.
4. Arithmetic and Logic Unit (ALU)
• Function: Performs arithmetic and logic operations like:
• ADD, SUB, AND, OR, XOR
• Comparisons (for branches)
• Works on data from registers, not memory.
• Third stage of the pipeline (Execute).
Functional Elements – Reduced Instruction Set Computer
5. Control Unit
• Function: Directs the operation of the processor.
• Generates control signals based on the instruction type.
• Manages instruction flow, pipelining, branching, etc.
• Often hardwired instead of microprogrammed (unlike CISC).
• Enables fast instruction execution.
6. Load/Store Unit (Memory Access)
• Function: Handles memory access operations:
• LOAD: Read data from memory into a register
• STORE: Write data from a register to memory
• Only these instructions access memory.
• Separates memory from computation (Load/Store architecture).
Functional Elements – Reduced Instruction Set Computer
7. Pipeline Registers
• Function: Hold intermediate data between pipeline stages.
• Allow overlapping execution of multiple instructions.
• Boosts throughput via instruction pipelining.
8. Program Counter (PC)
• Function: Holds the address of the next instruction to fetch.
• Automatically updated after each instruction.
• Can be modified by branch/jump instructions.
9. Instruction and Data Cache
• Function: Stores frequently accessed instructions and data.
• Reduces latency of memory operations.
• Helps maintain performance despite slower main memory.
• Instructions are fixed length (usually 32 bits) and simple in format.
Functional Elements – Reduced Instruction Set Computer
Functional Element     | Functionality
Instruction Fetch Unit | Fetches instructions
Instruction Decoder    | Decodes and prepares instructions
Register File          | Holds operands and results
ALU                    | Performs arithmetic/logical operations
Control Unit           | Manages execution and pipelining
Load/Store Unit        | Handles memory reads/writes
Program Counter (PC)   | Tracks next instruction address
Pipeline Registers     | Buffer between pipeline stages
Instruction/Data Cache | Fast access to code/data
Branch Prediction Unit | Reduces branch penalties
Pipelining – Reduced Instruction Set Computer
• RISC designs are ideal for instruction pipelining, which overlaps execution stages of
different instructions. Typical RISC pipeline:
1. IF - Instruction Fetch
2. ID - Instruction Decode
3. EX – Execute
4. MEM - Memory Access (if needed)
5. WB - Write Back
• This allows RISC CPUs to execute 1 instruction per cycle after the pipeline fills up.
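• A small Python sketch that prints the space-time diagram for this five-stage pipeline (the instruction count is arbitrary; hazards and stalls are ignored):

    STAGES = ["IF", "ID", "EX", "MEM", "WB"]

    def diagram(n_instructions):
        # Show which stage each instruction occupies in each clock cycle,
        # assuming one instruction issued per cycle and no stalls.
        k = len(STAGES)
        total = k + n_instructions - 1     # cycles until the last WB
        print("     " + "".join(f"c{c + 1:<4}" for c in range(total)))
        for i in range(n_instructions):
            row = ["     "] * total
            for s, name in enumerate(STAGES):
                row[i + s] = f"{name:<5}"
            print(f"I{i + 1:<3} " + "".join(row))

    diagram(4)   # 4 instructions finish in 5 + 4 - 1 = 8 cycles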
Execution – Reduced Instruction Set Computer
1. Fixed-Length Instructions:
• All instructions are 32 bits wide, which simplifies the fetch stage and allows easy decoding.
2. Few Addressing Modes:
• Only register and immediate addressing modes are supported, which simplifies the effective
address calculation stage in the pipeline.
3. Register-to-Register Operations:
• Operands come from fast-access registers, reducing memory access time and improving
overall execution speed.
4. Simple Control Logic:
• Because of uniform instruction formats and simple operations, the control unit can be
hardwired rather than microprogrammed, improving performance.
Advantage – Reduced Instruction Set Computer
• Simpler control logic → faster execution
• Easier to implement pipelining → higher instruction throughput
• Compiler optimization is easier
• Lower power consumption
• Better performance with fewer transistors → cost-effective
• Examples:
• ARM (used in most smartphones & embedded devices)
• MIPS
• RISC-V (open-source architecture)
• SPARC (used in servers)
• PowerPC