Advanced Computer Architecture
Chapter 7
Multiprocessors and Multicomputers
 Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani
                      In this chapter…
•   Multiprocessor System Interconnects
•   Cache Coherence and Synchronization Mechanisms
•   Three Generations of Multicomputers
•   Message Routing Schemes
 MULTIPROCESSOR SYSTEM INTERCONNECTS
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University
  MULTIPROCESSOR SYSTEM INTERCONNECTS
• Network Characteristics
   o Topology
       • Dynamic Networks
   o Timing control protocol
       • Synchronous (with global clock)
       • Asynchronous (with handshake or interlocking mechanism)
   o Switching method
       • Circuit switching
       • Packet switching
   o Control Strategy
       • Centralized (global controller to receive requests from all devices and grant network access)
       • Distributed (requests handled by local devices independently)
  MULTIPROCESSOR SYSTEM INTERCONNECTS
• Hierarchical Bus System
   o Local Bus (board level)
        • Memory bus, data bus
   o Backplane Bus (backplane level)
        • VME bus (IEEE 1014-1987), Multibus II (IEEE 1296-1987), Futurebus+ (IEEE 896.1-1991)
   o I/O Bus (I/O level)
   o E.g. Encore Multimax multiprocessor's nanobus
        • 20 slots
        • 32-bit address path
        • 64-bit data path
        • Clock rate: 12.5 MHz
        • Total Memory bandwidth: 100 Megabytes per second
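The quoted 100 MB/s total bandwidth follows directly from the data-path width and clock rate; a quick check of the arithmetic:

```python
# Peak bus bandwidth = data-path width (in bytes) x clock rate (Hz),
# using the Encore Multimax nanobus figures listed above.
data_path_bits = 64
clock_rate_hz = 12.5e6  # 12.5 MHz

bandwidth_bytes_per_s = (data_path_bits // 8) * clock_rate_hz
print(bandwidth_bytes_per_s / 1e6)  # 100.0 megabytes per second
```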
 MULTIPROCESSOR SYSTEM INTERCONNECTS
• Hierarchical Buses and Caches
   o Cache Levels
       • First level caches
       • Second level caches
   o Buses
       • (Intra) Cluster Bus
       • Inter-cluster bus
   o Cache coherence
       • Snoopy cache protocol for coherence among first level caches of same cluster
       • Intra-cluster cache coherence controlled among second level caches and results passed to
          first level caches
   o Use of Bridges between multiprocessor clusters
  MULTIPROCESSOR SYSTEM INTERCONNECTS
• Hierarchical Buses and Caches
 MULTIPROCESSOR SYSTEM INTERCONNECTS
• Crossbar Switch Design
   o Based on number of network stages
       • Single stage (or recirculating) networks
       • Multistage networks
            o Blocking networks
            o Non-blocking (re-arranging) networks
       • Crossbar networks
             o n × m and n² cross-point switch design
            o Crossbar benefits and limitations
• Multiport Memory Design
   o Multiport Memory
MULTIPROCESSOR SYSTEM INTERCONNECTS
            CACHE COHERENCE MECHANISMS
• Cache Coherence Problem
   o Inconsistent copies of same memory block in different caches
   o Sources of inconsistency:
       • Sharing of writable data
       • Process migration
       • I/O activity
• Protocol Approaches
   o Snoopy Bus Protocol
   o Directory Based Protocol
• Write Policies
   o (Write-back, Write-through) x (Write-invalidate, Write-update)
            CACHE COHERENCE MECHANISMS
• Snoopy Bus Protocols
   o Write-through caches
      • Write invalidate coherence protocol for write-through caches
      • Write-update coherence protocol for write-through caches
      • Data item states:
             o VALID
             o INVALID
      • Possible operations:
              o Read by same processor R(i); read by different processor R(j)
              o Write by same processor W(i); write by different processor W(j)
              o Replace by same processor Z(i); replace by different processor Z(j)
           CACHE COHERENCE MECHANISMS
• Snoopy Bus Protocols
   o Write-through caches – write invalidate scheme
            New state after each operation:

            Operation    From Valid    From Invalid
              R(i)         Valid          Valid
              W(i)         Valid          Valid
              Z(i)         Invalid        Invalid
              R(j)         Valid          Invalid
              W(j)         Invalid        Invalid
              Z(j)         Valid          Invalid
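The write-through, write-invalidate transition table above can be sketched as a small state machine (a minimal illustration in plain Python, using the slide's R/W/Z and i/j notation):

```python
# Write-through, write-invalidate snoopy protocol transitions.
# Events: Read/Write/Replace by the same processor (i) or another (j).
TRANSITIONS = {
    ("Valid",   "R(i)"): "Valid",
    ("Valid",   "W(i)"): "Valid",
    ("Valid",   "Z(i)"): "Invalid",
    ("Valid",   "R(j)"): "Valid",
    ("Valid",   "W(j)"): "Invalid",
    ("Valid",   "Z(j)"): "Valid",
    ("Invalid", "R(i)"): "Valid",
    ("Invalid", "W(i)"): "Valid",
    ("Invalid", "Z(i)"): "Invalid",
    ("Invalid", "R(j)"): "Invalid",
    ("Invalid", "W(j)"): "Invalid",
    ("Invalid", "Z(j)"): "Invalid",
}

def next_state(state, event):
    return TRANSITIONS[(state, event)]

# A remote write invalidates a valid copy; a local read makes it valid again.
state = "Valid"
for event in ["W(j)", "R(i)"]:
    state = next_state(state, event)
print(state)  # Valid
```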
           CACHE COHERENCE MECHANISMS
• Snoopy Bus Protocols
   o Write-back caches
       • Ownership protocol: write-invalidate coherence protocol for write-back caches
      • Data item states:
             o RO : Read Only (Valid state)
             o RW : Read Write (Valid state)
             o INV : Invalid state
      • Possible operations:
              o Read by same processor R(i); read by different processor R(j)
              o Write by same processor W(i); write by different processor W(j)
              o Replace by same processor Z(i); replace by different processor Z(j)
                  CACHE COHERENCE MECHANISMS
  • Snoopy Bus Protocols
          o Write-back caches – write invalidate (ownership protocol) scheme
  New state after each operation:

  Operation    From RO (Valid)    From RW (Valid)    From INV (Invalid)
    R(i)             RO                 RW                  RO
    W(i)             RW                 RW                  RW
    Z(i)             INV                INV                 INV
    R(j)             RO                 RO                  INV
    W(j)             INV                INV                 INV
    Z(j)             RO                 RW                  INV
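The ownership-protocol table above can likewise be traced programmatically (a minimal sketch; the event names follow the slide's notation):

```python
# Write-back ownership (write-invalidate) protocol transitions, with
# states RO (read-only), RW (read-write, owned) and INV (invalid).
OWNERSHIP = {
    "RO":  {"R(i)": "RO",  "W(i)": "RW", "Z(i)": "INV",
            "R(j)": "RO",  "W(j)": "INV", "Z(j)": "RO"},
    "RW":  {"R(i)": "RW",  "W(i)": "RW", "Z(i)": "INV",
            "R(j)": "RO",  "W(j)": "INV", "Z(j)": "RW"},
    "INV": {"R(i)": "RO",  "W(i)": "RW", "Z(i)": "INV",
            "R(j)": "INV", "W(j)": "INV", "Z(j)": "INV"},
}

# A local write takes ownership (RW); a remote read then demotes the
# block to RO, since the owner must supply it and give up exclusivity.
state = "RO"
for event in ["W(i)", "R(j)"]:
    state = OWNERSHIP[state][event]
print(state)  # RO
```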
           CACHE COHERENCE MECHANISMS
• Snoopy Bus Protocols
   o Write-once Protocol
      • First write using write-through policy
      • Subsequent writes using write-back policy
      • In both cases, data item copy in remote caches is invalidated
      • Data item states:
              o Valid : cache block is consistent with the main memory copy
              o Reserved : data has been written exactly once and is consistent with the main memory copy
              o Dirty : data has been written more than once and is not consistent with the main memory copy
              o Invalid : block is not found in the cache or is inconsistent with the main memory copy
          CACHE COHERENCE MECHANISMS
• Snoopy Bus Protocols
   o Write-once Protocol
      • Cache events and actions:
             o Read-miss
             o Read-hit
             o Write-miss
             o Write-hit
             o Block replacement
         CACHE COHERENCE MECHANISMS
• Multilevel Cache Coherence
             CACHE COHERENCE MECHANISMS
• Protocol Performance issues
    o Snoopy Cache Protocol Performance determinants:
        • Workload Patterns
        • Implementation Efficiency
    o Goals/Motivation behind using snooping mechanism
        • Reduce bus traffic
        • Reduce effective memory access time
    o Data Pollution Point
         • Miss ratio decreases as block size increases (larger blocks raise the probability of
            finding a desired data item in the cache), but only up to the data pollution point.
         • Beyond the data pollution point, the miss ratio starts to increase again as block size grows.
     o Ping-Pong effect on data shared between multiple caches
         • If two processes update a data item alternately, the data will continually migrate between
            the two caches, causing a high miss rate.
  THREE GENERATIONS OF MULTICOMPUTERS
• Multicomputer v/s Multiprocessor
• Design Choices for Multicomputers
   o Processors
       • Low cost commodity (off-the-shelf) processors
   o Memory Structure
       • Distributed memory organization
       • Local memory with each processor
   o Interconnection Schemes
        • Message passing, point-to-point, direct networks with send/receive semantics, with or
           without uniform message communication speed
   o Control Strategy
       • Asynchronous MIMD, MPMD and SPMD operations
  THREE GENERATIONS OF MULTICOMPUTERS
• The Past, Present and Future Development
   o First Generation
        • Example Systems: Caltech’s Cosmic Cube, Intel iPSC/1, Ametek S/14, nCube/10
   o Second Generation
        • Example Systems: iPSC/2, i860, Delta, nCube/2, Supernode 1000, Ametek Series 2010
   o Third Generation
        • Example Systems: Caltech’s Mosaic C, J-Machine, Intel Paragon
    o First- and second-generation multicomputers are regarded as medium-grain systems
    o Third-generation multicomputers are regarded as fine-grain systems
    o A fine-grain approach combined with shared memory can, in theory, combine the relative merits
      of multiprocessors and multicomputers in a heterogeneous processing environment
   THREE GENERATIONS OF MULTICOMPUTERS

                                          1st Generation   2nd Generation   3rd Generation
   Typical Node Attributes
     MIPS                                        1               10              100
     MFLOPS (scalar)                            0.1               2               40
     MFLOPS (vector)                             10              40              200
     Memory Size (in MB)                        0.5               4               32
   Typical System Attributes
     Number of Nodes (N)                         64              256             1024
     MIPS                                        64             2560            100 K
     MFLOPS (scalar)                            6.4              512             40 K
     MFLOPS (vector)                            640             10 K            200 K
     Memory Size (in MB)                         32              1 K             32 K
   Communication Latency
     Local neighbour (in microseconds)         2000                5              0.5
     Non-local node (in microseconds)          6000                5              0.5
                 MESSAGE PASSING SCHEMES
• Message Routing Schemes
• Message Formats
   o Messages
   o Packets
    o Flits (flow control digits)
       • Data Only Flits
       • Sequence Number
       • Routing Information
• Store-and-forward routing
• Wormhole routing
            MESSAGE PASSING SCHEMES
• Asynchronous Pipelining
                MESSAGE PASSING SCHEMES
• Latency Analysis
   o L: Packet length (in bits)
   o W: Channel Bandwidth (in bits per second)
   o D: Distance (number of nodes traversed minus 1)
   o F: Flit length (in bits)
    o Communication Latency in Store-and-forward Routing
        • T_SF = L(D + 1) / W
    o Communication Latency in Wormhole Routing
        • T_WH = L/W + F·D/W
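The two latency formulas above can be compared numerically (a minimal sketch; the packet, flit, bandwidth and distance values are illustrative, not from the slides):

```python
# Store-and-forward vs wormhole routing latency, per the formulas:
#   T_SF = L * (D + 1) / W        T_WH = L / W + F * D / W
def t_sf(L, W, D):
    return L * (D + 1) / W

def t_wh(L, W, D, F):
    return L / W + F * D / W

# Hypothetical numbers: 1024-bit packet, 8-bit flits,
# 1 Gbit/s channels, a 9-hop path (D = 8).
L, W, D, F = 1024, 1e9, 8, 8
print(t_sf(L, W, D) * 1e6)      # store-and-forward latency, microseconds
print(t_wh(L, W, D, F) * 1e6)   # wormhole latency, microseconds
```

Wormhole routing's latency is nearly independent of distance when L >> F·D, which is why it dominates in later multicomputer generations.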
              Advanced Computer Architecture
                                      Chapter 8
Multivector and SIMD Computers
Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani
                      In this chapter…
•   Vector Processing Principles
•   Compound Vector Operations
•   Vector Loops and Chaining
•   SIMD Computer Implementation Models
             VECTOR PROCESSING PRINCIPLES
• Vector Processing Definitions
    o   Vector
    o   Stride
    o   Vector Processor
    o   Vector Processing
    o   Vectorization
    o   Vectorizing Compiler or Vectorizer
• Vector Instruction Types
    o Vector-vector instructions
    o Vector-scalar instructions
    o Vector-memory instructions
          VECTOR PROCESSING PRINCIPLES
• Vector-Vector Instructions
    o F1:         Vi → Vj
    o F2:         Vi × Vj → Vk
    o Examples:   V1 = sin(V2);  V3 = V1 + V2
• Vector-Scalar Instructions
    o F3:         s × Vi → Vj
    o Examples:   V2 = 6 + V1
• Vector-Memory Instructions
    o F4:         M → V           (Vector Load)
    o F5:         V → M           (Vector Store)
    o Examples:   X = V1;  V2 = Y
            VECTOR PROCESSING PRINCIPLES
• Vector Reduction Instructions
   o F6:       Vi → s
   o F7:       Vi × Vj → s
• Gather and Scatter Instructions
   o F8:       M → Vi × Vj        (Gather)
   o F9:       Vi × Vj → M        (Scatter)
• Masking
   o F10:      Vi × Vm → Vj       (Vm is a binary vector)
• Examples…
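The gather, scatter and masking semantics can be sketched in plain Python (an illustrative model only; the index vector, memory contents and variable names are invented, not machine syntax):

```python
# Gather/scatter/masking semantics, modelled with Python lists.
memory = [10, 20, 30, 40, 50, 60]

# Gather (F8): indexed loads from memory into a vector register.
v_idx = [4, 0, 2]
v1 = [memory[i] for i in v_idx]            # [50, 10, 30]

# Scatter (F9): indexed stores from a vector register into memory.
for i, val in zip(v_idx, [99, 98, 97]):
    memory[i] = val                        # memory[4]=99, [0]=98, [2]=97

# Masking (F10): compress a vector under a binary mask vector Vm.
v2 = [7, 8, 9, 10]
vm = [1, 0, 1, 0]
v3 = [x for x, m in zip(v2, vm) if m]      # [7, 9]

print(v1, memory, v3)
```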
          VECTOR PROCESSING PRINCIPLES
• Vector-Access Memory Schemes
   o Vector-operand Specifications
       • Base address, stride and length
   o C-Access Memory Organization
       • Low-order m-way interleaved memory
   o S-access Memory Organizations
       • High-order m-way interleaved memory
   o C/S Access Memory Organization
• Early Supercomputers (Vector Processors)
   o Cray Series             ETA 10E           NEC Sx-X 44
   o CDC Cyber               Fujitsu VP2600    Hitachi 820/80
           VECTOR PROCESSING PRINCIPLES
• Relative Vector/Scalar Performance
   o Vector/scalar speed ratio                r
   o Vectorization ratio in program           f
    o Relative Performance P is given by:
         • P = 1 / ((1 − f) + f/r) = r / ((1 − f)·r + f)
   o When f is low, the speedup cannot be high even with very high r
    o Limiting Case:
        • P → 1 as f → 0
    o Maximum Case:
        • P → r as f → 1
    o Powerful single-chip processors and multicore systems-on-a-chip provide High-Performance
      Computing (HPC) using MIMD and/or SPMD configurations with a large number of processors.
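The formula and its limiting cases are easy to check numerically; note how a modest vectorization ratio caps the speedup even with a very fast vector unit:

```python
# Relative vector/scalar performance P = r / ((1 - f) * r + f),
# where f is the vectorization ratio and r the vector/scalar speed ratio.
def relative_performance(f, r):
    return r / ((1 - f) * r + f)

# Even with r = 100, f = 0.5 gives a speedup below 2:
print(round(relative_performance(0.5, 100), 2))   # 1.98
print(round(relative_performance(0.0, 100), 2))   # 1.0   (limiting case, f -> 0)
print(round(relative_performance(1.0, 100), 2))   # 100.0 (maximum case, f -> 1)
```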
           COMPOUND VECTOR PROCESSING
• Compound Vector Operations
   o Compound Vector Functions (CVFs)
       • Composite function of vector operations converted from a looping structure of linked scalar
         operations
   o CVF Example: The SAXPY (Single-precision A multiply X Plus Y) Code
       • For I = 1 to N
            o Load            R1, X(I)
            o Load            R2, Y(I)
            o Multiply        R1, A
            o Add             R2, R1
            o Store           Y(I), R2
         • (End of Loop)
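The SAXPY loop above, transcribed directly into Python with the same load/multiply/add/store steps per iteration (values of A, X and Y are illustrative):

```python
# SAXPY: Y(I) = A * X(I) + Y(I) for I = 1..N, as in the scalar loop above.
A = 2.0
X = [1.0, 2.0, 3.0]
Y = [10.0, 20.0, 30.0]

for i in range(len(X)):
    r1 = X[i]          # Load     R1, X(I)
    r2 = Y[i]          # Load     R2, Y(I)
    r1 = r1 * A        # Multiply R1, A
    r2 = r2 + r1       # Add      R2, R1
    Y[i] = r2          # Store    Y(I), R2

print(Y)  # [12.0, 24.0, 36.0]
```

A vectorizing compiler converts exactly this linked scalar structure into a single compound vector operation.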
           COMPOUND VECTOR PROCESSING
• One-dimensional CVF Examples
    o V1(I) = V2(I) + V3(I) × V4(I)
    o V1(I) = B(I) + C(I)
    o A(I) = V1(I) × S + B(I)
    o A(I) = V1(I) + B(I) + C(I)
    o A(I) = Q × V1(I) × (R × B(I) + C(I)), etc.
    Legend:
    o Vi(I) are vector registers
    o A(I), B(I), C(I) are vectors in memory
    o Q, R, S are scalars available from scalar registers
           COMPOUND VECTOR PROCESSING
• Vector Loops
   o Vector segmentation or strip-mining approach
   o Example
• Vector Chaining
   o Example: SAXPY code
       • Limited Chaining using only one memory-access pipe in Cray-I
       • Complete Chaining using three memory-access pipes in Cray X-MP
• Functional Unit Independence
• Vector Recurrence
             COMPOUND VECTOR PROCESSING
           SIMD COMPUTER ORGANIZATIONS
• SIMD Computer Variants
   o Array Processor
   o Associative Processor
• SIMD Processor v/s SISD v/s Vector Processor Operation
   o Illustration: for(i=0;i<5;i++) a[i] = a[i]+2;
   o Lockstep mode of operation in SIMD processor
   o Relative Performance comparison
• SIMD Implementation Models
   o Distributed Memory Model
       • E.g. Illiac IV
   o Shared memory Model
       • E.g. BSP (Burroughs Scientific Processor)
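The slide's illustration for(i=0;i<5;i++) a[i] = a[i]+2; can be contrasted in SISD and SIMD styles (a conceptual sketch only; Python executes both sequentially, but the second form models one lockstep step across PEs):

```python
# for(i=0;i<5;i++) a[i] = a[i] + 2;  -- SISD vs SIMD execution models.
a = [1, 2, 3, 4, 5]

# SISD: five sequential time steps, one element per instruction.
sisd = []
for x in a:
    sisd.append(x + 2)

# SIMD: conceptually one time step; the same instruction is broadcast
# and every PE adds 2 to its own local element in lockstep.
simd = [x + 2 for x in a]

print(sisd == simd, simd)  # True [3, 4, 5, 6, 7]
```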
            SIMD COMPUTER ORGANIZATIONS
• SIMD Instructions
   o Scalar Operations
       • Arithmetic/Logical
   o Vector Operations
       • Arithmetic/Logical
   o Data Routing Operations
       • Permutations, broadcasts, multicasts, rotation and shifting
   o Masking Operations
       • Enable/Disable PEs
• Host and I/O
• Bit-slice and Word-slice Processing
   o WSBS, WSBP, WPBS, WPBP
              Advanced Computer Architecture
                                      Chapter 9
            …Dataflow Architectures
Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani
                       In this chapter…
•   Evolution of Dataflow Computers
•   Dataflow Graphs
•   Static v/s Dynamic Data Flow Computers
•   Pure Dataflow Machines
•   Explicit Token Store Machines
•   Hybrid and Unified Architectures
•   Dataflow v/s Control flow Computers
    DATAFLOW AND HYBRID ARCHITECTURES
• Data-driven machines
• Evolution of Dataflow Machines
• Dataflow Graphs
   o Dataflow Graphs examples.
   o Activity Templates and Activity Store
   o Example: dataflow graph for cos x
         • cos x ≅ 1 − x²/2! + x⁴/4! − x⁶/6!
   o More examples
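The four-term series for cos x above is easy to verify against the library cosine (the test value x = 0.5 is illustrative):

```python
import math

# cos x ≈ 1 - x²/2! + x⁴/4! - x⁶/6!, the expression whose dataflow
# graph is discussed above; each term is an independent computation.
def cos_approx(x):
    return (1
            - x**2 / math.factorial(2)
            + x**4 / math.factorial(4)
            - x**6 / math.factorial(6))

x = 0.5
print(cos_approx(x), math.cos(x))  # the two values agree to ~1e-7
```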
    DATAFLOW AND HYBRID ARCHITECTURES
• Static Dataflow Computers
   o Special Feedback Acknowledge Signals between nodes
   o Firing Rule:
         • A node is enabled as soon as tokens are present on all input arcs and there is no token on
            any of its output arcs
   o Example: Dennis Machine (1974)
• Dynamic Dataflow Computers
    o Tagged tokens
    o Firing Rule:
         • A node is enabled as soon as tokens with identical tags are present at each of its input arcs
    o Example: MIT Tagged Token Dataflow Architecture (TTDA) machine (simulated only, never built)
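The tagged-token firing rule can be sketched as a small matching store (a minimal illustration; the node/arc names and token values are invented):

```python
# Tagged-token firing rule: a two-input node fires only when tokens
# with identical tags are present on both of its input arcs.
from collections import defaultdict

waiting = defaultdict(dict)   # (node, tag) -> {arc: value}

def arrive(node, arc, tag, value, fired):
    store = waiting[(node, tag)]
    store[arc] = value
    if len(store) == 2:       # both input arcs hold a token with this tag
        fired.append((node, tag, store["left"] + store["right"]))
        del waiting[(node, tag)]

fired = []
arrive("add", "left", tag=0, value=3, fired=fired)
arrive("add", "left", tag=1, value=10, fired=fired)  # different tag: waits
arrive("add", "right", tag=0, value=4, fired=fired)  # tag 0 matches: fires
print(fired)  # [('add', 0, 7)]
```

Explicit token store machines replace this associative matching with direct memory access guarded by full/empty bits.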
    DATAFLOW AND HYBRID ARCHITECTURES
• Diagrams of static dataflow and dynamic dataflow, from Hwang and Briggs
    DATAFLOW AND HYBRID ARCHITECTURES
• Pure Dataflow Machines
   o TTDA(1983)
       • TTDA was simulated but never built
   o Manchester Dataflow Computer (1982)
       • Operated asynchronously using a separate clock for each PE
   o ETL Sigma-1 (1987)
        • 128 PEs, fully synchronous with a 10-MHz clock
       • Implemented I-structured memory proposed in TTDA
       • Lacked High Level Languages for users
    DATAFLOW AND HYBRID ARCHITECTURES
• Explicit Token Store Machines
    o Eliminates associative token matching.
    o Waiting token memory is directly accessed using full/empty bits.
   o Examples
        • MIT/Motorola Monsoon (proposed 1988; operational 1991)
              o Multithreading support
              o 8 processors
              o 8 I-structure memory modules
              o 8x8 crossbar network
        • ETL EM-4 (1989)
              o Extension of Sigma-1
              o Proposed 1024 nodes; Operational Implementation 80 nodes
    DATAFLOW AND HYBRID ARCHITECTURES
• Hybrid and Unified Architectures
   o Combining Features of von-Neumann and Dataflow architectures
   o Examples:
       • MIT P-RISC (1988)
       • IBM Empire (1991)
       • MIT/Motorola *T (1991)
   o “RISC-ified” dataflow architecture
       • Implemented in P-RISC
       • Splitting complex dataflow instructions into separate simple component instructions
       • Tighter encoding and longer threads for better performance
• Dataflow Processing v/s Control Flow Processing
    DATAFLOW AND HYBRID ARCHITECTURES
• Computing ab + a/c with:
   (a) control flow; (b) dataflow. Pure dataflow basic execution pipeline: (c) single-token-per-arc dataflow;
   (d) tagged-token dataflow; (e) explicit token store dataflow