CSE539: Advanced Computer Architecture

Chapter 7

Multiprocessors and Multicomputers
Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani

Sumit Mittu
Assistant Professor, CSE/IT
Lovely Professional University
sumit.12735@lpu.co.in
In this chapter…
• Multiprocessor System Interconnects
• Cache Coherence and Synchronization Mechanisms
• Three Generations of Multicomputers
• Message Routing Schemes

MULTIPROCESSOR SYSTEM INTERCONNECTS
• Network Characteristics
o Topology
• Dynamic Networks
o Timing control protocol
• Synchronous (with global clock)
• Asynchronous (with handshake or interlocking mechanism)
o Switching method
• Circuit switching
• Packet switching
o Control Strategy
• Centralized (global controller to receive requests from all devices and grant network access)
• Distributed (requests handled by local devices independently)
• Hierarchical Bus System
o Local Bus (board level)
• Memory bus, data bus
o Backplane Bus (backplane level)
• VME bus (IEEE 1014-1987), Multibus II (IEEE 1296-1987), Futurebus+ (IEEE 896.1-1991)
o I/O Bus (I/O level)
o E.g. Encore Multimax multiprocessor’s Nanobus
• 20 slots
• 32-bit address path
• 64-bit data path
• Clock rate: 12.5 MHz
• Total Memory bandwidth: 100 Megabytes per second
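
The Nanobus figures above are consistent with each other if the bus moves one full-width word per clock cycle. That transfer rate is an assumption made here for illustration; the slide only lists the width, the clock rate and the bandwidth. A minimal sketch of the arithmetic:

```python
# Peak bandwidth of a synchronous bus, ASSUMING one full-width data
# transfer per clock cycle (the slide does not state the transfer rate).
DATA_PATH_BITS = 64   # data path width listed on the slide
CLOCK_HZ = 12.5e6     # 12.5 MHz clock rate listed on the slide


def peak_bandwidth_bytes_per_s(width_bits: int, clock_hz: float) -> float:
    """Bytes moved per second if every cycle carries one full-width word."""
    return (width_bits / 8) * clock_hz


nanobus_bw = peak_bandwidth_bytes_per_s(DATA_PATH_BITS, CLOCK_HZ)
print(f"{nanobus_bw / 1e6:.0f} MB/s")  # 8 bytes x 12.5e6 cycles/s
```

Under that assumption, 8 bytes per cycle at 12.5 MHz gives the 100 megabytes per second quoted for the bus.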
• Hierarchical Buses and Caches
o Cache Levels
• First level caches
• Second level caches
o Buses
• (Intra) Cluster Bus
• Inter-cluster bus
o Cache coherence
• Snoopy cache protocol for coherence among first level caches of same cluster
• Inter-cluster cache coherence controlled among second level caches and results passed to first level caches
o Use of Bridges between multiprocessor clusters
• Crossbar Switch Design
o Based on number of network stages
• Single stage (or recirculating) networks
• Multistage networks
o Blocking networks
o Non-blocking (re-arranging) networks
• Crossbar networks
o n × m and n² cross-point switch design
o Crossbar benefits and limitations

• Multiport Memory Design
o Multiport Memory

CACHE COHERENCE MECHANISMS
• Cache Coherence Problem
o Inconsistent copies of same memory block in different caches
o Sources of inconsistency:
• Sharing of writable data
• Process migration
• I/O activity

• Protocol Approaches
o Snoopy Bus Protocol
o Directory Based Protocol

• Write Policies
o (Write-back, Write-through) x (Write-invalidate, Write-update)
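
To make the coherence problem concrete, the sketch below models two private caches over one shared memory with no coherence protocol at all. The dictionaries and the `write` helper are hypothetical names used only for illustration. It shows the "sharing of writable data" source of inconsistency under both write policies.

```python
# Two private caches over one shared memory, with NO coherence protocol.
# All names here are illustrative, not taken from the slides.
memory = {"X": 0}
cache_p1 = {"X": memory["X"]}  # processor P1 has read X
cache_p2 = {"X": memory["X"]}  # processor P2 has read X


def write(cache, addr, value, write_through):
    """Write into the local cache; update memory only under write-through."""
    cache[addr] = value
    if write_through:
        memory[addr] = value


# Write-through: memory is updated, but P2's cached copy is now stale.
write(cache_p1, "X", 1, write_through=True)
stale_after_write_through = cache_p2["X"] != memory["X"]

# Write-back: the new value stays in P1's cache, so memory is stale too.
write(cache_p1, "X", 2, write_through=False)
memory_stale_after_write_back = memory["X"] != cache_p1["X"]

print(stale_after_write_through, memory_stale_after_write_back)
```

Both flags come out true, which is the situation the write-invalidate and write-update policies are meant to prevent.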
• Snoopy Bus Protocols
o Write-through caches
• Write-invalidate coherence protocol for write-through caches
• Write-update coherence protocol for write-through caches
• Data item states:
o VALID
o INVALID
• Possible operations:
o R(i): read by the same processor; R(j): read by a different processor
o W(i): write by the same processor; W(j): write by a different processor
o Z(i): replace by the same processor; Z(j): replace by a different processor

o Write-through caches – write invalidate scheme

New state of the local copy for each operation, by current state:

Operation    Current state: Valid    Current state: Invalid
R(i)         Valid                   Valid
W(i)         Valid                   Valid
Z(i)         Invalid                 Invalid
R(j)         Valid                   Invalid
W(j)         Invalid                 Invalid
Z(j)         Valid                   Invalid
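
The write-invalidate transitions for write-through caches can be read as a small state machine for the copy held by processor i. The sketch below encodes them as a lookup table; the names `WT_INVALIDATE` and `run` are illustrative only, and the entries follow the state table for this scheme.

```python
VALID, INVALID = "Valid", "Invalid"

# (current state, operation) -> new state, for the copy in processor i's cache.
# R/W/Z = read/write/replace; (i) = same processor, (j) = a different one.
WT_INVALIDATE = {
    (VALID, "R(i)"): VALID,
    (VALID, "W(i)"): VALID,
    (VALID, "Z(i)"): INVALID,
    (VALID, "R(j)"): VALID,
    (VALID, "W(j)"): INVALID,   # a remote write invalidates the local copy
    (VALID, "Z(j)"): VALID,
    (INVALID, "R(i)"): VALID,   # a local access brings the block in
    (INVALID, "W(i)"): VALID,
    (INVALID, "Z(i)"): INVALID,
    (INVALID, "R(j)"): INVALID,
    (INVALID, "W(j)"): INVALID,
    (INVALID, "Z(j)"): INVALID,
}


def run(state, operations):
    """Apply a sequence of operations and return the final state."""
    for op in operations:
        state = WT_INVALIDATE[(state, op)]
    return state


print(run(INVALID, ["R(i)", "R(j)", "W(j)"]))
```

The example sequence loads the block locally, tolerates a remote read, and then loses the copy to a remote write.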

o Write-back caches
• Ownership protocol: write-invalidate coherence protocol for write-back caches
• Data item states:
o RO : Read Only (Valid state)
o RW : Read Write (Valid state)
o INV : Invalid state
• Possible operations:
o R(i): read by the same processor; R(j): read by a different processor
o W(i): write by the same processor; W(j): write by a different processor
o Z(i): replace by the same processor; Z(j): replace by a different processor

o Write-back caches – write invalidate (ownership protocol) scheme

New state of the local copy for each operation, by current state:

Operation    Current: RO (Valid)    Current: RW (Valid)    Current: INV (Invalid)
R(i)         RO                     RW                     RO
W(i)         RW                     RW                     RW
Z(i)         INV                    INV                    INV
R(j)         RO                     RO                     INV
W(j)         INV                    INV                    INV
Z(j)         RO                     RW                     INV
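
One way to check the ownership protocol is to drive several caches with the same transition table: each operation is seen as (i) by the issuing cache and as (j) by every other cache. The following is a minimal sketch under that reading. `OWNERSHIP`, `apply_op` and `coherent` are illustrative names, and the three-cache trace is a made-up example.

```python
RO, RW, INV = "RO", "RW", "INV"

# New state of one cache's copy, indexed by current state and operation.
OWNERSHIP = {
    RO:  {"R(i)": RO, "W(i)": RW, "Z(i)": INV, "R(j)": RO,  "W(j)": INV, "Z(j)": RO},
    RW:  {"R(i)": RW, "W(i)": RW, "Z(i)": INV, "R(j)": RO,  "W(j)": INV, "Z(j)": RW},
    INV: {"R(i)": RO, "W(i)": RW, "Z(i)": INV, "R(j)": INV, "W(j)": INV, "Z(j)": INV},
}


def apply_op(states, proc, op):
    """Processor `proc` issues op 'R', 'W' or 'Z'; all caches react to it."""
    return [
        OWNERSHIP[state][f"{op}({'i' if p == proc else 'j'})"]
        for p, state in enumerate(states)
    ]


def coherent(states):
    """At most one RW copy, and an RW copy never coexists with RO copies."""
    owners = states.count(RW)
    return owners == 0 or (owners == 1 and states.count(RO) == 0)


states = [INV, INV, INV]  # three caches, block not loaded anywhere
trace = []
for proc, op in [(0, "R"), (1, "R"), (1, "W"), (2, "R"), (0, "W"), (0, "Z")]:
    states = apply_op(states, proc, op)
    trace.append(list(states))
    print(proc, op, states)
```

In this trace the write by processor 1 leaves it as the only RW holder, the later read by processor 2 demotes that copy to RO, and `coherent` holds after every step.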

o Write-once Protocol
• First write using write-through policy
• Subsequent writes using write-back policy
• In both cases, data item copy in remote caches is invalidated
• Data item states:
o Valid: cache block is consistent with the main memory copy
o Reserved: data has been written exactly once and is consistent with the main memory copy
o Dirty: data has been written more than once and is not consistent with the main memory copy
o Invalid: block is not found in the cache or is inconsistent with the main memory copy

• Cache events and actions:
o Read-miss
o Read-hit
o Write-miss
o Write-hit
o Block replacement
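
The local side of the write-once protocol can be sketched as a function from (state, event) to (new state, bus action). The write-hit cases follow directly from the policy above: the first write goes through to memory and later writes stay local. The read-miss, write-miss and replacement cases follow the usual textbook description of write-once and should be treated as assumptions here, as should all names in the code.

```python
from enum import Enum


class State(Enum):
    INVALID = "Invalid"
    VALID = "Valid"
    RESERVED = "Reserved"
    DIRTY = "Dirty"


def local_event(state, event):
    """Return (new_state, bus_action) for an event from the local processor."""
    if event in ("read-hit", "write-hit") and state is State.INVALID:
        raise ValueError("a hit cannot occur on an Invalid block")
    if event == "read-hit":
        return state, None                       # served locally, no change
    if event == "read-miss":
        return State.VALID, "read"               # assumed: block loaded as Valid
    if event == "write-hit":
        if state is State.VALID:
            # First write: write through to memory and invalidate remote copies.
            return State.RESERVED, "write-through + invalidate"
        return State.DIRTY, None                 # later writes: local write-back
    if event == "write-miss":
        return State.DIRTY, "read-invalidate"    # assumed textbook behaviour
    if event == "replace":
        # Assumed: only a Dirty block must be copied back before eviction.
        return State.INVALID, ("write-back" if state is State.DIRTY else None)
    raise ValueError(f"unknown event: {event}")


state = State.INVALID
for event in ["read-miss", "write-hit", "write-hit", "replace"]:
    state, action = local_event(state, event)
    print(event, "->", state.value, "| bus:", action)
```

The example walks a block through Valid, Reserved and Dirty and then evicts it, which is the only step in the sequence that needs a write-back.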

• Multilevel Cache Coherence

• Protocol Performance issues
o Snoopy Cache Protocol Performance determinants:
• Workload Patterns
• Implementation Efficiency
o Goals/Motivation behind using snooping mechanism
• Reduce bus traffic
• Reduce effective memory access time
o Data Pollution Point
• Miss ratio decreases as block size increases, up to a data pollution point (larger blocks raise the probability of finding a desired data item in the cache).
• Beyond the data pollution point, the miss ratio starts to increase as the block size increases further.
o Ping-Pong effect on data shared between multiple caches
• If two processes update a data item alternately, the data will continually migrate between the two caches, with a high miss rate
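
The ping-pong effect can be illustrated with a toy model of a write-invalidate scheme in which only the most recent writer holds a valid copy of the block. This is a simplification assumed for the sketch, and `count_misses` is a hypothetical helper. The same 16 writes are counted once in alternating order and once grouped by processor.

```python
def count_misses(writers):
    """Count misses for a sequence of processor ids writing one shared block.

    Toy model (an assumption): under write-invalidate, only the most recent
    writer holds a valid copy, so any other processor misses on its next write.
    """
    owner = None   # processor currently holding the only valid copy
    misses = 0
    for proc in writers:
        if owner != proc:
            misses += 1    # copy absent or invalidated: block must migrate
            owner = proc   # the writer now holds the sole valid copy
    return misses


alternating = [0, 1] * 8        # P0 and P1 take turns: 16 writes
batched = [0] * 8 + [1] * 8     # the same 16 writes, grouped per processor

print(count_misses(alternating), count_misses(batched))
```

In this model every alternating write misses (16 misses), while the grouped order misses only twice, once per processor.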

THREE GENERATIONS OF MULTICOMPUTERS
• Multicomputer vs. Multiprocessor
• Design Choices for Multicomputers
o Processors
• Low cost commodity (off-the-shelf) processors
o Memory Structure
• Distributed memory organization
• Local memory with each processor
o Interconnection Schemes
• Message-passing, point-to-point, direct networks with send/receive semantics, with or without uniform message communication speed
o Control Strategy
• Asynchronous MIMD, MPMD and SPMD operations

• The Past, Present and Future Development
o First Generation
• Example Systems: Caltech’s Cosmic Cube, Intel iPSC/1, Ametek S/14, nCube/10
o Second Generation
• Example Systems: iPSC/2, i860, Delta, nCube/2, Supernode 1000, Ametek Series 2010
o Third Generation
• Example Systems: Caltech’s Mosaic C, J-Machine, Intel Paragon
o First and second generation multicomputers are regarded as medium-grain systems.
o Third generation multicomputers are regarded as fine-grain systems.
o The fine-grain and shared-memory approach can, in theory, combine the relative merits of multiprocessors and multicomputers in a heterogeneous processing environment.

Attribute                              1st Generation   2nd Generation   3rd Generation
Typical node attributes
  MIPS                                 1                10               100
  MFLOPS (scalar)                      0.1              2                40
  MFLOPS (vector)                      10               40               200
  Memory size (in MB)                  0.5              4                32
Typical system attributes
  Number of nodes (N)                  64               256              1024
  MIPS                                 64               2560             100 K
  MFLOPS (scalar)                      6.4              512              40 K
  MFLOPS (vector)                      640              10 K             200 K
  Memory size (in MB)                  32               1 K              32 K
Communication latency
  Local neighbour (in microseconds)    2000             5                0.5
  Non-local node (in microseconds)     6000             5                0.5
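
The typical system attributes in the table are, to a good approximation, the node attributes multiplied by the number of nodes N. The sketch below redoes that multiplication; the dictionary layout and the `system_totals` helper are illustrative, and the first-generation node rate of 1 MIPS is read from the table.

```python
# Typical node attributes per generation, taken from the table.
generations = {
    "1st": {"nodes": 64,   "mips": 1,   "scalar": 0.1, "vector": 10,  "mem_mb": 0.5},
    "2nd": {"nodes": 256,  "mips": 10,  "scalar": 2,   "vector": 40,  "mem_mb": 4},
    "3rd": {"nodes": 1024, "mips": 100, "scalar": 40,  "vector": 200, "mem_mb": 32},
}


def system_totals(gen):
    """Scale every per-node attribute by the number of nodes."""
    return {k: gen["nodes"] * v for k, v in gen.items() if k != "nodes"}


totals = {name: system_totals(gen) for name, gen in generations.items()}
for name, total in totals.items():
    print(name, total)
```

For example, 256 nodes at 10 MIPS give the 2560 MIPS listed for the second generation, and 1024 nodes at 32 MB give 32768 MB, shown in the table as 32 K.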

MESSAGE PASSING SCHEMES
• Message Routing Schemes
• Message Formats
o Messages
o Packets
o Flits (Flow Control Digits)
• Data Only Flits
• Sequence Number
• Routing Information

• Store-and-forward routing
• Wormhole routing
• Asynchronous Pipelining

• Latency Analysis
o L: Packet length (in bits)
o W: Channel bandwidth (in bits per second)
o D: Distance (number of nodes traversed minus 1)
o F: Flit length (in bits)
o Communication latency in store-and-forward routing
• T_SF = L (D + 1) / W
o Communication latency in wormhole routing
• T_WH = L / W + F D / W
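
The two latency formulas can be compared numerically. The sketch below implements them exactly as written, with L, W, D and F as defined in the latency analysis; the packet size, flit size and bandwidth in the example are made-up values chosen only to show the trend.

```python
def t_store_and_forward(L, W, D):
    """T_SF = L (D + 1) / W: the whole packet is retransmitted at every hop."""
    return L * (D + 1) / W


def t_wormhole(L, W, D, F):
    """T_WH = L / W + F D / W: only the header flit pays the per-hop cost."""
    return L / W + F * D / W


# Hypothetical example values (not from the slides).
L_BITS = 1024       # packet length in bits
F_BITS = 8          # flit length in bits
W_BPS = 1_000_000   # channel bandwidth in bits per second

for D in (1, 4, 16):
    sf = t_store_and_forward(L_BITS, W_BPS, D)
    wh = t_wormhole(L_BITS, W_BPS, D, F_BITS)
    print(f"D={D:2d}  T_SF={sf * 1e3:.3f} ms  T_WH={wh * 1e3:.3f} ms")
```

Because F is much smaller than L, the wormhole latency grows only slowly with distance D, while the store-and-forward latency grows in proportion to D + 1.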

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

37

More Related Content

What's hot

Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing ModelAdlin Jeena
 
system interconnect architectures in ACA
system interconnect architectures in ACAsystem interconnect architectures in ACA
system interconnect architectures in ACAPankaj Kumar Jain
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSKathirvel Ayyaswamy
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed SystemSunita Sahu
 
System models in distributed system
System models in distributed systemSystem models in distributed system
System models in distributed systemishapadhy
 
Chapter 3 instruction level parallelism and its exploitation
Chapter 3 instruction level parallelism and its exploitationChapter 3 instruction level parallelism and its exploitation
Chapter 3 instruction level parallelism and its exploitationsubramaniam shankar
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system modelHarshad Umredkar
 
Computer networks - Channelization
Computer networks - ChannelizationComputer networks - Channelization
Computer networks - ChannelizationElambaruthi Elambaruthi
 
Parallel computing persentation
Parallel computing persentationParallel computing persentation
Parallel computing persentationVIKAS SINGH BHADOURIA
 
Transmission Control Protocol (TCP)
Transmission Control Protocol (TCP)Transmission Control Protocol (TCP)
Transmission Control Protocol (TCP)k33a
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) A B Shinde
 
Communications is distributed systems
Communications is distributed systemsCommunications is distributed systems
Communications is distributed systemsSHATHAN
 
Parallel programming model
Parallel programming modelParallel programming model
Parallel programming modeleasy notes
 
System interconnect architecture
System interconnect architectureSystem interconnect architecture
System interconnect architectureGagan Kumar
 
Centralized shared memory architectures
Centralized shared memory architecturesCentralized shared memory architectures
Centralized shared memory architecturesGokuldhev mony
 

What's hot (20)

Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing Model
 
system interconnect architectures in ACA
system interconnect architectures in ACAsystem interconnect architectures in ACA
system interconnect architectures in ACA
 
Cs8591 u4
Cs8591 u4Cs8591 u4
Cs8591 u4
 
CS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMSCS9222 ADVANCED OPERATING SYSTEMS
CS9222 ADVANCED OPERATING SYSTEMS
 
Introduction to Distributed System
Introduction to Distributed SystemIntroduction to Distributed System
Introduction to Distributed System
 
System models in distributed system
System models in distributed systemSystem models in distributed system
System models in distributed system
 
Chapter 3 instruction level parallelism and its exploitation
Chapter 3 instruction level parallelism and its exploitationChapter 3 instruction level parallelism and its exploitation
Chapter 3 instruction level parallelism and its exploitation
 
Fault tolerance
Fault toleranceFault tolerance
Fault tolerance
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system model
 
Network layer tanenbaum
Network layer tanenbaumNetwork layer tanenbaum
Network layer tanenbaum
 
Computer networks - Channelization
Computer networks - ChannelizationComputer networks - Channelization
Computer networks - Channelization
 
Parallel computing persentation
Parallel computing persentationParallel computing persentation
Parallel computing persentation
 
Distributed Operating System_1
Distributed Operating System_1Distributed Operating System_1
Distributed Operating System_1
 
Transmission Control Protocol (TCP)
Transmission Control Protocol (TCP)Transmission Control Protocol (TCP)
Transmission Control Protocol (TCP)
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism)
 
Communications is distributed systems
Communications is distributed systemsCommunications is distributed systems
Communications is distributed systems
 
Parallel programming model
Parallel programming modelParallel programming model
Parallel programming model
 
Message passing in Distributed Computing Systems
Message passing in Distributed Computing SystemsMessage passing in Distributed Computing Systems
Message passing in Distributed Computing Systems
 
System interconnect architecture
System interconnect architectureSystem interconnect architecture
System interconnect architecture
 
Centralized shared memory architectures
Centralized shared memory architecturesCentralized shared memory architectures
Centralized shared memory architectures
 

Viewers also liked

Aca2 01 new
Aca2 01 newAca2 01 new
Aca2 01 newSumit Mittu
 
Interconnection Network
Interconnection NetworkInterconnection Network
Interconnection NetworkAli A Jalil
 
Directory based cache coherence
Directory based cache coherenceDirectory based cache coherence
Directory based cache coherenceFraboni Ec
 
Lecture 6.1
Lecture  6.1Lecture  6.1
Lecture 6.1Mr SMAK
 
Aca2 08 new
Aca2 08 newAca2 08 new
Aca2 08 newSumit Mittu
 
Chapter 7
Chapter 7 Chapter 7
Chapter 7 carnillr
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11Sumit Mittu
 
Lecture 3
Lecture 3Lecture 3
Lecture 3Mr SMAK
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwangSumedha
 
Cache coherence
Cache coherenceCache coherence
Cache coherenceEmployee
 
Cache memory
Cache memoryCache memory
Cache memoryAnand Goyal
 
Computer architecture
Computer architecture Computer architecture
Computer architecture Ashish Kumar
 
Computer organization memory hierarchy
Computer organization memory hierarchyComputer organization memory hierarchy
Computer organization memory hierarchyAJAL A J
 

Viewers also liked (20)

Aca2 01 new
Aca2 01 newAca2 01 new
Aca2 01 new
 
Interconnection Network
Interconnection NetworkInterconnection Network
Interconnection Network
 
Unit 8
Unit 8Unit 8
Unit 8
 
Directory based cache coherence
Directory based cache coherenceDirectory based cache coherence
Directory based cache coherence
 
Lecture 6.1
Lecture  6.1Lecture  6.1
Lecture 6.1
 
Aca2 08 new
Aca2 08 newAca2 08 new
Aca2 08 new
 
Chapter 7
Chapter 7 Chapter 7
Chapter 7
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwang
 
Cache memory
Cache memoryCache memory
Cache memory
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
 
pipelining
pipeliningpipelining
pipelining
 
Parallel processing Concepts
Parallel processing ConceptsParallel processing Concepts
Parallel processing Concepts
 
Cache memory
Cache memoryCache memory
Cache memory
 
Computer architecture
Computer architecture Computer architecture
Computer architecture
 
Computer organization memory hierarchy
Computer organization memory hierarchyComputer organization memory hierarchy
Computer organization memory hierarchy
 
Cache memory
Cache memoryCache memory
Cache memory
 
04 Cache Memory
04  Cache  Memory04  Cache  Memory
04 Cache Memory
 
Memory hierarchy
Memory hierarchyMemory hierarchy
Memory hierarchy
 

Similar to Aca2 07 new

CS304PC:Computer Organization and Architecture Session 31 Multiprogramming.pptx
CS304PC:Computer Organization and Architecture  Session 31 Multiprogramming.pptxCS304PC:Computer Organization and Architecture  Session 31 Multiprogramming.pptx
CS304PC:Computer Organization and Architecture Session 31 Multiprogramming.pptxAsst.prof M.Gokilavani
 
CS304PC:Computer Organization and Architecture Session 32 Interprocessors co...
CS304PC:Computer Organization and Architecture  Session 32 Interprocessors co...CS304PC:Computer Organization and Architecture  Session 32 Interprocessors co...
CS304PC:Computer Organization and Architecture Session 32 Interprocessors co...Asst.prof M.Gokilavani
 
Lecture 3 parallel programming platforms
Lecture 3   parallel programming platformsLecture 3   parallel programming platforms
Lecture 3 parallel programming platformsVajira Thambawita
 
An octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passingAn octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passingeSAT Journals
 
Ocd lec networks_10-11 (1)
Ocd lec networks_10-11 (1)Ocd lec networks_10-11 (1)
Ocd lec networks_10-11 (1)80094859
 
Parallel Computing - Lec 3
Parallel Computing - Lec 3Parallel Computing - Lec 3
Parallel Computing - Lec 3Shah Zaib
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systemsvampugani
 
Aca2 09 new
Aca2 09 newAca2 09 new
Aca2 09 newSumit Mittu
 
Intro to distributed systems
Intro to distributed systemsIntro to distributed systems
Intro to distributed systemsAhmed Soliman
 
Overview of Distributed Systems
Overview of Distributed SystemsOverview of Distributed Systems
Overview of Distributed Systemsvampugani
 
Distributed systems and scalability rules
Distributed systems and scalability rulesDistributed systems and scalability rules
Distributed systems and scalability rulesOleg Tsal-Tsalko
 
Robust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory InterfaceRobust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory InterfaceIOSRJVSP
 
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012TEST Huddle
 

Similar to Aca2 07 new (20)

CS304PC:Computer Organization and Architecture Session 31 Multiprogramming.pptx
CS304PC:Computer Organization and Architecture  Session 31 Multiprogramming.pptxCS304PC:Computer Organization and Architecture  Session 31 Multiprogramming.pptx
CS304PC:Computer Organization and Architecture Session 31 Multiprogramming.pptx
 
Operating System
Operating SystemOperating System
Operating System
 
CS304PC:Computer Organization and Architecture Session 32 Interprocessors co...
CS304PC:Computer Organization and Architecture  Session 32 Interprocessors co...CS304PC:Computer Organization and Architecture  Session 32 Interprocessors co...
CS304PC:Computer Organization and Architecture Session 32 Interprocessors co...
 
Lecture 3 parallel programming platforms
Lecture 3   parallel programming platformsLecture 3   parallel programming platforms
Lecture 3 parallel programming platforms
 
An octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passingAn octa core processor with shared memory and message-passing
An octa core processor with shared memory and message-passing
 
Ocd lec networks_10-11 (1)
Ocd lec networks_10-11 (1)Ocd lec networks_10-11 (1)
Ocd lec networks_10-11 (1)
 
OS_MD_4.pdf
OS_MD_4.pdfOS_MD_4.pdf
OS_MD_4.pdf
 
Machine Learning @NECST
Machine Learning @NECSTMachine Learning @NECST
Machine Learning @NECST
 
Parallel Computing - Lec 3
Parallel Computing - Lec 3Parallel Computing - Lec 3
Parallel Computing - Lec 3
 
Multi processor
Multi processorMulti processor
Multi processor
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
 
Aca2 09 new
Aca2 09 newAca2 09 new
Aca2 09 new
 
What is 3d torus
What is 3d torusWhat is 3d torus
What is 3d torus
 
Intro to distributed systems
Intro to distributed systemsIntro to distributed systems
Intro to distributed systems
 
Hpc 4 5
Hpc 4 5Hpc 4 5
Hpc 4 5
 
Overview of Distributed Systems
Overview of Distributed SystemsOverview of Distributed Systems
Overview of Distributed Systems
 
Distributed systems and scalability rules
Distributed systems and scalability rulesDistributed systems and scalability rules
Distributed systems and scalability rules
 
Robust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory InterfaceRobust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory Interface
 
Introduction
IntroductionIntroduction
Introduction
 
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
 

More from Sumit Mittu

More from Sumit Mittu (7)

Int306 03
Int306 03Int306 03
Int306 03
 
Int306 02
Int306 02Int306 02
Int306 02
 
Int306 01
Int306 01Int306 01
Int306 01
 
Int306 00
Int306 00Int306 00
Int306 00
 
Int306 04
Int306 04Int306 04
Int306 04
 
Aca2 06 new
Aca2 06 newAca2 06 new
Aca2 06 new
 
Aca11 bk2 ch9
Aca11 bk2 ch9Aca11 bk2 ch9
Aca11 bk2 ch9
 

Recently uploaded

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)Dr. Mazin Mohamed alkathiri
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...M56BOOKSTORE PRODUCT/SERVICE
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 

Recently uploaded (20)

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
Presiding Officer Training module 2024 lok sabha elections

Aca2 07 new

  • 1. CSE539: Advanced Computer Architecture Chapter 7 Multiprocessors and Multicomputers Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani Sumit Mittu Assistant Professor, CSE/IT Lovely Professional University sumit.12735@lpu.co.in
  • 2. In this chapter…
      • Multiprocessor System Interconnects
      • Cache Coherence and Synchronization Mechanisms
      • Three Generations of Multi-computers
      • Message Routing Schemes
  • 3. MULTIPROCESSOR SYSTEM INTERCONNECTS
  • 4. MULTIPROCESSOR SYSTEM INTERCONNECTS
      • Network Characteristics
          o Topology
              • Dynamic Networks
          o Timing control protocol
              • Synchronous (with global clock)
              • Asynchronous (with handshake or interlocking mechanism)
          o Switching method
              • Circuit switching
              • Packet switching
          o Control Strategy
              • Centralized (a global controller receives requests from all devices and grants network access)
              • Distributed (requests handled by local devices independently)
  • 5. MULTIPROCESSOR SYSTEM INTERCONNECTS
      • Hierarchical Bus System
          o Local Bus (board level)
              • Memory bus, data bus
          o Backplane Bus (backplane level)
              • VME bus (IEEE 1014-1987), Multibus II (IEEE 1296-1987), Futurebus+ (IEEE 896.1-1991)
          o I/O Bus (I/O level)
          o E.g. Encore Multimax multiprocessor’s Nanobus
              • 20 slots
              • 32-bit address path
              • 64-bit data path
              • Clock rate: 12.5 MHz
              • Total memory bandwidth: 100 megabytes per second
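The Nanobus figures on this slide are mutually consistent, which a short calculation shows. This is a minimal sketch that assumes one full-width data transfer per clock cycle and 1 megabyte = 10^6 bytes; the function name is hypothetical and not from the slides.

```python
def peak_bus_bandwidth(data_path_bits: int, clock_hz: float) -> float:
    """Peak bandwidth in bytes per second.

    Assumption: the bus moves one full data-path word on every clock cycle.
    """
    bytes_per_transfer = data_path_bits / 8
    return bytes_per_transfer * clock_hz


# Values from the slide: 64-bit data path, 12.5 MHz clock.
nanobus_bytes_per_s = peak_bus_bandwidth(64, 12.5e6)
print(f"{nanobus_bytes_per_s / 1e6:.0f} MB/s")
```

Under these assumptions the 64-bit path carries 8 bytes per cycle, and 8 bytes × 12.5 million cycles per second gives the quoted 100 megabytes per second.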
  • 6. MULTIPROCESSOR SYSTEM INTERCONNECTS
  • 7. MULTIPROCESSOR SYSTEM INTERCONNECTS
      • Hierarchical Buses and Caches
          o Cache Levels
              • First level caches
              • Second level caches
          o Buses
              • (Intra) Cluster bus
              • Inter-cluster bus
          o Cache coherence
              • Snoopy cache protocol for coherence among first level caches of the same cluster
              • Inter-cluster cache coherence is controlled among second level caches, and the results are passed to first level caches
          o Use of bridges between multiprocessor clusters
  • 8. MULTIPROCESSOR SYSTEM INTERCONNECTS
      • Hierarchical Buses and Caches
  • 9. MULTIPROCESSOR SYSTEM INTERCONNECTS
  • 10. MULTIPROCESSOR SYSTEM INTERCONNECTS
  • 11. MULTIPROCESSOR SYSTEM INTERCONNECTS • Crossbar Switch Design o Based on number of network stages • Single stage (or recirculating) networks • Multistage networks o Blocking networks o Non-blocking (re-arranging) networks • Crossbar networks o n × m and n² cross-point switch design o Crossbar benefits and limitations • Multiport Memory Design o Multiport Memory
  • 12. MULTIPROCESSOR SYSTEM INTERCONNECTS
  • 13. MULTIPROCESSOR SYSTEM INTERCONNECTS
  • 14. CACHE COHERENCE MECHANISMS
      • Cache Coherence Problem
          o Inconsistent copies of the same memory block in different caches
          o Sources of inconsistency:
              • Sharing of writable data
              • Process migration
              • I/O activity
      • Protocol Approaches
          o Snoopy Bus Protocol
          o Directory Based Protocol
      • Write Policies
          o (Write-back, Write-through) × (Write-invalidate, Write-update)
  • 15. CACHE COHERENCE MECHANISMS
  • 16. CACHE COHERENCE MECHANISMS
  • 17. CACHE COHERENCE MECHANISMS
  • 18. CACHE COHERENCE MECHANISMS
  • 19. CACHE COHERENCE MECHANISMS
      • Snoopy Bus Protocols
          o Write-through caches
              • Write-invalidate coherence protocol for write-through caches
              • Write-update coherence protocol for write-through caches
              • Data item states:
                  o VALID
                  o INVALID
              • Possible operations:
                  o Read by same processor: R(i); read by different processor: R(j)
                  o Write by same processor: W(i); write by different processor: W(j)
                  o Replace by same processor: Z(i); replace by different processor: Z(j)
  • 20. CACHE COHERENCE MECHANISMS
  • 21. CACHE COHERENCE MECHANISMS
      • Snoopy Bus Protocols
          o Write-through caches – write invalidate scheme

          Current State   Operation   New State
          Valid           R(i)        Valid
          Valid           W(i)        Valid
          Valid           Z(i)        Invalid
          Valid           R(j)        Valid
          Valid           W(j)        Invalid
          Valid           Z(j)        Valid
          Invalid         R(i)        Valid
          Invalid         W(i)        Valid
          Invalid         Z(i)        Invalid
          Invalid         R(j)        Invalid
          Invalid         W(j)        Invalid
          Invalid         Z(j)        Invalid
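The write-invalidate scheme for write-through caches can be sketched as a table lookup. The entries below are a reading of this slide's two-state table (R, W, Z for read, write, replace; i for the local processor, j for any other); the names `WT_INVALIDATE` and `run` are hypothetical and used only for illustration.

```python
VALID, INVALID = "Valid", "Invalid"

# (current state, operation) -> new state, for one block in the cache of processor i.
WT_INVALIDATE = {
    (VALID, "R(i)"): VALID,
    (VALID, "W(i)"): VALID,
    (VALID, "Z(i)"): INVALID,
    (VALID, "R(j)"): VALID,
    (VALID, "W(j)"): INVALID,   # a remote write invalidates the local copy
    (VALID, "Z(j)"): VALID,
    (INVALID, "R(i)"): VALID,   # local read miss loads the block
    (INVALID, "W(i)"): VALID,   # local write miss loads and updates the block
    (INVALID, "Z(i)"): INVALID,
    (INVALID, "R(j)"): INVALID,
    (INVALID, "W(j)"): INVALID,
    (INVALID, "Z(j)"): INVALID,
}


def run(state, operations):
    """Apply a sequence of operations and return the final state."""
    for op in operations:
        state = WT_INVALIDATE[(state, op)]
    return state


# Local read, remote write, local write, local replacement.
print(run(INVALID, ["R(i)", "W(j)", "W(i)", "Z(i)"]))
```

Only a remote write W(j) or a local replacement Z(i) ever moves the block to Invalid; remote reads and remote replacements leave the local state unchanged.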
  • 22. CACHE COHERENCE MECHANISMS
      • Snoopy Bus Protocols
          o Write-back caches
              • Ownership protocol: write-invalidate coherence protocol for write-back caches
              • Data item states:
                  o RO : Read Only (Valid state)
                  o RW : Read Write (Valid state)
                  o INV : Invalid state
              • Possible operations:
                  o Read by same processor: R(i); read by different processor: R(j)
                  o Write by same processor: W(i); write by different processor: W(j)
                  o Replace by same processor: Z(i); replace by different processor: Z(j)
  • 23. CACHE COHERENCE MECHANISMS
      • Snoopy Bus Protocols
          o Write-back caches – write invalidate (ownership protocol) scheme

          Current State   Operation   New State
          RO (Valid)      R(i)        RO
          RO (Valid)      W(i)        RW
          RO (Valid)      Z(i)        INV
          RO (Valid)      R(j)        RO
          RO (Valid)      W(j)        INV
          RO (Valid)      Z(j)        RO
          RW (Valid)      R(i)        RW
          RW (Valid)      W(i)        RW
          RW (Valid)      Z(i)        INV
          RW (Valid)      R(j)        RO
          RW (Valid)      W(j)        INV
          RW (Valid)      Z(j)        RW
          INV (Invalid)   R(i)        RO
          INV (Invalid)   W(i)        RW
          INV (Invalid)   Z(i)        INV
          INV (Invalid)   R(j)        INV
          INV (Invalid)   W(j)        INV
          INV (Invalid)   Z(j)        INV
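The three-state ownership protocol for write-back caches can be sketched the same way. The transitions below are a reading of this slide's table and should be treated as an assumption wherever the slide is hard to read; `OWNERSHIP` and `trace` are hypothetical names.

```python
# State of one block in the cache of processor i:
# RO = read only (valid), RW = read-write (valid, owned), INV = invalid.
OWNERSHIP = {
    "RO":  {"R(i)": "RO", "W(i)": "RW", "Z(i)": "INV",
            "R(j)": "RO", "W(j)": "INV", "Z(j)": "RO"},
    "RW":  {"R(i)": "RW", "W(i)": "RW", "Z(i)": "INV",
            "R(j)": "RO", "W(j)": "INV", "Z(j)": "RW"},
    "INV": {"R(i)": "RO", "W(i)": "RW", "Z(i)": "INV",
            "R(j)": "INV", "W(j)": "INV", "Z(j)": "INV"},
}


def trace(state, operations):
    """Return the list of states visited, starting with the initial state."""
    visited = [state]
    for op in operations:
        state = OWNERSHIP[state][op]
        visited.append(state)
    return visited


# Local read miss, local write (take ownership), remote read, remote write.
print(trace("INV", ["R(i)", "W(i)", "R(j)", "W(j)"]))
```

The difference from the write-through table is the RW state: a local write makes the cache the owner of the block, a remote read demotes the copy to RO, and a remote write invalidates it.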
  • 24. CACHE COHERENCE MECHANISMS
  • 25. CACHE COHERENCE MECHANISMS
      • Snoopy Bus Protocols
          o Write-once Protocol
              • First write using write-through policy
              • Subsequent writes using write-back policy
              • In both cases, the data item copy in remote caches is invalidated
              • Data item states:
                  o Valid: cache block is consistent with the main memory copy
                  o Reserved: data has been written exactly once and is consistent with the main memory copy
                  o Dirty: data has been written more than once and is not consistent with the main memory copy
                  o Invalid: block is not found in the cache or is inconsistent with the main memory copy
  • 26. CACHE COHERENCE MECHANISMS
      • Snoopy Bus Protocols
          o Write-once Protocol
              • Cache events and actions:
                  o Read-miss
                  o Read-hit
                  o Write-miss
                  o Write-hit
                  o Block replacement
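A minimal sketch of the write-once state machine for a single cache block follows. The slides list the states and events but not the transitions, so the transitions here are an assumption based on the usual description of the write-once protocol: the first write to a Valid block is written through and leaves it Reserved, later writes leave it Dirty, a write miss loads the block and leaves it Dirty, and any remote write invalidates the local copy. All names are hypothetical.

```python
from enum import Enum


class State(Enum):
    INVALID = "Invalid"
    VALID = "Valid"
    RESERVED = "Reserved"
    DIRTY = "Dirty"


def on_local(state: State, event: str) -> State:
    """Next state after an event issued by the local processor."""
    if event == "read":
        # Read-miss loads the block as Valid; read-hit changes nothing.
        return State.VALID if state is State.INVALID else state
    if event == "write":
        if state is State.VALID:
            return State.RESERVED      # first write: write-through
        return State.DIRTY             # write-miss or later write: write-back
    if event == "replace":
        return State.INVALID           # a Dirty block is written back first
    raise ValueError(f"unknown event: {event}")


def on_remote(state: State, event: str) -> State:
    """Next state after snooping an event from another processor."""
    if event == "read":
        # A Reserved or Dirty copy becomes an ordinary shared Valid copy.
        return State.VALID if state in (State.RESERVED, State.DIRTY) else state
    if event == "write":
        return State.INVALID           # remote writes always invalidate
    raise ValueError(f"unknown event: {event}")


s = State.INVALID
for step in ("read", "write", "write"):
    s = on_local(s, step)
    print(step, "->", s.value)
```

Starting from Invalid, the local sequence read, write, write visits Valid, Reserved, and Dirty in turn, which matches the "written exactly once" and "written more than once" wording of the state definitions.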
  • 27. CACHE COHERENCE MECHANISMS
      • Multilevel Cache Coherence
  • 28. CACHE COHERENCE MECHANISMS
      • Protocol Performance Issues
          o Snoopy cache protocol performance determinants:
              • Workload patterns
              • Implementation efficiency
          o Goals/motivation behind using the snooping mechanism
              • Reduce bus traffic
              • Reduce effective memory access time
          o Data Pollution Point
              • The miss ratio decreases as the block size increases, up to a data pollution point (that is, as blocks become larger, the probability of finding a desired data item in the cache increases).
              • Beyond the data pollution point, the miss ratio starts to increase as the block size increases further.
          o Ping-pong effect on data shared between multiple caches
              • If two processes update a data item alternately, the data will continually migrate between the two caches, with a high miss rate
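The ping-pong effect can be illustrated with a toy count of write misses. This is a sketch under simplifying assumptions that are not from the slides: a single shared block, a write-invalidate protocol, and a write counted as a miss whenever the writer holds no valid copy. The function name is hypothetical.

```python
def count_write_misses(writers):
    """Count write misses on one shared block under write-invalidate.

    `writers` is the sequence of processor ids that write the block, in order.
    """
    has_valid_copy = {}
    misses = 0
    for p in writers:
        if not has_valid_copy.get(p, False):
            misses += 1                      # block must be fetched again
        for q in has_valid_copy:
            has_valid_copy[q] = False        # invalidate every cached copy
        has_valid_copy[p] = True             # the writer now holds the only valid copy
    return misses


alternating = count_write_misses([0, 1] * 5)   # P0, P1, P0, P1, ...
single = count_write_misses([0] * 10)          # P0 only
print(alternating, single)
```

With ten alternating writes every write misses, because the previous write by the other processor invalidated the local copy; with ten writes from one processor only the first one misses.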
  • 29. THREE GENERATIONS OF MULTICOMPUTERS
      • Multicomputer v/s Multiprocessor
      • Design Choices for Multi-computers
          o Processors
              • Low cost commodity (off-the-shelf) processors
          o Memory Structure
              • Distributed memory organization
              • Local memory with each processor
          o Interconnection Schemes
              • Message passing, point-to-point, direct networks with send/receive semantics, with/without uniform message communication speed
          o Control Strategy
              • Asynchronous MIMD, MPMD and SPMD operations
  • 30. THREE GENERATIONS OF MULTICOMPUTERS
  • 31. THREE GENERATIONS OF MULTICOMPUTERS
      • The Past, Present and Future Development
          o First Generation
              • Example Systems: Caltech’s Cosmic Cube, Intel iPSC/1, Ametek S/14, nCube/10
          o Second Generation
              • Example Systems: iPSC/2, i860, Delta, nCube/2, Supernode 1000, Ametek Series 2010
          o Third Generation
              • Example Systems: Caltech’s Mosaic C, J-Machine, Intel Paragon
          o First and second generation multi-computers are regarded as medium-grain systems
          o Third generation multi-computers were regarded as fine-grain systems.
          o Fine-grain and shared memory approach can, in theory, combine the relative merits of multiprocessors and multi-computers in a heterogeneous processing environment.
  • 32. THREE GENERATIONS OF MULTICOMPUTERS

                                                  1st Generation   2nd Generation   3rd Generation
      Typical Node Attributes
          MIPS                                    1                10               100
          MFLOPS (scalar)                         0.1              2                40
          MFLOPS (vector)                         10               40               200
          Memory Size (in MB)                     0.5              4                32
      Typical System Attributes
          Number of Nodes (N)                     64               256              1024
          MIPS                                    64               2560             100 K
          MFLOPS (scalar)                         6.4              512              40 K
          MFLOPS (vector)                         640              10 K             200 K
          Memory Size (in MB)                     32               1 K              32 K
      Communication Latency
          Local Neighbour (in microseconds)       2000             5                0.5
          Non-local node (in microseconds)        6000             5                0.5
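The system-level figures in this table are consistent with multiplying each node attribute by the number of nodes N. A short check follows; it assumes the first-generation node rating is 1 MIPS (the digit is fused into the slide title in the transcript), and the dictionary keys are hypothetical labels.

```python
# Per-node attributes and node counts, as read from the slide.
GENERATIONS = {
    "1st": {"nodes": 64,   "mips": 1,   "mflops_scalar": 0.1, "mflops_vector": 10,  "memory_mb": 0.5},
    "2nd": {"nodes": 256,  "mips": 10,  "mflops_scalar": 2,   "mflops_vector": 40,  "memory_mb": 4},
    "3rd": {"nodes": 1024, "mips": 100, "mflops_scalar": 40,  "mflops_vector": 200, "memory_mb": 32},
}


def system_totals(generation):
    """Scale every per-node attribute by the number of nodes."""
    n = generation["nodes"]
    return {name: n * value for name, value in generation.items() if name != "nodes"}


for label, generation in GENERATIONS.items():
    print(label, system_totals(generation))
```

The products reproduce the system rows, with the third-generation values rounded on the slide (for example 1024 × 100 = 102,400 MIPS is listed as 100 K, and 1024 × 32 = 32,768 MB as 32 K).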
  • 33. THREE GENERATIONS OF MULTICOMPUTERS
  • 34. MESSAGE PASSING SCHEMES
      • Message Routing Schemes
      • Message Formats
          o Messages
          o Packets
          o Flits (flow control digits)
              • Data Only Flits
              • Sequence Number
              • Routing Information
      • Store-and-forward routing
      • Wormhole routing
  • 35. MESSAGE PASSING SCHEMES
  • 36. MESSAGE PASSING SCHEMES
      • Asynchronous Pipelining
  • 37. MESSAGE PASSING SCHEMES
      • Latency Analysis
          o L: Packet length (in bits)
          o W: Channel bandwidth (in bits per second)
          o D: Distance (number of nodes traversed minus 1)
          o F: Flit length (in bits)
          o Communication latency in store-and-forward routing
              • T_SF = L (D + 1) / W
          o Communication latency in wormhole routing
              • T_WH = L / W + F D / W
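The two latency formulas can be compared numerically. The sketch below uses the slide's symbols (L, W, D, F); the function names and the sample values are hypothetical and chosen only to make the difference visible.

```python
def store_and_forward_latency(L: float, W: float, D: int) -> float:
    """T_SF = L (D + 1) / W, in seconds: the whole packet is retransmitted on every hop."""
    return L * (D + 1) / W


def wormhole_latency(L: float, W: float, D: int, F: float) -> float:
    """T_WH = L / W + F D / W, in seconds: only the header flit pays the per-hop cost."""
    return L / W + F * D / W


# Hypothetical example: 1000-bit packet, 10-bit flits, 1 Mbit/s channels, D = 9.
L, F, W, D = 1000, 10, 1e6, 9
t_sf = store_and_forward_latency(L, W, D)
t_wh = wormhole_latency(L, W, D, F)
print(f"store-and-forward: {t_sf * 1e3:.2f} ms, wormhole: {t_wh * 1e3:.2f} ms")
```

Store-and-forward latency grows in proportion to D + 1, while in wormhole routing the distance term F D / W is small whenever the flit is much shorter than the packet, so the latency is nearly independent of distance.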