SlideShare a Scribd company logo
1 of 17
1
Snooping Protocols
• Topics: snooping-based cache coherence implementations
2
Design Issues, Optimizations
• When does memory get updated?
 demotion from modified to shared?
 move from modified in one cache to modified in another?
• Who responds with data? – memory or a cache that has
the block in exclusive state – does it help if sharers respond?
• We can assume that bus, memory, and cache state
transactions are atomic – if not, we will need more states
• A transition from shared to modified only requires an upgrade
request and no transfer of data
• Is the protocol simpler for a write-through cache?
3
4-State Protocol
• Multiprocessors execute many single-threaded programs
• A read followed by a write will generate bus transactions
to acquire the block in exclusive state even though there
are no sharers
• Note that we can optimize protocols by adding more
states – increases design/verification complexity
4
MESI Protocol
• The new state is exclusive-clean – the cache can service
read requests and no other cache has the same block
• When the processor attempts a write, the block is
upgraded to exclusive-modified without generating a bus
transaction
• When a processor makes a read request, it must detect
if it has the only cached copy – the interconnect must
include an additional signal that is asserted by each
cache if it has a valid copy of the block
5
Design Issues
• When caches evict blocks, they do not inform other
caches – it is possible to have a block in shared state
even though it is an exclusive-clean copy
• Cache-to-cache sharing: SRAM vs. DRAM latencies,
contention in remote caches, protocol complexities
(memory has to wait, which cache responds), can be
especially useful in distributed memory systems
• The protocol can be improved by adding a fifth
state (owner – MOESI) – the owner services reads
(instead of memory)
6
Update Protocol (Dragon)
• 4-state write-back update protocol, first used in the
Dragon multiprocessor (1984)
• Write-back update is not the same as write-through –
on a write, only caches are updated, not memory
• Goal: writes may usually not be on the critical path, but
subsequent reads may be
7
4 States
• No invalid state
• Modified and Exclusive-clean as before: used when there
is a sole cached copy
• Shared-clean: potentially multiple caches have this block
and main memory may or may not be up-to-date
• Shared-modified: potentially multiple caches have this
block, main memory is not up-to-date, and this cache
must update memory – only one block can be in Sm state
• In reality, one state would have sufficed – more states
to reduce traffic
8
Design Issues
• If the update is also sent to main memory, the Sm
state can be eliminated
• If all caches are informed when a block is evicted, the
block can be moved from shared to M or E – this can
help save future bus transactions
• Having an extra wire to determine exclusivity seems
like a worthy trade-off in update systems
9
State Transitions
To
From
NP I E S M
NP 0 0 1.25 0.96 1.68
I 0.64 0 0 1.87 0.002
E 0.20 0 14.0 0.02 1.00
S 0.42 2.5 0 134.7 2.24
M 2.63 0.002 0 2.3 843.6
To
From
NP I E S M
NP -- -- BusRd BusRd BusRdX
I -- -- BusRd BusRd BusRdX
E -- -- -- -- --
S -- -- Not possible -- BusUpgr
M BusWB BusWB Not possible BusWB --
State transitions
per 1000 data
memory references
for Ocean
Bus actions
for each state
transition
NP – Not Present
10
Basic Implementation
• Assume single level of cache, atomic bus transactions
• It is simpler to implement a processor-side cache
controller that monitors requests from the processor and
a bus-side cache controller that services the bus
• Both controllers are constantly trying to read tags
 tags can be duplicated (moderate area overhead)
 unlike data, tags are rarely updated
 tag updates stall the other controller
11
Reporting Snoop Results
• Uniprocessor system: initiator places address on bus, all
devices monitor address, one device acks by raising a
wired-OR signal, data is transferred
• In a multiprocessor, memory has to wait for the snoop
result before it chooses to respond – need 3 wired-OR
signals: (i) indicates that a cache has a copy, (ii) indicates
that a cache has a modified copy, (iii) indicates that the
snoop has not completed
• Ensuring timely snoops: the time to respond could be
fixed or variable (with the third wired-OR signal), or the
memory could track if a cache has a block in M state
12
Non-Atomic State Transitions
• Note that a cache controller’s actions are not all atomic: tag
look-up, bus arbitration, bus transaction, data/tag update
• Consider this: block A in shared state in P1 and P2; both
issue a write; the bus controllers are ready to issue an
upgrade request and try to acquire the bus; is there a
problem?
• The controller can keep track of additional intermediate
states so it can react to bus traffic (e.g. SM, IM, IS,E)
• Alternatively, eliminate upgrade request; use the shared
wire to suppress memory’s response to an exclusive-rd
13
Livelock
• Livelock can happen if the processor-cache handshake
is not designed correctly
• Before the processor can attempt the write, it must
acquire the block in exclusive state
• If all processors are writing to the same block, one of
them acquires the block first – if another exclusive request
is seen on the bus, the cache controller must wait for the
processor to complete the write before releasing the block
-- else, the processor’s write will fail again because the
block would be in invalid state
14
Split Transaction Bus
• What would it take to implement the protocol correctly
while assuming a split transaction bus?
• Split transaction bus: a cache puts out a request, releases
the bus (so others can use the bus), receives its response
much later
• Assumptions:
 only one request per block can be outstanding
 separate lines for addr (request) and data (response)
15
Split Transaction Bus
Proc 1
Cache
Proc 2
Cache
Proc 3
Cache
Request lines
Response lines
16
Design Issues
• When does the snoop complete? What if the snoop takes
a long time?
• What if the buffer in a processor/memory is full? When
does the buffer release an entry? Are the buffers identical?
• How does each processor ensure that a block does not
have multiple outstanding requests?
• What determines the write order – requests or responses?
17
Design Issues II
• What happens if a processor is arbitrating for the bus and
witnesses another bus transaction for the same address?
• If the processor issues a read miss and there is already a
matching read in the request table, can we reduce bus
traffic?

More Related Content

What's hot

Multi core-architecture
Multi core-architectureMulti core-architecture
Multi core-architecturePiyush Mittal
 
INTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptxINTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptxLECO9
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systemsvampugani
 
Presentation on Shared Memory Parallel Programming
Presentation on Shared Memory Parallel ProgrammingPresentation on Shared Memory Parallel Programming
Presentation on Shared Memory Parallel ProgrammingVengada Karthik Rangaraju
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelismprashantdahake
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systemssumitjain2013
 
Congestion avoidance in TCP
Congestion avoidance in TCPCongestion avoidance in TCP
Congestion avoidance in TCPselvakumar_b1985
 
Code optimization in compiler design
Code optimization in compiler designCode optimization in compiler design
Code optimization in compiler designKuppusamy P
 
Operating system support in distributed system
Operating system support in distributed systemOperating system support in distributed system
Operating system support in distributed systemishapadhy
 
Logical Clocks (Distributed computing)
Logical Clocks (Distributed computing)Logical Clocks (Distributed computing)
Logical Clocks (Distributed computing)Sri Prasanna
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture Haris456
 
Multiprocessor architecture
Multiprocessor architectureMultiprocessor architecture
Multiprocessor architectureArpan Baishya
 
Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory MultiprocessorsSalvatore La Bua
 
Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]Ravindra Raju Kolahalam
 
Distributed shred memory architecture
Distributed shred memory architectureDistributed shred memory architecture
Distributed shred memory architectureMaulik Togadiya
 

What's hot (20)

Cache coherence
Cache coherenceCache coherence
Cache coherence
 
Lecture 3 threads
Lecture 3   threadsLecture 3   threads
Lecture 3 threads
 
Multi core-architecture
Multi core-architectureMulti core-architecture
Multi core-architecture
 
INTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptxINTER PROCESS COMMUNICATION (IPC).pptx
INTER PROCESS COMMUNICATION (IPC).pptx
 
Introduction to OpenMP
Introduction to OpenMPIntroduction to OpenMP
Introduction to OpenMP
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
 
Presentation on Shared Memory Parallel Programming
Presentation on Shared Memory Parallel ProgrammingPresentation on Shared Memory Parallel Programming
Presentation on Shared Memory Parallel Programming
 
Hardware and Software parallelism
Hardware and Software parallelismHardware and Software parallelism
Hardware and Software parallelism
 
Fault tolerance in distributed systems
Fault tolerance in distributed systemsFault tolerance in distributed systems
Fault tolerance in distributed systems
 
Congestion avoidance in TCP
Congestion avoidance in TCPCongestion avoidance in TCP
Congestion avoidance in TCP
 
Code optimization in compiler design
Code optimization in compiler designCode optimization in compiler design
Code optimization in compiler design
 
Operating system support in distributed system
Operating system support in distributed systemOperating system support in distributed system
Operating system support in distributed system
 
Logical Clocks (Distributed computing)
Logical Clocks (Distributed computing)Logical Clocks (Distributed computing)
Logical Clocks (Distributed computing)
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
 
Multiprocessor architecture
Multiprocessor architectureMultiprocessor architecture
Multiprocessor architecture
 
11. dfs
11. dfs11. dfs
11. dfs
 
Knapsack problem using fixed tuple
Knapsack problem using fixed tupleKnapsack problem using fixed tuple
Knapsack problem using fixed tuple
 
Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory Multiprocessors
 
Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]Inter Process Communication Presentation[1]
Inter Process Communication Presentation[1]
 
Distributed shred memory architecture
Distributed shred memory architectureDistributed shred memory architecture
Distributed shred memory architecture
 

Similar to Snooping protocols 3

247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Introduction 1
Introduction 1Introduction 1
Introduction 1Yasir Khan
 
Memory Management Strategies - II.pdf
Memory Management Strategies - II.pdfMemory Management Strategies - II.pdf
Memory Management Strategies - II.pdfHarika Pudugosula
 
8 memory management strategies
8 memory management strategies8 memory management strategies
8 memory management strategiesDr. Loganathan R
 
Chip Multithreading Systems Need a New Operating System Scheduler
Chip Multithreading Systems Need a New Operating System Scheduler Chip Multithreading Systems Need a New Operating System Scheduler
Chip Multithreading Systems Need a New Operating System Scheduler Sarwan ali
 
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSINGADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING Zena Abo-Altaheen
 
coa-Unit5-ppt1 (1).pptx
coa-Unit5-ppt1 (1).pptxcoa-Unit5-ppt1 (1).pptx
coa-Unit5-ppt1 (1).pptxRuhul Amin
 

Similar to Snooping protocols 3 (20)

Snooping 2
Snooping 2Snooping 2
Snooping 2
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Mesi
MesiMesi
Mesi
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
 
12-6810-12.ppt
12-6810-12.ppt12-6810-12.ppt
12-6810-12.ppt
 
Cache Coherence.pptx
Cache Coherence.pptxCache Coherence.pptx
Cache Coherence.pptx
 
Ch8 main memory
Ch8   main memoryCh8   main memory
Ch8 main memory
 
Memory Management.pdf
Memory Management.pdfMemory Management.pdf
Memory Management.pdf
 
Introduction 1
Introduction 1Introduction 1
Introduction 1
 
Week5
Week5Week5
Week5
 
Memory Management Strategies - II.pdf
Memory Management Strategies - II.pdfMemory Management Strategies - II.pdf
Memory Management Strategies - II.pdf
 
8 memory management strategies
8 memory management strategies8 memory management strategies
8 memory management strategies
 
Chip Multithreading Systems Need a New Operating System Scheduler
Chip Multithreading Systems Need a New Operating System Scheduler Chip Multithreading Systems Need a New Operating System Scheduler
Chip Multithreading Systems Need a New Operating System Scheduler
 
cache memory
 cache memory cache memory
cache memory
 
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSINGADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
 
Distributed system
Distributed systemDistributed system
Distributed system
 
Cache optimization
Cache optimizationCache optimization
Cache optimization
 
27 multicore
27 multicore27 multicore
27 multicore
 
coa-Unit5-ppt1 (1).pptx
coa-Unit5-ppt1 (1).pptxcoa-Unit5-ppt1 (1).pptx
coa-Unit5-ppt1 (1).pptx
 
27 multicore
27 multicore27 multicore
27 multicore
 

More from Yasir Khan (20)

Lecture 6
Lecture 6Lecture 6
Lecture 6
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
Lec#1
Lec#1Lec#1
Lec#1
 
Ch10 (1)
Ch10 (1)Ch10 (1)
Ch10 (1)
 
Ch09
Ch09Ch09
Ch09
 
Ch05
Ch05Ch05
Ch05
 
Hpc sys
Hpc sysHpc sys
Hpc sys
 
Hpc 6 7
Hpc 6 7Hpc 6 7
Hpc 6 7
 
Hpc 4 5
Hpc 4 5Hpc 4 5
Hpc 4 5
 
Hpc 3
Hpc 3Hpc 3
Hpc 3
 
Hpc 2
Hpc 2Hpc 2
Hpc 2
 
Hpc 1
Hpc 1Hpc 1
Hpc 1
 
Flynns classification
Flynns classificationFlynns classification
Flynns classification
 
Dir based imp_5
Dir based imp_5Dir based imp_5
Dir based imp_5
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Uncertainity
Uncertainity Uncertainity
Uncertainity
 
Logic
LogicLogic
Logic
 
M6 game
M6 gameM6 game
M6 game
 

Recently uploaded

FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxAmanpreet Kaur
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 

Recently uploaded (20)

FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 

Snooping protocols 3

  • 1. 1 Snooping Protocols • Topics: snooping-based cache coherence implementations
  • 2. 2 Design Issues, Optimizations • When does memory get updated?  demotion from modified to shared?  move from modified in one cache to modified in another? • Who responds with data? – memory or a cache that has the block in exclusive state – does it help if sharers respond? • We can assume that bus, memory, and cache state transactions are atomic – if not, we will need more states • A transition from shared to modified only requires an upgrade request and no transfer of data • Is the protocol simpler for a write-through cache?
  • 3. 3 4-State Protocol • Multiprocessors execute many single-threaded programs • A read followed by a write will generate bus transactions to acquire the block in exclusive state even though there are no sharers • Note that we can optimize protocols by adding more states – increases design/verification complexity
  • 4. 4 MESI Protocol • The new state is exclusive-clean – the cache can service read requests and no other cache has the same block • When the processor attempts a write, the block is upgraded to exclusive-modified without generating a bus transaction • When a processor makes a read request, it must detect if it has the only cached copy – the interconnect must include an additional signal that is asserted by each cache if it has a valid copy of the block
  • 5. 5 Design Issues • When caches evict blocks, they do not inform other caches – it is possible to have a block in shared state even though it is an exclusive-clean copy • Cache-to-cache sharing: SRAM vs. DRAM latencies, contention in remote caches, protocol complexities (memory has to wait, which cache responds), can be especially useful in distributed memory systems • The protocol can be improved by adding a fifth state (owner – MOESI) – the owner services reads (instead of memory)
  • 6. 6 Update Protocol (Dragon) • 4-state write-back update protocol, first used in the Dragon multiprocessor (1984) • Write-back update is not the same as write-through – on a write, only caches are updated, not memory • Goal: writes may usually not be on the critical path, but subsequent reads may be
  • 7. 7 4 States • No invalid state • Modified and Exclusive-clean as before: used when there is a sole cached copy • Shared-clean: potentially multiple caches have this block and main memory may or may not be up-to-date • Shared-modified: potentially multiple caches have this block, main memory is not up-to-date, and this cache must update memory – only one block can be in Sm state • In reality, one state would have sufficed – more states to reduce traffic
  • 8. 8 Design Issues • If the update is also sent to main memory, the Sm state can be eliminated • If all caches are informed when a block is evicted, the block can be moved from shared to M or E – this can help save future bus transactions • Having an extra wire to determine exclusivity seems like a worthy trade-off in update systems
  • 9. 9 State Transitions To From NP I E S M NP 0 0 1.25 0.96 1.68 I 0.64 0 0 1.87 0.002 E 0.20 0 14.0 0.02 1.00 S 0.42 2.5 0 134.7 2.24 M 2.63 0.002 0 2.3 843.6 To From NP I E S M NP -- -- BusRd BusRd BusRdX I -- -- BusRd BusRd BusRdX E -- -- -- -- -- S -- -- Not possible -- BusUpgr M BusWB BusWB Not possible BusWB -- State transitions per 1000 data memory references for Ocean Bus actions for each state transition NP – Not Present
  • 10. 10 Basic Implementation • Assume single level of cache, atomic bus transactions • It is simpler to implement a processor-side cache controller that monitors requests from the processor and a bus-side cache controller that services the bus • Both controllers are constantly trying to read tags  tags can be duplicated (moderate area overhead)  unlike data, tags are rarely updated  tag updates stall the other controller
  • 11. 11 Reporting Snoop Results • Uniprocessor system: initiator places address on bus, all devices monitor address, one device acks by raising a wired-OR signal, data is transferred • In a multiprocessor, memory has to wait for the snoop result before it chooses to respond – need 3 wired-OR signals: (i) indicates that a cache has a copy, (ii) indicates that a cache has a modified copy, (iii) indicates that the snoop has not completed • Ensuring timely snoops: the time to respond could be fixed or variable (with the third wired-OR signal), or the memory could track if a cache has a block in M state
  • 12. 12 Non-Atomic State Transitions • Note that a cache controller’s actions are not all atomic: tag look-up, bus arbitration, bus transaction, data/tag update • Consider this: block A in shared state in P1 and P2; both issue a write; the bus controllers are ready to issue an upgrade request and try to acquire the bus; is there a problem? • The controller can keep track of additional intermediate states so it can react to bus traffic (e.g. SM, IM, IS,E) • Alternatively, eliminate upgrade request; use the shared wire to suppress memory’s response to an exclusive-rd
  • 13. 13 Livelock • Livelock can happen if the processor-cache handshake is not designed correctly • Before the processor can attempt the write, it must acquire the block in exclusive state • If all processors are writing to the same block, one of them acquires the block first – if another exclusive request is seen on the bus, the cache controller must wait for the processor to complete the write before releasing the block -- else, the processor’s write will fail again because the block would be in invalid state
  • 14. 14 Split Transaction Bus • What would it take to implement the protocol correctly while assuming a split transaction bus? • Split transaction bus: a cache puts out a request, releases the bus (so others can use the bus), receives its response much later • Assumptions:  only one request per block can be outstanding  separate lines for addr (request) and data (response)
  • 15. 15 Split Transaction Bus Proc 1 Cache Proc 2 Cache Proc 3 Cache Request lines Response lines
  • 16. 16 Design Issues • When does the snoop complete? What if the snoop takes a long time? • What if the buffer in a processor/memory is full? When does the buffer release an entry? Are the buffers identical? • How does each processor ensure that a block does not have multiple outstanding requests? • What determines the write order – requests or responses?
  • 17. 17 Design Issues II • What happens if a processor is arbitrating for the bus and witnesses another bus transaction for the same address? • If the processor issues a read miss and there is already a matching read in the request table, can we reduce bus traffic?