SlideShare a Scribd company logo
1 of 21
CS 6290
Many-core & Interconnect
Milos Prvulovic
Fall 2007
Interconnection Networks
• Classification: Shared Medium or Switched
Shared Media Networks
• Need arbitration to decide who gets to talk
• Arbitration can be centralized or distributed
• Centralized not used much for networks
– Special arbiter device (or must elect arbiter)
– Good performance if arbiter far away? Nah.
• Distributed arbitration
– Check if media already used (carrier sensing)
– If media not used now, start sending
– Check if another also sending (collision detection)
– If collision, wait for a while and retry
• “For a while” is random (otherwise collisions repeat forever)
• Exponential back-off to avoid wasting bandwidth on collisions
Switched Networks
• Need switches
– Introduces switching overheads
• No time wasted on arbitration and collisions
• Multiple transfers can be in progress
– If they use different links, of course
• Circuit or Packet Switching
– Circuit switching: end-to-end connections
• Reserves links for a connection (e.g. phone network)
– Packet switching: each packet routed separately
• Links used only when data transferred (e.g. Internet Protocol)
Routing
• Shared media has trivial routing (broadcast)
• In switched media we can have
– Source-based (source specifies route)
– Virtual circuits (end-to-end route created)
• When connection made, set up route
• Switches forward packets along the route
– Destination-based (source specifies destination)
• Switches must route packet toward destination
• Also can be classified into
– Deterministic (one route from a source to a destination)
– Adaptive (different routes can be used)
Routing Methods for Switches
• Store-and-Forward
– Switch receives entire packet, then forwards it
– If error occurs when forwarding, switch can re-send
• Wormhole routing
– Packet consists of flits (a few bytes each)
– First flit contains header w/ destination address
– Switch gets header, decides where to forward
– Other flits forwarded as they arrive
– Looks like packet worming through network
– If an error occurs along the way, sender must re-send
• No switch has the entire packet to re-send it
Cut-Through Routing
• What happens when link busy?
– Header arrives to switch, but outgoing link busy
– What do we do with the other flits of the packet?
• Wormhole routing: stop the tail when head stops
– Now each flit along the way blocks the a link
– One busy link creates other busy links => traffic jam
• Cut-Through Routing
– If outgoing link busy, receive and buffer incoming flits
– The buffered flits stay there until link becomes free
– When link free, the flits start worming out of the switch
– Need packet-sized buffer space in each switch
• Wormhole Routing switch needs to buffer only one flit
Routing: Network Latency
• Switch Delay
– Time from incoming to outgoing link in a switch
• Switches
– Number of switches along the way
• Transfer time
– Time to send the packet through a link
• Store-and-Forward end-to-end transfer time
– (Switches*SwitchDelay)+(TransferTime*(Switches+1))
• Wormhole or Cut-Through end-to-end transfer time
– (Switches*SwitchDelay) + TransferTime
– Much better if there are many switches along the way
– See the example on page 811
Switch Technology
• What do we want in a switch
– Many input and output links
• Usually number of input and output links the same
– Low contention inside the switch
• Best if there is none (only external links cause contention)
– Short switching delay
• Crossbar
– Very low switching delay, no internal contention
– Complexity grows as square of number of links
• Can not have too many links (e.g. up to 64 in and 64 out)
Switch Technology
• What do we want in a switch
– Many input and output links
• Usually number of input and output links the same
– Low contention inside the switch
• Best if there is none (only external links cause contention)
– Short switching delay
• Crossbar
– Very low switching delay, no internal contention
– Complexity grows as square of number of links
• Can not have too many links (e.g. up to 64 in and 64 out)
• Omega Network
– Build switches with more ports using small crossbars
– Lower complexity per link, but longer delay and more contention
Switch Technology
Network Topology
Network Topology
• What do we want in a network topology
– Many nodes, high bandwidth, low contention, low
latency
– Low latency: few switches along any route
• For each (src, dest) pair, we choose shortest route
• Longest such route over all (src,dst) pairs: network diameter
• We want networks with small diameter!
– Low contention: high aggregate bandwidth
• Divide network into two groups, each with half the nodes
• Total bandwidth between groups is bisection bandwidth
• Actually, we use the minimum over all such bisections
On-Chip Networks
• We’ll have many cores on-chip
– Need switched network to provide bandwidth
• Need to map well onto chip surface
– E.g. hypercube is not great
• Mesh or grid should work well, torus OK too
– Limited ports per switch (CPU & 4 neighbors)
– All links short (going to neighbors)
– Many parallel algorithms map well onto grids
• Matrices, grids, etc.
Trouble with shared caches
• Private caches OK
– Each placed with its own processor
• We want a shared cache, too
– Fits more data than if broken into private
caches
• Private caches replicate data
– Dynamically shared
• Threads that need more space get more space
• But how do we make a shared cache fast
Trouble with shared caches
• Private caches OK
– Each placed with its own processor
• We want a shared cache, too
– Fits more data than if broken into private
caches
• Private caches replicate data
– Dynamically shared
• Threads that need more space get more space
• But how do we make a shared cache fast
Non-uniform Cache Arch. (NUCA)
Bank
Switch
CPU
Request 0x….3
Request 0x….C
S-NUCA Perormance
• Fast access to nearby banks
• Slow access to far-away banks
• Average better than worst-case
CPU
D-NUCA Solution
A B
D-NUCA Perormance
• Fast access to nearby banks
• Slow access to far-away banks
• Average much better than worst-case
• But we keep moving blocks
– Lots of power-hungry activity
• Need smart policies for block migration
– Move blocks less frequently
– But get most of the benefit of being able to
move
D-NUCA Issues
• Blocks keep moving, how do we find them?
• One solution: Use an on-chip directory!
– Use direct mapping to assign a home bank
– If we don’t know where the block is,
ask the home bank
– If we move the block, tell the home bank
– If we think we know where the block is,
look there. If it’s been moved, ask home bank

More Related Content

Similar to unit1.ppt

Unit 4_Network Layer_Part II.pptx
Unit 4_Network Layer_Part II.pptxUnit 4_Network Layer_Part II.pptx
Unit 4_Network Layer_Part II.pptxHODElex
 
Routing algorithm
Routing algorithmRouting algorithm
Routing algorithmBushra M
 
Chapter 2 Switches in network.ppt
Chapter 2 Switches in network.pptChapter 2 Switches in network.ppt
Chapter 2 Switches in network.pptmonikarawat57
 
layer2-network-design.ppt
layer2-network-design.pptlayer2-network-design.ppt
layer2-network-design.pptPatrick Theuri
 
Topology,Switching and Routing
Topology,Switching and RoutingTopology,Switching and Routing
Topology,Switching and RoutingAnushiya Ram
 
chapter4-processes nd processors in DS.ppt
chapter4-processes nd processors in DS.pptchapter4-processes nd processors in DS.ppt
chapter4-processes nd processors in DS.pptaakarshsiwani1
 
Module 1 Introduction to Computer Networks.pptx
Module 1 Introduction to Computer Networks.pptxModule 1 Introduction to Computer Networks.pptx
Module 1 Introduction to Computer Networks.pptxAASTHAJAJOO
 
Networkprotocolstructurescope 130719081246-phpapp01
Networkprotocolstructurescope 130719081246-phpapp01Networkprotocolstructurescope 130719081246-phpapp01
Networkprotocolstructurescope 130719081246-phpapp01Gaurav Goyal
 
Network protocol structure scope
Network protocol structure scopeNetwork protocol structure scope
Network protocol structure scopeSanat Maharjan
 
12 ipt 0303 transmitting and receiving
12 ipt 0303   transmitting and receiving12 ipt 0303   transmitting and receiving
12 ipt 0303 transmitting and receivingctedds
 
HowTheInternetWorks.ppt
HowTheInternetWorks.pptHowTheInternetWorks.ppt
HowTheInternetWorks.pptPrakhar Pandey
 
layer2-network-design.ppt
layer2-network-design.pptlayer2-network-design.ppt
layer2-network-design.pptVimalMallick
 

Similar to unit1.ppt (20)

Unit 4_Network Layer_Part II.pptx
Unit 4_Network Layer_Part II.pptxUnit 4_Network Layer_Part II.pptx
Unit 4_Network Layer_Part II.pptx
 
Routing algorithm
Routing algorithmRouting algorithm
Routing algorithm
 
Chapter 2 Switches in network.ppt
Chapter 2 Switches in network.pptChapter 2 Switches in network.ppt
Chapter 2 Switches in network.ppt
 
ITFT_Switching
ITFT_Switching ITFT_Switching
ITFT_Switching
 
layer2-network-design.ppt
layer2-network-design.pptlayer2-network-design.ppt
layer2-network-design.ppt
 
Topology,Switching and Routing
Topology,Switching and RoutingTopology,Switching and Routing
Topology,Switching and Routing
 
chapter4-processes nd processors in DS.ppt
chapter4-processes nd processors in DS.pptchapter4-processes nd processors in DS.ppt
chapter4-processes nd processors in DS.ppt
 
Module 1 Introduction to Computer Networks.pptx
Module 1 Introduction to Computer Networks.pptxModule 1 Introduction to Computer Networks.pptx
Module 1 Introduction to Computer Networks.pptx
 
Switching
SwitchingSwitching
Switching
 
Networkprotocolstructurescope 130719081246-phpapp01
Networkprotocolstructurescope 130719081246-phpapp01Networkprotocolstructurescope 130719081246-phpapp01
Networkprotocolstructurescope 130719081246-phpapp01
 
Network protocol structure scope
Network protocol structure scopeNetwork protocol structure scope
Network protocol structure scope
 
Chapter 4
Chapter 4Chapter 4
Chapter 4
 
12 ipt 0303 transmitting and receiving
12 ipt 0303   transmitting and receiving12 ipt 0303   transmitting and receiving
12 ipt 0303 transmitting and receiving
 
HowTheInternetWorks.ppt
HowTheInternetWorks.pptHowTheInternetWorks.ppt
HowTheInternetWorks.ppt
 
Routing Protocols
Routing ProtocolsRouting Protocols
Routing Protocols
 
layer2-network-design.ppt
layer2-network-design.pptlayer2-network-design.ppt
layer2-network-design.ppt
 
Switch networking
Switch networking Switch networking
Switch networking
 
Unit2.2
Unit2.2Unit2.2
Unit2.2
 
Networking.ppt
Networking.pptNetworking.ppt
Networking.ppt
 
Dcn lecture 3
Dcn lecture 3Dcn lecture 3
Dcn lecture 3
 

More from MsRAMYACSE

More from MsRAMYACSE (8)

plant ppt.pptx
plant ppt.pptxplant ppt.pptx
plant ppt.pptx
 
Unit 2-Design Patterns.ppt
Unit 2-Design Patterns.pptUnit 2-Design Patterns.ppt
Unit 2-Design Patterns.ppt
 
Unit 2.ppt
Unit 2.pptUnit 2.ppt
Unit 2.ppt
 
unit-4.ppt
unit-4.pptunit-4.ppt
unit-4.ppt
 
Unit 1.ppt
Unit 1.pptUnit 1.ppt
Unit 1.ppt
 
garden.pptx
garden.pptxgarden.pptx
garden.pptx
 
organ.pptx
organ.pptxorgan.pptx
organ.pptx
 
xen.pptx
xen.pptxxen.pptx
xen.pptx
 

Recently uploaded

SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 

Recently uploaded (20)

SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 

unit1.ppt

  • 1. CS 6290 Many-core & Interconnect Milos Prvulovic Fall 2007
  • 3. Shared Media Networks • Need arbitration to decide who gets to talk • Arbitration can be centralized or distributed • Centralized not used much for networks – Special arbiter device (or must elect arbiter) – Good performance if arbiter far away? Nah. • Distributed arbitration – Check if media already used (carrier sensing) – If media not used now, start sending – Check if another also sending (collision detection) – If collision, wait for a while and retry • “For a while” is random (otherwise collisions repeat forever) • Exponential back-off to avoid wasting bandwidth on collisions
  • 4. Switched Networks • Need switches – Introduces switching overheads • No time wasted on arbitration and collisions • Multiple transfers can be in progress – If they use different links, of course • Circuit or Packet Switching – Circuit switching: end-to-end connections • Reserves links for a connection (e.g. phone network) – Packet switching: each packet routed separately • Links used only when data transferred (e.g. Internet Protocol)
  • 5. Routing • Shared media has trivial routing (broadcast) • In switched media we can have – Source-based (source specifies route) – Virtual circuits (end-to-end route created) • When connection made, set up route • Switches forward packets along the route – Destination-based (source specifies destination) • Switches must route packet toward destination • Also can be classified into – Deterministic (one route from a source to a destination) – Adaptive (different routes can be used)
  • 6. Routing Methods for Switches • Store-and-Forward – Switch receives entire packet, then forwards it – If error occurs when forwarding, switch can re-send • Wormhole routing – Packet consists of flits (a few bytes each) – First flit contains header w/ destination address – Switch gets header, decides where to forward – Other flits forwarded as they arrive – Looks like packet worming through network – If an error occurs along the way, sender must re-send • No switch has the entire packet to re-send it
  • 7. Cut-Through Routing • What happens when link busy? – Header arrives to switch, but outgoing link busy – What do we do with the other flits of the packet? • Wormhole routing: stop the tail when head stops – Now each flit along the way blocks the a link – One busy link creates other busy links => traffic jam • Cut-Through Routing – If outgoing link busy, receive and buffer incoming flits – The buffered flits stay there until link becomes free – When link free, the flits start worming out of the switch – Need packet-sized buffer space in each switch • Wormhole Routing switch needs to buffer only one flit
  • 8. Routing: Network Latency • Switch Delay – Time from incoming to outgoing link in a switch • Switches – Number of switches along the way • Transfer time – Time to send the packet through a link • Store-and-Forward end-to-end transfer time – (Switches*SwitchDelay)+(TransferTime*(Switches+1)) • Wormhole or Cut-Through end-to-end transfer time – (Switches*SwitchDelay) + TransferTime – Much better if there are many switches along the way – See the example on page 811
  • 9. Switch Technology • What do we want in a switch – Many input and output links • Usually number of input and output links the same – Low contention inside the switch • Best if there is none (only external links cause contention) – Short switching delay • Crossbar – Very low switching delay, no internal contention – Complexity grows as square of number of links • Can not have too many links (e.g. up to 64 in and 64 out)
  • 10. Switch Technology • What do we want in a switch – Many input and output links • Usually number of input and output links the same – Low contention inside the switch • Best if there is none (only external links cause contention) – Short switching delay • Crossbar – Very low switching delay, no internal contention – Complexity grows as square of number of links • Can not have too many links (e.g. up to 64 in and 64 out) • Omega Network – Build switches with more ports using small crossbars – Lower complexity per link, but longer delay and more contention
  • 13. Network Topology • What do we want in a network topology – Many nodes, high bandwidth, low contention, low latency – Low latency: few switches along any route • For each (src, dest) pair, we choose shortest route • Longest such route over all (src,dst) pairs: network diameter • We want networks with small diameter! – Low contention: high aggregate bandwidth • Divide network into two groups, each with half the nodes • Total bandwidth between groups is bisection bandwidth • Actually, we use the minimum over all such bisections
  • 14. On-Chip Networks • We’ll have many cores on-chip – Need switched network to provide bandwidth • Need to map well onto chip surface – E.g. hypercube is not great • Mesh or grid should work well, torus OK too – Limited ports per switch (CPU & 4 neighbors) – All links short (going to neighbors) – Many parallel algorithms map well onto grids • Matrices, grids, etc.
  • 15. Trouble with shared caches • Private caches OK – Each placed with its own processor • We want a shared cache, too – Fits more data than if broken into private caches • Private caches replicate data – Dynamically shared • Threads that need more space get more space • But how do we make a shared cache fast
  • 16. Trouble with shared caches • Private caches OK – Each placed with its own processor • We want a shared cache, too – Fits more data than if broken into private caches • Private caches replicate data – Dynamically shared • Threads that need more space get more space • But how do we make a shared cache fast
  • 17. Non-uniform Cache Arch. (NUCA) Bank Switch CPU Request 0x….3 Request 0x….C
  • 18. S-NUCA Perormance • Fast access to nearby banks • Slow access to far-away banks • Average better than worst-case
  • 20. D-NUCA Perormance • Fast access to nearby banks • Slow access to far-away banks • Average much better than worst-case • But we keep moving blocks – Lots of power-hungry activity • Need smart policies for block migration – Move blocks less frequently – But get most of the benefit of being able to move
  • 21. D-NUCA Issues • Blocks keep moving, how do we find them? • One solution: Use an on-chip directory! – Use direct mapping to assign a home bank – If we don’t know where the block is, ask the home bank – If we move the block, tell the home bank – If we think we know where the block is, look there. If it’s been moved, ask home bank