Outline
• Non-Uniform Cache Architecture (NUCA)
• Cache Coherence
• Implementation of directories in multicore
architecture
Non-Uniform Cache Architecture [1]
• Uniform Cache Architecture
▫ Multi-level cache hierarchies
▪ Organized into a few discrete levels
▪ Each level reduces access to the lower level
▪ Inclusion overhead
▪ Internal wire delays
▪ Restricted number of ports
▫ Large on-chip cache
▪ Single, discrete hit latency
▪ Undesirable due to increasing wire delays
Non-Uniform Cache Architecture [1]
• Non-uniform cache architecture (NUCA)
▫ Exploit non-uniformity
▪ In a large cache, data closer to the processor is accessed
faster than data residing physically farther away
Level 2 cache architectures, 16 MB in 50 nm technology (taken from [1])
Non-Uniform Cache Architecture [1]
• Static NUCA
▫ Different banks can be accessed at different speeds
▪ Proportional to the distance from the controller
▪ Lower latency when closer to the controller
▫ Mapping of data into banks based on block index
▫ Banks are independently addressable
▫ Accesses to banks may proceed in parallel
• Banks have private channels
▫ Large number of wires
▫ Access time and routing delay increase as technology scales down
▪ Best organization at smaller technologies uses larger
banks
Non-Uniform Cache Architecture [1]
Static NUCA design (taken from [1])
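As a rough illustration of the static mapping described above, the sketch below selects a bank from the low-order bits of the block index and models latency as a fixed bank access time plus a wire delay that grows with distance from the controller. The block size, bank count, and cycle figures are illustrative assumptions, not values from [1].

```python
# Sketch of static NUCA (S-NUCA) bank selection; all constants are assumed.
BLOCK_BYTES = 64          # assumed cache block size
NUM_BANKS = 32            # assumed number of independently addressable banks
BANK_ACCESS_CYCLES = 3    # assumed access time of a single bank
WIRE_CYCLES_PER_BANK = 1  # assumed extra wire delay per bank of distance


def bank_of(address: int) -> int:
    """Map an address to a bank using the low-order bits of its block index."""
    block_index = address // BLOCK_BYTES
    return block_index % NUM_BANKS


def access_latency(bank: int) -> int:
    """Latency grows with the bank's distance from the cache controller.

    Banks are assumed to be numbered by distance, with bank 0 the closest.
    """
    return BANK_ACCESS_CYCLES + WIRE_CYCLES_PER_BANK * bank


if __name__ == "__main__":
    for addr in (0x0000, 0x0040, 0x07C0, 0x0800):
        b = bank_of(addr)
        print(f"addr {addr:#06x} -> bank {b:2d}, latency {access_latency(b)} cycles")
```

Because the mapping is fixed, a frequently used block that happens to map to a distant bank always pays the longer latency.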
Non-Uniform Cache Architecture [1]
• Switched Static NUCA
▫ 2D Mesh, point-to-point links
▫ Removes most of the large number of wires
▫ Allows a large number of faster, smaller banks
• Dynamic NUCA
▫ Allows data to be mapped to many banks
▫ Allows data to migrate among the banks
▫ Frequently used data can be promoted to faster
banks
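The migration idea can be sketched with one simple policy, assumed here for illustration: on a hit, the block swaps places with the occupant of the next-closer bank, so repeatedly used data drifts toward the processor. The function name and the miss handling (insert into the farthest bank) are hypothetical choices, not a description of the design in [1].

```python
# Sketch of dynamic NUCA (D-NUCA) migration within one bank set.
# Index 0 is the bank closest to the processor; None marks an empty bank.

def access(bank_set: list, block: str) -> int:
    """Look up `block` and promote it one bank closer on a hit.

    Returns the index of the bank where the block was found, or -1 on a miss.
    On a miss the block is placed in the farthest bank, replacing its occupant.
    """
    if block in bank_set:
        pos = bank_set.index(block)
        if pos > 0:
            # Swap with the occupant of the next-closer bank.
            bank_set[pos - 1], bank_set[pos] = bank_set[pos], bank_set[pos - 1]
        return pos
    bank_set[-1] = block
    return -1


if __name__ == "__main__":
    banks = ["A", "B", "C", None]
    for name in ("D", "D", "D", "A"):
        found_at = access(banks, name)
        print(f"access {name}: found at {found_at}, banks now {banks}")
```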
Non-Uniform Cache Architecture [1]
Switched NUCA design (taken from [1])
Non-Uniform Cache Architecture [2]
• Policies
▫ Bank placement policy
▪ Where data is placed in the NUCA cache memory
▫ Bank access policy
▪ Determines the bank-searching algorithm
▫ Bank migration policy
▪ Determines whether a data element is allowed to move
from one bank to another
▪ Regulates migration of data
▫ Bank replacement policy
▪ How the NUCA behaves when data is evicted from
one of the banks
Taken from [2]
Non-Uniform Cache Architecture [2]
Cache Coherence
• Cache-coherence problem
• Support for large number of processors
▫ Need for high bandwidth
▫ Bus architecture insufficient
• Point-to-Point networks
▫ No broadcast mechanism
▫ Snooping protocol unusable
• Directory
▫ Solution for point-to-point networks
▫ Stores location of cache copies of blocks of data
▫ Centralized or distributed
Implementation of directories in
multicore architectures [3]
• DRAM (off-chip) directory
▫ Stores directory information in DRAM
▪ Ex: full-map protocol
▫ Does not exploit distance locality
▫ Treats each tile as a potential sharer of data
▫ Directory can be cached in on-chip SRAM
▪ No need to access off-chip memory each time
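A full-map directory keeps one presence bit per tile for every block, which is why each tile is treated as a potential sharer. The sketch below shows that bookkeeping under simplifying assumptions: the class name, the method names, and the single "owner" field standing in for a modified state are all hypothetical, and no real protocol messages or storage layout are modeled.

```python
# Sketch of a full-map directory: one presence bit per tile for each block.
# Class and method names are hypothetical.

class FullMapDirectory:
    def __init__(self, num_tiles: int):
        self.num_tiles = num_tiles
        self.sharers = {}  # block address -> presence bit vector (int)
        self.owner = {}    # block address -> tile holding a modified copy

    def read(self, block: int, tile: int) -> list:
        """Record `tile` as a sharer; return tiles that must downgrade."""
        downgrade = []
        if block in self.owner and self.owner[block] != tile:
            downgrade.append(self.owner.pop(block))
        self.sharers[block] = self.sharers.get(block, 0) | (1 << tile)
        return downgrade

    def write(self, block: int, tile: int) -> list:
        """Make `tile` the exclusive owner; return tiles to invalidate."""
        bits = self.sharers.get(block, 0) & ~(1 << tile)
        invalidate = [t for t in range(self.num_tiles) if (bits >> t) & 1]
        self.sharers[block] = 1 << tile
        self.owner[block] = tile
        return invalidate


if __name__ == "__main__":
    d = FullMapDirectory(num_tiles=4)
    d.read(0x100, 0)
    d.read(0x100, 2)
    print("invalidate on write by tile 1:", d.write(0x100, 1))
```

The bit vector grows linearly with the number of tiles, which is the storage cost that motivates the alternative directory organizations on the following slides.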
Implementation of directories in
multicore architectures [3]
Taken from [3]
Implementation of directories in
multicore architecture [4]
• DRAM (off-chip) directory with directory caches
▫ Private cache
▫ Directory is cached in each tile
▪ No need to access off-chip memory each time
▪ Non-coherent caches
▪ Home node for any given cache line
▪ Different range of memory addresses for each tile
▫ Directory controller in each tile
▪ Controls coherency between private caches
Implementation of directories in
multicore architecture [4]
Taken from [4]
Implementation of directories in
multicore architectures [3]
• Duplicate tag directory
▫ Directory centrally located in SRAM
▫ Connected to individual cores
▫ Exact duplicate tag store
▪ Directory state for a block is determined by examining
the copied tags of every cache that can hold the
block
▪ Copied tags must be kept up to date
▫ No need to read state from DRAM
▫ Challenging as the number of cores increases
▪ 64 cores × 16-way associative caches = 1024-way aggregate
associativity across all tiles
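The scaling problem can be made concrete with a small sketch: every lookup has to compare the requested tag against the mirrored tags of the matching set in every core's cache, so the number of comparisons is cores times ways. The data layout (`dup_tags[core][set_index]` as a Python set) and the tiny set count are assumptions made to keep the example short; only the 64-core, 16-way figure comes from the slide.

```python
# Sketch of a duplicate-tag directory lookup. Geometry values are assumed
# except for the 64-core, 16-way example taken from the slide.

NUM_SETS = 4  # assumed small number of sets to keep the example readable


def aggregate_associativity(num_cores: int, ways: int) -> int:
    """Number of tags that must be compared on every directory lookup."""
    return num_cores * ways


def find_sharers(dup_tags: list, block: int) -> list:
    """Return the cores whose duplicated tag array holds `block`.

    dup_tags[core][set_index] is the set of tags mirrored from that core's cache.
    """
    set_index = block % NUM_SETS
    tag = block // NUM_SETS
    return [core for core, sets in enumerate(dup_tags) if tag in sets[set_index]]


if __name__ == "__main__":
    dup = [[set() for _ in range(NUM_SETS)] for _ in range(3)]
    dup[0][13 % NUM_SETS].add(13 // NUM_SETS)  # core 0 caches block 13
    dup[2][13 % NUM_SETS].add(13 // NUM_SETS)  # core 2 caches block 13
    print("sharers of block 13:", find_sharers(dup, 13))
    print("tags compared per lookup:", aggregate_associativity(64, 16))
```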
Implementation of directories in
multicore architectures [3]
Taken from [3]
Implementation of directories in
multicore architecture [5]
Directory memory, 4-way associative caches (taken from [5])
Implementation of directories in
multicore architectures [3]
• Static cache bank directory
▫ Directory distributed among the tiles
▪ Block address is mapped to a tile (called the home tile)
▪ Home tiles selected by simple interleaving
▪ Location can be sub-optimal (see next slide)
▪ Tile’s cache extended to contain directory
information
▪ Integrates directory states with cache tags
▪ Avoids a separate SRAM or DRAM directory
Implementation of directories in
multicore architectures [3,6]
Taken from [3]
Taken from [6]
Implementation of directories in
multicore architecture [7]
• SGI Origin2000 multiprocessor system
▫ Directory memory connected to on-chip memory
▪ Shared L2 cache
▪ Directory memory distributed over multiple tiles
▪ Cache coherence controller
▪ Home tile sends appropriate messages to cores
Implementation of directories in
multicore architecture [7]
SGI Origin2000 multiprocessor system (taken from [7])
Implementation of directories in
multicore architecture [8]
• Tilera Tile64 architecture
▫ 2D mesh network (8×8)
▫ Provides a coherent shared-memory environment
▫ Uses neighborhood caching
▪ Provides on-chip distributed shared cache
▫ Coherency is maintained at the home tile
▪ Data is not cached at non-home tiles
▫ Communication over a Tile Dynamic Network
Implementation of directories in
multicore architecture [9]
Tilera Tile64 (taken from)
References
• [1] C. Kim, D. Burger, S.W. Keckler, “An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches”, in Proc. 10th Int. Conf. ASPLOS, San Jose, CA, 2002, pp. 1-12
• [2] J. Lira, C. Molina, A. Gonzalez, “Analysis of Non-Uniform Cache Architecture Policies for Chip-Multiprocessors Using the Parsec Benchmark Suite”, MMCS’09, Mar. 2009, pp. 1-8
• [3] M.R. Marty, M.D. Hill, “Virtual Hierarchies to Support Server Consolidation”, ISCA’07, June 2007, pp. 1-11
• [4] J.A. Brown, R. Kumar, D. Tullsen, “Proximity-Aware Directory-based Coherence for Multi-core Processor Architectures”, SPAA’07, June 2007, pp. 1-9
• [5] J. Chang, G.S. Sohi, “Cooperative Caching for Chip Multiprocessors”, Computer Architecture, ISCA '06. 33rd International Symposium on, 2006, pp. 264-276
• [6] S. Cho, L. Jin, “Managing Distributed, Shared L2 Caches through OS-Level Page Allocation”, Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on, Dec. 2006, pp. 455-468
• [7] H. Lee, S. Cho, B.R. Childers, “PERFECTORY: A Fault-Tolerant Directory Memory Architecture”, Computers, IEEE Transactions on, vol. 59, no. 5, May 2010, pp. 638-650
• [8] D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C.C. Miao, J.F. Brown, A. Agarwal, “On-Chip Interconnection Architecture of the Tile Processor”, Micro, IEEE, vol. 27, no. 5, Sept.-Oct. 2007, pp. 15-31
• [9] Linux Devices, “4-way chip gains Linux IDE, dev cards, design wins” [online], Linux Devices, Apr. 2008 [cited Oct. 21, 2010], available from World Wide Web: <http://thing1.linuxdevices.com/news/NS4811855366.html>
Monitoring Java Application Security with JDK Tools and JFR Events
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 

Directory based cache coherence

  • 1. Outline
      • Non-Uniform Cache Architecture (NUCA)
      • Cache Coherence
      • Implementation of directories in multicore architecture
  • 2. Non-Uniform Cache Architecture [1]
      • Uniform Cache Architecture
        ▫ Multi-level cache hierarchies
           Organized into a few discrete levels
           Each level reduces access to the lower level
           Inclusion overhead
           Internal wire delays
           Restricted number of ports
        ▫ Large on-chip cache
           Single and discrete hit latency
           Undesirable due to increasing wire delays
  • 3. Non-Uniform Cache Architecture [1]
      • Non-uniform cache architecture (NUCA)
        ▫ Exploit non-uniformity
           In a large cache, data residing closer to the processor is accessed faster than data residing physically farther away
      Level 2 cache architectures, 16MB with 50nm technology (taken from [1])
  • 4. Non-Uniform Cache Architecture [1]
      • Static NUCA
        ▫ Each bank can be accessed at different speeds
           Proportional to the distance from the controller
           Lower latency when closer to controller
        ▫ Mapping of data into banks based on block index
        ▫ Banks are independently addressable
        ▫ Access to banks may proceed in parallel
        ▫ Banks have private channels
        ▫ Large number of wires
        ▫ Access time and routing delay increase with time
           Best organization at smaller technologies uses larger banks
  • 5. Non-Uniform Cache Architecture [1]
      Static NUCA design (taken from [1])
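
The static NUCA mapping described above (bank chosen from the block index, latency growing with distance from the controller) can be sketched as follows. This is only an illustration: the block size, bank count, latency constants, and the function names `bank_of` and `bank_latency` are assumptions made here and are not taken from [1].

```python
# Minimal sketch of static NUCA bank selection (all constants are assumed).
BLOCK_BYTES = 64      # assumed cache block size
NUM_BANKS = 32        # assumed number of banks
BASE_LATENCY = 3      # assumed cycles to reach the bank nearest the controller
CYCLES_PER_STEP = 1   # assumed extra cycles per unit of distance


def bank_of(address):
    """Static mapping: the low-order bits of the block index pick the bank."""
    block_index = address // BLOCK_BYTES
    return block_index % NUM_BANKS


def bank_latency(bank):
    """Banks are numbered by increasing distance from the controller,
    so latency grows linearly with the bank number in this toy model."""
    return BASE_LATENCY + CYCLES_PER_STEP * bank
```

Because the mapping is fixed, a block always lives in the same bank, so its hit latency depends only on its address and never on how often it is used.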
  • 6. Non-Uniform Cache Architecture [1]
      • Switched Static NUCA
        ▫ 2D Mesh, point-to-point links
        ▫ Removes most of the large number of wires
        ▫ Allows a large number of faster, smaller banks
      • Dynamic NUCA
        ▫ Allows data to be mapped to many banks
        ▫ Allows data to migrate among the banks
        ▫ Frequently used data can be promoted to faster banks
  • 7. Non-Uniform Cache Architecture [1]
      Switched NUCA design (taken from [1])
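
The promotion idea behind dynamic NUCA can be illustrated with a toy model. The sketch below assumes one possible policy, in which a block moves one bank closer on each hit and a missing block is placed in the farthest bank; the function name `access` and the list representation of a bank set are hypothetical and chosen only for this example.

```python
def access(bank_set, tag):
    """Look up `tag` in a bank set ordered from closest (index 0) to farthest.

    Returns the index where the block was found, or None on a miss.
    On a hit in bank i > 0 the block swaps places with the block in bank i - 1
    (assumed one-step promotion). On a miss the block replaces whatever sits
    in the farthest bank (assumed placement and replacement policy).
    """
    for i, resident in enumerate(bank_set):
        if resident == tag:
            if i > 0:
                bank_set[i - 1], bank_set[i] = bank_set[i], bank_set[i - 1]
            return i
    bank_set[-1] = tag
    return None


banks = ["a", "b", "c", "d"]
access(banks, "c")   # hit in bank 2; banks is now ["a", "c", "b", "d"]
access(banks, "c")   # hit in bank 1; banks is now ["c", "a", "b", "d"]
```

Repeated hits move a block toward the fastest bank, while blocks that stop being used drift outward as others are promoted past them.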
  • 8. Non-Uniform Cache Architecture [2]
      • Policies
        ▫ Bank placement policy
           Where data is placed in the NUCA cache memory
        ▫ Bank access policy
           Determines the bank-searching algorithm
        ▫ Bank migration policy
           Determines if a data element is allowed to change its placement from one bank to another
           Regulates migration of data
        ▫ Bank replacement policy
           How NUCA behaves when there is a data eviction from one of the banks
  • 9. Non-Uniform Cache Architecture [2]
      Taken from [2]
  • 10. Cache Coherence
      • Cache-coherence problem
      • Support for large number of processors
        ▫ Need for high bandwidth
        ▫ Bus architecture insufficient
      • Point-to-Point networks
        ▫ No broadcast mechanism
        ▫ Snooping protocol unusable
      • Directory
        ▫ Solution for point-to-point networks
        ▫ Stores location of cache copies of blocks of data
        ▫ Centralized or distributed
  • 11. Implementation of directories in multicore architectures [3]
      • DRAM (off-chip) directory
        ▫ Stores directory information in DRAM
           Ex: full-map protocol
        ▫ Does not exploit distance locality
        ▫ Treats each tile as a potential sharer of data
        ▫ Directory can be cached in on-chip SRAM
           No need to access off-chip memory each time
  • 12. Implementation of directories in multicore architectures [3]
      Taken from [3]
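
A full-map directory entry, which treats every tile as a potential sharer, can be sketched as one presence bit per tile plus a dirty flag. The class and method names below are hypothetical, and the two-state handling (shared or dirty) is a simplification assumed for illustration, not the protocol of [3].

```python
class FullMapEntry:
    """Directory state for one memory block (simplified, assumed model)."""

    def __init__(self, num_tiles):
        self.num_tiles = num_tiles
        self.sharers = 0      # bit t set => tile t may hold a copy
        self.dirty = False    # True => exactly one tile holds a modified copy

    def sharer_list(self):
        return [t for t in range(self.num_tiles) if (self.sharers >> t) & 1]

    def record_read(self, tile):
        """Add `tile` as a sharer; return the tiles that must downgrade."""
        if self.dirty:
            owner = self.sharer_list()
            if owner == [tile]:
                return []          # the owner already has the latest data
            self.dirty = False
            self.sharers |= 1 << tile
            return owner
        self.sharers |= 1 << tile
        return []

    def record_write(self, tile):
        """Make `tile` the sole owner; return the tiles to invalidate."""
        to_invalidate = [t for t in self.sharer_list() if t != tile]
        self.sharers = 1 << tile
        self.dirty = True
        return to_invalidate
```

The bit vector grows with the number of tiles, which is the storage cost of tracking every tile as a potential sharer.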
  • 13. Implementation of directories in multicore architecture [4]
      • DRAM (off-chip) directory with directory caches
        ▫ Private cache
        ▫ Directory is cached in each tile
           No need to access off-chip memory each time
           Non-coherent caches
           Home node for any given cache line
           Different range of memory addresses for each tile
        ▫ Directory controller in each tile
           Controls coherency between private caches
  • 14. Implementation of directories in multicore architecture [4]
      Taken from [4]
  • 15. Implementation of directories in multicore architectures [3]
      • Duplicate tag directory
        ▫ Directory centrally located in SRAM
        ▫ Connected to individual cores
        ▫ Exact duplicate tag store
           Directory state for a block is determined by examining the copy of the tags of every possible cache that can hold the block
           Copied tags must be kept up to date
        ▫ No more need to read states from DRAM memory
        ▫ Challenging as the number of cores increases
           64 cores, 16-way associative cache = 1024 aggregate associativity of all tiles
  • 16. Implementation of directories in multicore architectures [3]
      Taken from [3]
  • 17. Implementation of directories in multicore architecture [5]
      Directory memory, 4-way associative caches (taken from [5])
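
The duplicate tag lookup can be sketched as a search over the copied tags of every tile for the set the block maps to. The array layout, the tiny set count, and the helper names `split`, `record_fill`, and `sharers` are assumptions for illustration; only the 64-tile, 16-way figures come from the slide.

```python
NUM_TILES = 64     # from the slide's example
WAYS = 16          # from the slide's example
NUM_SETS = 4       # assumed, kept tiny for illustration
BLOCK_BYTES = 64   # assumed block size

# dup_tags[tile][set][way] mirrors the tag array of that tile's cache.
dup_tags = [[[None] * WAYS for _ in range(NUM_SETS)] for _ in range(NUM_TILES)]


def split(address):
    """Return (set index, tag) for an address."""
    block = address // BLOCK_BYTES
    return block % NUM_SETS, block // NUM_SETS


def record_fill(tile, way, address):
    """Keep the copied tags up to date when a tile fills a cache line."""
    set_index, tag = split(address)
    dup_tags[tile][set_index][way] = tag


def sharers(address):
    """Directory state is derived by checking every tile's copied tags."""
    set_index, tag = split(address)
    return [t for t in range(NUM_TILES) if tag in dup_tags[t][set_index]]


# Each lookup compares against NUM_TILES * WAYS tags.
AGGREGATE_ASSOCIATIVITY = NUM_TILES * WAYS
```

With 64 tiles and 16 ways, one lookup touches 1024 tags, which is why this organization becomes challenging as the core count grows.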
  • 18. Implementation of directories in multicore architectures [3]
      • Static cache bank directory
        ▫ Distributed directory among the tiles
           Mapping block address to a tile (called the home tile)
           Home tiles selected by simple interleaving
           Location can be sub-optimal (see next slide)
           Tile’s cache extended to contain directory information
           Integrates directory states with cache tags
           Avoids SRAM or DRAM separate directory
  • 19. Implementation of directories in multicore architectures [3,6]
      Taken from [3]
      Taken from [6]
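
Home tile selection by simple interleaving can be sketched in a few lines. The block size, tile count, and the name `home_tile` are assumptions for illustration.

```python
BLOCK_BYTES = 64   # assumed block size
NUM_TILES = 64     # assumed tile count


def home_tile(address):
    """Interleave consecutive blocks across tiles by block number."""
    return (address // BLOCK_BYTES) % NUM_TILES
```

The home tile depends only on the address, not on which tile uses the block, so a block accessed exclusively by one tile may still have its directory state on the far side of the chip. This is the sub-optimal location noted above.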
  • 20. Implementation of directories in multicore architecture [7]
      • SGI Origin2000 multiprocessor system
        ▫ Directory memory connected to on-chip memory
           Shared L2 cache
           Directory memory distributed over multiple tiles
           Cache coherence controller
           Home tile sends appropriate messages to cores
  • 21. Implementation of directories in multicore architecture [7]
      SGI Origin2000 multiprocessor system (taken from [7])
  • 22. Implementation of directories in multicore architecture [8]
      • Tilera Tile64 architecture
        ▫ 2D mesh network (8x8)
        ▫ Provides coherent shared-memory environment
        ▫ Uses neighborhood caching
           Provides on-chip distributed shared cache
        ▫ Coherency is maintained at the home tile
           Data is not cached at non-home tiles
        ▫ Communication over a Tile Dynamic Network
  • 23. Implementation of directories in multicore architecture [9]
      Tilera Tile64 (taken from [9])
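
To give a feel for the cost of reaching a home tile on an 8x8 mesh, the sketch below counts hops between two tiles. It assumes row-major tile numbering and a route that travels along one dimension and then the other, so the hop count equals the Manhattan distance; the function names are hypothetical and this is not a description of the actual Tile64 routing hardware.

```python
MESH_WIDTH = 8   # 8x8 mesh, as on the slide


def coords(tile):
    """Assumed row-major numbering: tile 0 is top-left, tile 63 bottom-right."""
    return divmod(tile, MESH_WIDTH)


def hops(src, dst):
    """Manhattan distance between two tiles on the mesh."""
    src_row, src_col = coords(src)
    dst_row, dst_col = coords(dst)
    return abs(src_row - dst_row) + abs(src_col - dst_col)
```

Under these assumptions, a request from one corner tile to a home tile in the opposite corner crosses 14 links, while a request to an adjacent tile crosses one.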
  • 24. References
      • [1] C. Kim, D. Burger, S.W. Keckler, “An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches”, in Proc. 10th Int. Conf. ASPLOS, San Jose, CA, 2002, pp. 1-12
      • [2] J. Lira, C. Molina, A. Gonzalez, “Analysis of Non-Uniform Cache Architecture Policies for Chip-Multiprocessors Using the Parsec Benchmark Suite”, MMCS’09, Mar. 2009, pp. 1-8
      • [3] M.R. Marty, M.D. Hill, “Virtual Hierarchies to Support Server Consolidation”, ISCA’07, June 2007, pp. 1-11
      • [4] J.A. Brown, R. Kumar, D. Tullsen, “Proximity-Aware Directory-based Coherence for Multi-core Processor Architectures”, SPAA’07, June 2007, pp. 1-9
      • [5] J. Chang, G.S. Sohi, “Cooperative Caching for Chip Multiprocessors”, Computer Architecture, ISCA '06. 33rd International Symposium on, 2006, pp. 264-276
      • [6] S. Cho, L. Jin, “Managing Distributed, Shared L2 Caches through OS-Level Page Allocation”, Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on, Dec. 2006, pp. 455-468
      • [7] H. Lee, S. Cho, B.R. Childers, “PERFECTORY: A Fault-Tolerant Directory Memory Architecture”, Computers, IEEE Transactions on, vol. 59, no. 5, May 2010, pp. 638-650
      • [8] D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C.C. Miao, J.F. Brown, A. Agarwal, “On-Chip Interconnection Architecture of the Tile Processor”, Micro, IEEE, vol. 27, no. 5, Sept.-Oct. 2007, pp. 15-31
      • [9] Linux Devices, “4-way chip gains Linux IDE, dev cards, design wins” [online], Linux Devices, Apr. 2008 [cited Oct. 21 2010], available from World Wide Web: < http://thing1.linuxdevices.com/news/NS4811855366.html >

Editor's Notes

  1. [1] ftp://ftp.cs.utexas.edu/pub/dburger/papers/ASPLOS02.pdf
  2. [2] http://www.cercs.gatech.edu/mmcs09/papers/lira.pdf
  3. [3] http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  4. http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  5. http://cseweb.ucsd.edu/users/tullsen/spaa07.pdf
  6. [4] http://cseweb.ucsd.edu/users/tullsen/spaa07.pdf
  7. http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  8. http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  9. [5] http://pages.cs.wisc.edu/~mscalar/papers/2006/isca2006-coop-caching.pdf
  10. [3] http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  11. 1- http://www.cs.pitt.edu/cast/papers/cho-micro06.pdf 2- http://www.cs.wisc.edu/multifacet/papers/isca07_virtual_hierarchy.pdf
  12. http://www.cs.pitt.edu/cast/papers/lee-tc10.pdf
  13. http://www.cs.pitt.edu/cast/papers/lee-tc10.pdf
  14. [8] http://www.ieeexplore.ieee.org.proxy.bib.uottawa.ca/stamp/stamp.jsp?tp=&arnumber=4378780
  15. [9] http://www.linuxfordevices.com/files/misc/tilera_tile64_arch_diag2.gif