AN ENERGY EFFICIENT CACHE MEMORY
DESIGN USING VERILOG HDL
NAME: DHRITIMAN HALDER
USN: 1RE12LVS05
DEPARTMENT: M.TECH (VLSI DESIGN AND EMBEDDED
SYSTEMS)
SEMESTER: 4TH
COURSE CODE: 12LVS43
SUBJECT CODE: 12EC943
UNDER THE GUIDANCE OF: PROF. PRASAD S.N
CONTENTS
• Introduction to Cache Memory
• Types of Cache Memory
• Cache Read Operation
• Different Cache Mapping Techniques
• Cache Write Operation
• Different Write Policies
• Problem Definition
• Literature Survey
• Proposed Work
• Results
• Conclusion
• Publication
• References
Introduction to Cache Memory
• The processor requires data and instructions while performing a
specific task.
• Data and instructions are stored in main memory.
• A cache memory keeps frequently needed data and instructions close
to the processor to accelerate operation.
Cache and Main Memory
Types of Cache Memory
• Data Cache: A data cache speeds up data fetches and stores; it is
usually organized as a hierarchy of levels (L1, L2, etc.).
• Instruction Cache: An instruction cache speeds up executable
instruction fetches.
• Translation Lookaside Buffer: A translation lookaside buffer
(TLB) speeds up the translation of the virtual addresses of
requested data and instructions into physical addresses.
Cache Read Operation
• Flowchart of Cache Read Operation
Different Cache Mapping Techniques
• Direct Mapping: Each main memory location can be loaded into
only one fixed location in the cache.
– Advantage- No search is required, as there is only one possible
cache location for each main memory location.
– Disadvantage- The hit ratio is poor, as each main memory location
has only one fixed cache location.
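The fixed-location property above can be sketched with a small Python model (illustrative only; the cache size and block size are assumed parameters, not from the report):

```python
# Illustrative model of direct mapping: each byte address has exactly
# one possible cache line, derived arithmetically from the address.
NUM_LINES = 8        # assumed number of cache lines
BLOCK_SIZE = 4       # assumed bytes per block

def direct_map(address):
    """Return (tag, line index, block offset) for a byte address."""
    block_number = address // BLOCK_SIZE
    index = block_number % NUM_LINES   # the single fixed line: no search
    tag = block_number // NUM_LINES    # stored alongside data to detect hits
    offset = address % BLOCK_SIZE
    return tag, index, offset

# Addresses 0x20 and 0xA0 land on the same line (index 0) with
# different tags, so they evict each other -- the poor-hit-ratio
# drawback noted above.
print(direct_map(0x20))  # (1, 0, 0)
print(direct_map(0xA0))  # (5, 0, 0)
```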
Different Cache Mapping Techniques
• Fully Associative Mapping: Any main memory location can be
placed in any location in the cache.
– Advantage- The hit ratio improves, as a main memory location can
occupy any location in the cache.
– Disadvantage- Consumes a lot of energy, because the controller
must search all tag patterns to determine whether the data is present.
Different Cache Mapping Techniques
• Set Associative Mapping: Any main memory location can be
loaded into any of two or more locations (a set) in the cache.
– Advantage- The hit ratio is higher than in a direct-mapped cache,
and energy consumption is lower than in a fully associative cache,
because only a limited number of tag patterns need to be searched.
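The limited tag search can be illustrated with a behavioral sketch (parameters assumed, not from the report): in a 2-way set-associative cache only the tags inside one set are compared, far fewer than a fully associative search over the whole cache.

```python
# Illustrative 2-way set-associative lookup: only WAYS tags are
# compared per access, which is the source of the energy saving.
NUM_SETS = 4         # assumed number of sets
WAYS = 2             # assumed associativity
BLOCK_SIZE = 4       # assumed bytes per block

cache = [[] for _ in range(NUM_SETS)]   # cache[set] = list of stored tags

def lookup(address):
    """Return True on hit; on a miss, fill one of the set's ways."""
    block = address // BLOCK_SIZE
    index, tag = block % NUM_SETS, block // NUM_SETS
    if tag in cache[index]:          # search only WAYS tags, not all lines
        return True
    if len(cache[index]) == WAYS:    # simple FIFO replacement for the sketch
        cache[index].pop(0)
    cache[index].append(tag)
    return False

lookup(0x00)          # cold miss, fills way 0 of set 0
lookup(0x40)          # miss, same set, fills way 1
print(lookup(0x00))   # True: unlike direct mapping, both blocks coexist
```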
Cache Write Operation
• Flowchart of cache write operation
Different Write Policies
• Write-back Policy: In a write-back cache, only the cache is
updated during a write operation, and the line is marked with a dirty
bit. Main memory is updated later, when the data block is replaced.
– Advantage- Consumes less energy during write operations, because
main memory is not updated simultaneously.
– Disadvantage- The write algorithm is complex and often results in
data inconsistency.
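The dirty-bit mechanism can be sketched as follows (an illustrative Python model, not the report's Verilog implementation; addresses and values are arbitrary):

```python
# Illustrative write-back behavior: writes touch only the cache and set
# a dirty bit; main memory is reconciled only at eviction time.
main_memory = {0x10: 1, 0x20: 2}   # address -> data
cache = {}                          # address -> (data, dirty_bit)

def write(addr, data):
    cache[addr] = (data, True)      # update cache only; mark line dirty

def evict(addr):
    data, dirty = cache.pop(addr)
    if dirty:                       # write back only if the line was modified
        main_memory[addr] = data

write(0x10, 99)
print(main_memory[0x10])  # 1  -- memory is stale: the inconsistency risk
evict(0x10)
print(main_memory[0x10])  # 99 -- reconciled only when the block is replaced
```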
Different Write Policies
• Write-through Policy: In a write-through cache, both the cache
and main memory are updated simultaneously during a write
operation.
– Advantage- Maintains data consistency through the memory
hierarchy.
– Disadvantage- Consumes a lot of energy due to the increased
number of accesses at the lower level.
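The trade-off above can be made concrete with a small sketch (illustrative only; the access counter is a crude proxy for the lower-level energy cost, not a measurement from the report):

```python
# Illustrative write-through behavior: every write hits both levels,
# so consistency is guaranteed but lower-level accesses scale with writes.
main_memory = {}
cache = {}
memory_accesses = 0   # proxy for the energy overhead discussed above

def write_through(addr, data):
    global memory_accesses
    cache[addr] = data
    main_memory[addr] = data        # simultaneous update of the lower level
    memory_accesses += 1

for i in range(4):
    write_through(0x100 + i, i)

print(cache == main_memory)   # True: the two levels never diverge
print(memory_accesses)        # 4: one costly lower-level access per write
```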
Different Write Policies
• Write-around Policy: In a write-around cache, only main
memory is updated during a write operation.
– Advantage- Consumes less energy and maintains data consistency.
– Disadvantage- Highly application-specific; useful only where
recently written data will not be required again.
Problem Definition
• Cache Coherence: In a multiprocessor system or a multi-core
processor, data inconsistency may occur between adjacent levels or
within the same level, because of data sharing with main memory,
process migration, etc.
• Soft Error: Radiation effects can corrupt data stored in memory;
such errors are known as soft errors.
• Solution: Write-through is preferred, because it updates main
memory simultaneously and thus maintains data consistency.
• Problem: A write-through cache consumes a lot of energy due to the
increased number of accesses at the lower level.
Literature Survey
• Partitioning the Cache Data Array into Sub-banks
– The cache data array is partitioned horizontally into several segments.
– Each segment can be powered up individually.
– Only the segment that contains the required data / instruction is
powered up.
– Power consumption is reduced by eliminating unnecessary
accesses.
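The bank-selection idea can be sketched in a few lines (illustrative parameters, not from the surveyed paper):

```python
# Illustrative sub-banking: with the data array split into banks, only
# the bank holding the requested line needs to be powered for an access.
NUM_BANKS = 4          # assumed number of sub-banks
LINES_PER_BANK = 64    # assumed lines per bank

def active_bank(line_index):
    """Only this bank's drivers and sense amplifiers must be enabled."""
    return line_index // LINES_PER_BANK

# One access powers 1 of 4 banks -- roughly a 4x reduction in
# data-array access energy under these assumed parameters.
print(active_bank(130))  # 2
```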
Literature Survey
• Division of Bit-Lines into Small Segments
– Each column of bit-lines is split into several segments.
– All segments are connected to a common line (pre-charged high).
– The address decoder identifies the segment targeted by the row
address and isolates all but the targeted segment.
– Power consumption is lower because the capacitive loading is reduced.
Literature Survey
• Way Concatenation Technique
– The memory address is split into a line-offset field, an index field,
and a tag field.
– The cache decodes the index field of the address and compares the
stored tag with the address's tag field.
– If there is a match, the multiplexer routes the cache data to the output.
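The address-field split can be illustrated with explicit bit arithmetic (the field widths are assumed for illustration, not taken from the surveyed paper):

```python
# Illustrative split of a 32-bit address into tag / index / line-offset
# fields, as used by the way concatenation scheme described above.
OFFSET_BITS = 5   # assumed 32-byte cache line
INDEX_BITS = 7    # assumed 128 sets

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)                 # low bits
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)  # middle bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                 # high bits
    return tag, index, offset

tag, index, offset = split_address(0x12345)
print(hex(tag), index, offset)  # 0x12 26 5
```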
Literature Survey
• Location Cache
– Works in parallel with the TLB and the L1 cache.
– On a read miss, the way information is already available because
the physical address has been translated by the TLB.
– The L2 cache can then be accessed like a direct-mapped cache.
Proposed Work
• Conventional Cache Architecture
Proposed Work
• Way-tagged Cache Architecture
Proposed Work
• Amount of Energy Consumption
– Estimated by CACTI 5.3 for a 90-nm CMOS process
Results
• Main Memory
Results
• L1 Cache Reset
Results
• L2 Cache Reset
Results
• Cache Data Load
Results
• L1 Cache Read Hit
Results
• L2 Cache Read Hit
Results
• Read Miss
Results
• Cache Write
Conclusion
• Several new components have been introduced in the L1 cache,
such as the way-tag array, way-tag buffer, and way decoder.
• The L1 cache remains integrated with the processor, and the area
overhead is a major drawback of this architecture.
• Layout designers have to handle the place-and-route process very
carefully.
Publication
References
• [1]. J. Dai and L. Wang, “An energy-efficient L2 cache architecture using way tag information under
write-through policy,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 1,
Jan. 2013.
• [2]. G. Konstadinidis, K. Normoyle, S. Wong, S. Bhutani, H. Stuimer, T. Johnson, A. Smith, D. Cheung, F.
Romano, S. Yu, S. Oh, V. Melamed, S. Narayanan, D. Bunsey, C. Khieu, K. J. Wu, R. Schmitt, A. Dumlao,
M. Sutera, J. Chau, and K. J. Lin, “Implementation of a third-generation 1.1-GHz 64-bit microprocessor,”
IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1461–1469, Nov. 2002.
• [3]. S. Rusu, J. Stinson, S. Tam, J. Leung, H. Muljono, and B. Cherkauer, “A 1.5-GHz 130-nm Itanium 2
processor with 6-MB on-die L3 cache,” IEEE J. Solid-State Circuits, vol. 38, no. 11, pp. 1887–1895, Nov.
2003.
• [4]. D. Wendell, J. Lin, P. Kaushik, S. Seshadri, A. Wang, V. Sundararaman, P. Wang, H. McIntyre, S. Kim,
W. Hsu, H. Park, G. Levinsky, J. Lu, M. Chirania, R. Heald, and P. Lazar, “A 4 MB on-chip L2 cache for a
90 nm 1.6 GHz 64 bit SPARC microprocessor,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech.
Papers, 2004, pp. 66–67.
• [5]. http://en.wikipedia.org/wiki/CPU_cache
• [6]. C. Su and A. Despain, “Cache design tradeoffs for power and performance optimization: A case study,”
in Proc. Int. Symp. Low Power Electron. Design, 1997, pp. 63–68.
• [7]. K. Ghose and M. B. Kamble, “Reducing power in superscalar processor caches using subbanking,
multiple line buffers and bit-line segmentation,” in Proc. Int. Symp. Low Power Electron. Design, 1999, pp.
70–75.
• [8]. C. Zhang, F. Vahid, and W. Najjar, “A highly-configurable cache architecture for embedded systems,”
in Proc. Int. Symp. Comput. Arch., 2003, pp. 136–146.
• [9]. K. Inoue, T. Ishihara, and K. Murakami, “Way-predicting set-associative cache for high performance
and low energy consumption,” in Proc. Int. Symp. Low Power Electron. Design, 1999, pp. 273–275.
• [10]. A. Ma, M. Zhang, and K. Asanović, “Way memoization to reduce fetch energy in instruction caches,” in
Proc. ISCA Workshop Complexity Effective Design, 2001, pp. 1–9.
• [11]. T. Ishihara and F. Fallah, “A way memorization technique for reducing power consumption of caches
in application specific integrated processors,” in Proc. Design Autom. Test Euro. Conf., 2005, pp. 358–363.
• [12]. R. Min, W. Jone, and Y. Hu, “Location cache: A low-power L2 cache system,” in Proc. Int. Symp.
Low Power Electron. Design, 2004, pp. 120–125.