AN ENERGY-EFFICIENT CACHE MEMORY
DESIGN USING VERILOG HDL
NAME: DHRITIMAN HALDER
USN: 1RE12LVS05
DEPARTMENT: M.TECH (VLSI DESIGN AND EMBEDDED
SYSTEMS)
SEMESTER: 4TH
COURSE CODE: 12LVS43
SUBJECT CODE: 12EC943
UNDER GUIDANCE OF: PROF. PRASAD S.N
CONTENTS
• Introduction to Cache Memory
• Types of Cache Memory
• Cache Read Operation
• Different Cache Mapping Techniques
• Cache Write Operation
• Different Write Policies
• Problem Definition
• Literature Survey
• Proposed Work
• Results
• Conclusion
• Publication
• References
Introduction to Cache Memory
• The processor requires data and instructions while performing a
specific task.
• Data and instructions are stored in main memory.
• A cache memory keeps frequently needed data and instructions close
to the processor to speed up operation.
Cache and Main Memory
Types of Cache Memory
• Data Cache: A data cache speeds up data fetches and stores; it is
usually organized as a hierarchy of levels (L1, L2, etc.).
• Instruction Cache: An instruction cache speeds up the fetching of
executable instructions.
• Translation Lookaside Buffer: A translation lookaside buffer
(TLB) speeds up the translation of the virtual addresses of
requested data and instructions into physical addresses.
Cache Read Operation
• Flowchart of Cache Read Operation
Different Cache Mapping Techniques
• Direct Mapping: Any main memory location can be loaded
into exactly one fixed location in the cache (see the lookup sketch below).
– Advantage- No search is required, as there is only one possible
cache location for each main memory location.
– Disadvantage- The hit ratio is poor, again because each main memory
location maps to a single fixed cache location.
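As a minimal Verilog sketch of the idea (the address split, array sizes, and signal names below are illustrative assumptions, not the project's actual code), a direct-mapped lookup needs just one tag comparison per access:

    // Direct-mapped lookup sketch: 32-bit address, 256 lines of 32-byte
    // blocks (assumed sizes). One comparator decides hit or miss.
    module direct_mapped_lookup (
      input  wire [31:0] addr,
      output wire        hit
    );
      wire [18:0] tag   = addr[31:13];  // remaining upper address bits
      wire [7:0]  index = addr[12:5];   // selects the single candidate line
      // Bits [4:0] are the byte offset within the 32-byte block.

      reg [18:0] tag_array   [0:255];   // one stored tag per cache line
      reg        valid_array [0:255];   // valid bit per line

      // Only one location can hold the block, so no search is needed.
      assign hit = valid_array[index] && (tag_array[index] == tag);
    endmodule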
Different Cache Mapping Techniques
• Fully Associative Mapping: Any main memory location can be
placed in any location in the cache.
– Advantage- The hit ratio is increased, as a main memory block can
occupy any location in the cache.
– Disadvantage- Consumes a lot of energy, because the controller must
search every tag entry to determine whether the data is present
(see the sketch below).
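The energy cost of that exhaustive search is visible even in a small sketch (an 8-entry array with assumed widths and names): every stored tag is compared in parallel on every access.

    // Fully associative lookup sketch: 8 entries, 27-bit tags (assumed).
    // All 8 comparators switch on every access, which is what costs energy.
    module fully_assoc_lookup (
      input  wire [31:0] addr,
      output wire        hit
    );
      wire [26:0] tag = addr[31:5];  // 32-byte blocks: [4:0] is the offset

      reg [26:0] tag_array   [0:7];
      reg        valid_array [0:7];

      wire [7:0] match;
      genvar i;
      generate
        for (i = 0; i < 8; i = i + 1) begin : cmp
          assign match[i] = valid_array[i] && (tag_array[i] == tag);
        end
      endgenerate

      assign hit = |match;  // hit if any entry matches
    endmodule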
Different Cache Mapping Techniques
• Set Associative Mapping: Any main memory location can be
loaded into any of two or more locations (ways) within one set of
the cache.
– Advantage- The hit ratio is higher than in a direct-mapped cache, and
energy consumption is lower than in a fully associative cache,
because only a limited number of tag entries needs to be searched
(sketched below).
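A 2-way set-associative lookup, again as an illustrative sketch with assumed sizes, activates only the two comparators of the indexed set:

    // 2-way set-associative lookup sketch: 128 sets, 32-byte blocks (assumed).
    module two_way_lookup (
      input  wire [31:0] addr,
      output wire        hit,
      output wire        way   // which way hit (meaningful only when hit = 1)
    );
      wire [19:0] tag   = addr[31:12];
      wire [6:0]  index = addr[11:5];

      reg [19:0] tag0 [0:127], tag1 [0:127];
      reg        val0 [0:127], val1 [0:127];

      // Only the indexed set's two tags are compared, not the whole array.
      wire hit0 = val0[index] && (tag0[index] == tag);
      wire hit1 = val1[index] && (tag1[index] == tag);

      assign hit = hit0 | hit1;
      assign way = hit1;  // 0 = way 0, 1 = way 1
    endmodule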
Cache Write Operation
• Flowchart of cache write operation
Different Write Policies
• Write-back Policy: In a write-back cache, only the cache is
updated during a write operation, and the block is marked with a
dirty bit. Main memory is updated later, when the data block is
replaced (see the sketch below).
– Advantage- Consumes less energy during write operations, because
main memory is not updated simultaneously.
– Disadvantage- The write algorithm is complex and often results in
data inconsistency.
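A minimal sketch of the write-back bookkeeping (all names and widths are assumptions): a write hit updates only the cache and sets the line's dirty bit; main memory sees the data only when a dirty line is evicted.

    // Write-back policy sketch: the dirty bit records that the cache copy
    // is newer than main memory.
    module write_back_ctrl (
      input  wire       clk,
      input  wire       write_hit,    // write request that hit in the cache
      input  wire       evict,        // the indexed line is being replaced
      input  wire [7:0] index,
      output wire       writeback_req // flush the line to main memory
    );
      reg dirty [0:255];

      always @(posedge clk) begin
        if (write_hit) dirty[index] <= 1'b1;  // memory is now stale
        else if (evict) dirty[index] <= 1'b0; // line leaves the cache
      end

      // Main memory is written only when a dirty line is evicted.
      assign writeback_req = evict && dirty[index];
    endmodule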
Different Write Policies
• Write-through Policy: In a write-through cache, both the cache
and main memory are updated simultaneously during a write
operation (sketched below).
– Advantage- Maintains data consistency throughout the memory
hierarchy.
– Disadvantage- Consumes a lot of energy due to the increased number
of accesses at the lower level.
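In a write-through datapath the same write request drives both levels, which is exactly why every write costs a lower-level access. A minimal sketch with assumed names:

    // Write-through sketch: one write request updates the cache data array
    // and issues a main-memory (or L2) write at the same time.
    module write_through_ctrl (
      input  wire        clk,
      input  wire        write_req,
      input  wire [7:0]  index,
      input  wire [31:0] wdata,
      output reg         mem_write_req,  // to the lower level, every write
      output reg  [31:0] mem_wdata
    );
      reg [31:0] data_array [0:255];

      always @(posedge clk) begin
        mem_write_req <= write_req;     // the lower level is always updated
        mem_wdata     <= wdata;
        if (write_req)
          data_array[index] <= wdata;   // both copies stay consistent
      end
    endmodule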
Different Write Policies
• Write-around Policy: In a write-around cache, only main
memory is updated during a write operation (see the sketch below).
– Advantage- Consumes less energy and maintains data consistency.
– Disadvantage- Highly application-specific; useful only where
recently written data will not be needed again soon.
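Write-around is the simplest to sketch (names assumed): the write path bypasses the cache arrays entirely.

    // Write-around sketch: writes go straight to main memory and never
    // allocate or update a cache line.
    module write_around_ctrl (
      input  wire        write_req,
      input  wire [31:0] addr,
      input  wire [31:0] wdata,
      output wire        mem_write_req,
      output wire [31:0] mem_addr,
      output wire [31:0] mem_wdata
    );
      assign mem_write_req = write_req;  // cache arrays are untouched
      assign mem_addr      = addr;
      assign mem_wdata     = wdata;
    endmodule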
Problem Definition
• Cache Coherence: In a multiprocessor system or a multi-core
processor, data inconsistency may occur between adjacent levels, or
within the same level, because of data sharing with main memory,
process migration, etc.
• Soft Error: Radiation effects can corrupt data stored in memory;
such errors are known as soft errors.
• Solution: Write-through is preferred because it updates data in main
memory simultaneously and so maintains data consistency.
• Problem: A write-through cache consumes a lot of energy due to the
increased number of accesses at the lower level.
Literature Survey
• Partitioning the Cache Data Array into Sub-banks
– The cache data array is partitioned horizontally into several
segments (banks).
– Each segment can be powered up individually.
– Only the segment that contains the required data or instruction is
powered up.
– Power consumption is reduced by eliminating unnecessary accesses
(see the decoder sketch below).
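Dynamic energy drops because only the selected bank's word lines and sense amplifiers switch. A sketch of the bank-enable decode, assuming a 4-bank organization:

    // Sub-banking sketch: the data array is split into 4 banks and only
    // the bank addressed by the request is enabled.
    module bank_enable_decoder (
      input  wire       access,    // a cache access is in progress
      input  wire [1:0] bank_sel,  // upper index bits select the bank
      output wire [3:0] bank_en    // one-hot enable, one bit per bank
    );
      // Exactly one bank powers up per access; the rest stay idle.
      assign bank_en = access ? (4'b0001 << bank_sel) : 4'b0000;
    endmodule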
Literature Survey
• Division of Bit-Lines into Small Segments
– Each column of bit-lines is split into several segments.
– All segments are connected to a common line (pre-charged high).
– The address decoder identifies the segment targeted by the row
address and isolates all but the targeted segment.
– Power consumption is lower because the capacitive loading on the
bit-lines is reduced.
Literature Survey
• Way Concatenation Technique
– The memory address is split into a line-offset field, an index field,
and a tag field.
– The cache decodes the index field of the address and compares the
stored tag with the tag field of the address.
– If there is a match, the multiplexer routes the cache data to the
output.
Literature Survey
• Location Cache
– Works in parallel with the TLB and the L1 cache.
– On a read miss, the way information is available because the physical
address has already been translated by the TLB.
– The L2 cache is then accessed like a direct-mapped cache.
Proposed Work
• Conventional Cache Architecture
Proposed Work
• Way-tagged Cache Architecture
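The way-tagged architecture keeps, beside the L1 cache, a small way-tag array recording which L2 way holds each L1 line; on a write-through, the way decoder then enables only that one L2 way instead of all of them. A minimal sketch of the way-decoder idea (4-way L2 assumed; names illustrative):

    // Way decoder sketch: the way tag stored when the line was fetched
    // from L2 selects a single L2 way on a write-through access.
    module way_decoder (
      input  wire       l2_write,  // write-through request to L2
      input  wire [1:0] way_tag,   // L2 way recorded in the way-tag array
      output wire [3:0] way_en     // one-hot enable for the 4 L2 ways
    );
      // Direct-mapped-style access: only one way is activated.
      assign way_en = l2_write ? (4'b0001 << way_tag) : 4'b0000;
    endmodule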
Proposed Work
• Amount of Energy Consumption
– Estimated by CACTI 5.3 for a 90-nm CMOS process.
Results
• Main Memory
Results
• L1 Cache Reset
Results
• L2 Cache Reset
Results
• Cache Data Load
Results
• L1 Cache Read Hit
Results
• L2 Cache Read Hit
Results
• Read Miss
Results
• Cache Write
Conclusion
• Several new components have been introduced into the L1 cache,
such as the way-tag array, way-tag buffer, and way decoder.
• The L1 cache remains integrated with the processor, so the area
overhead of these components is a major drawback of this architecture.
• Layout designers have to handle the place-and-route process very
carefully.
Publication
References
• [1]. J. Dai and L. Wang, “An energy-efficient L2 cache architecture using way tag information under
write-through policy,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, Jan. 2013.
• [2]. G. Konstadinidis, K. Normoyle, S. Wong, S. Bhutani, H. Stuimer, T. Johnson, A. Smith, D. Cheung, F.
Romano, S. Yu, S. Oh, V. Melamed, S. Narayanan, D. Bunsey, C. Khieu, K. J. Wu, R. Schmitt, A. Dumlao,
M. Sutera, J. Chau, and K. J. Lin, “Implementation of a third-generation 1.1-GHz 64-bit microprocessor,”
IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1461–1469, Nov. 2002.
• [3]. S. Rusu, J. Stinson, S. Tam, J. Leung, H. Muljono, and B. Cherkauer, “A 1.5-GHz 130-nm Itanium 2
processor with 6-MB on-die L3 cache,” IEEE J. Solid-State Circuits, vol. 38, no. 11, pp. 1887–1895, Nov.
2003.
• [4]. D. Wendell, J. Lin, P. Kaushik, S. Seshadri, A. Wang, V. Sundararaman, P. Wang, H. McIntyre, S. Kim,
W. Hsu, H. Park, G. Levinsky, J. Lu, M. Chirania, R. Heald, and P. Lazar, “A 4 MB on-chip L2 cache for a
90 nm 1.6 GHz 64 bit SPARC microprocessor,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech.
Papers, 2004, pp. 66–67.
• [5]. http://en.wikipedia.org/wiki/CPU_cache
• [6]. C. Su and A. Despain, “Cache design tradeoffs for power and performance optimization: A case study,”
in Proc. Int. Symp. Low Power Electron. Design, 1997, pp. 63–68.
• [7]. K. Ghose and M. B. Kamble, “Reducing power in superscalar processor caches using subbanking,
multiple line buffers and bit-line segmentation,” in Proc. Int. Symp. Low Power Electron. Design, 1999, pp.
70–75.
• [8]. C. Zhang, F. Vahid, and W. Najjar, “A highly-configurable cache architecture for embedded systems,”
in Proc. Int. Symp. Comput. Arch., 2003, pp. 136–146.
• [9]. K. Inoue, T. Ishihara, and K. Murakami, “Way-predicting set-associative cache for high performance
and low energy consumption,” in Proc. Int. Symp. Low Power Electron. Design, 1999, pp. 273–275.
• [10]. A. Ma, M. Zhang, and K. Asanović, “Way memoization to reduce fetch energy in instruction caches,”
in Proc. ISCA Workshop Complexity Effective Design, 2001, pp. 1–9.
• [11]. T. Ishihara and F. Fallah, “A way memoization technique for reducing power consumption of caches
in application specific integrated processors,” in Proc. Design Autom. Test Euro. Conf., 2005, pp. 358–363.
• [12]. R. Min, W. Jone, and Y. Hu, “Location cache: A low-power L2 cache system,” in Proc. Int. Symp.
Low Power Electron. Design, 2004, pp. 120–125.