Energy-Efficient Virtual Memory System Design for SSD

  1. Energy-Efficient Virtual Memory System Design for SSD
     Chia-Lin Yang, Embedded Computing Lab, Department of Computer Science and Information Engineering, National Taiwan University
  2. Embedded Computing Lab
     • Low-power 3D graphics processing unit (GPU) design: power gating and DVS strategies at the architectural level
     • PPT: joint power/performance/thermal management of DRAMs in multi-core systems, orchestrating thread scheduling and page allocations
     • RARE: Resource-Aware Runtime Environment for thread scheduling in multi-core systems
     • Energy-efficient flash memory system
  3. Embedded System Complexity Is Increasing
     • Personal information management
     • Video conference
     • GPS
  4. Operating System for Supporting Multi-tasking & Virtual Memory System
     [Figure: operating system, page table, pages of Task 0 and Task 1, main memory, secondary storage]
  5. Disk as Storage
     The traditional virtual memory system assumes disk as the secondary storage.
     [Figure: page table, main memory, disk buffer, hard disk]
  6. Flash Memory as Storage
     Flash memory has become a popular storage medium in mobile devices:
     • low power
     • light weight
     • shock resistant
  7. Flash Memory as the Secondary Storage
     Flash memory has very different characteristics from disk:
     • Write once
     • Out-place update
     • Garbage collection
     1. The need to revisit virtual memory system design
     2. Energy efficiency is the main design concern
     Publications:
     • Energy-Aware Flash Memory Management in Virtual Memory System, IEEE Transactions on VLSI
     • An Energy-Efficient Virtual Memory System with Flash Memory as the Secondary Storage, ISLPED'07
     [Figure: page table, main memory, flash memory]
  8. Outline
     • Motivation
     • Background on Flash Memory
     • Interplay between VM and FM
     • Proposed Energy-Efficient VM Design
         • Subpaging
         • HotCache
         • Duplication-Aware Garbage Collection
     • Experimental Results
     • Conclusions
  9. Flash Storage System Architecture
     • FTL layer: address translation table (Logical Block Address to physical address) and garbage collection
     • MTD layer: command translation
     • Physical device: flash memory

     Address translation table:
     LBA   Physical address (bank, block, page)
     0     (0, 0, 3)
     1     (0, 1, 2)
     2     (1, 2, 1)
     …     …
  10. Organization of a Typical NAND Flash Memory
     • Read/write unit: one page
     • Erase unit: one block
     • Samsung K9F1208R0B: 1 block = 32 pages, 1 page = 512B
     [Figure: flash memory divided into Block 0, Block 1, Block 2, Block 3, …]
  11-13. Flash Memory Characteristics
     • Write once: a written page cannot be overwritten
     • Out-place update: new data (A') is written to a free page, and the page holding the old data (A) becomes a dead page
     [Figure: a flash block whose pages are marked as free, live, or dead]
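The write-once and out-place update rules can be illustrated with a small model. This is a minimal sketch, not the FTL from the talk: the class name `FlashSketch`, the `PageState` enum, and the "take the first free page" allocation rule are assumptions made for illustration.

```python
from enum import Enum


class PageState(Enum):
    FREE = "free"
    LIVE = "live"
    DEAD = "dead"


class FlashSketch:
    """Toy flash device with a logical-to-physical address translation table."""

    def __init__(self, num_pages):
        self.state = [PageState.FREE] * num_pages  # state of each physical page
        self.data = [None] * num_pages
        self.att = {}                              # LBA -> physical page index

    def write(self, lba, value):
        # A written page cannot be overwritten, so always take a free page
        # (assumed policy: the first free one; raises ValueError if none is left).
        new_page = self.state.index(PageState.FREE)
        old_page = self.att.get(lba)
        if old_page is not None:
            self.state[old_page] = PageState.DEAD  # the old copy becomes a dead page
        self.state[new_page] = PageState.LIVE
        self.data[new_page] = value
        self.att[lba] = new_page

    def read(self, lba):
        return self.data[self.att[lba]]

    def free_pages(self):
        return self.state.count(PageState.FREE)


flash = FlashSketch(num_pages=4)
flash.write(0, "A")
flash.write(0, "A'")  # update of the same LBA lands on a different physical page
```

After the second write, physical page 0 is dead, page 1 is live, and two free pages remain, so every update consumes a free page until garbage collection reclaims the dead ones.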
  14. Flash Memory Characteristics (cont'd)
     • When # of free pages <= GCt (garbage collection threshold), garbage collection is triggered to reclaim dead pages
     • Dead pages are reclaimed via erase operations
     • The basic unit of an erase operation is a block
  15-16. Flash Memory Characteristics (cont'd)
     Garbage collection reclaims dead pages in two steps, which are also its overheads:
     • Live data copying
     • Block erase
     [Figure: a flash block whose pages are marked as free, live, or dead]
  17. Writes are Problematic
     • Writes consume more energy than reads
     • Frequent writes result in dead pages on flash memory, which trigger frequent garbage collections

     Operation       Latency   Energy
     Read (page)     47.2 ns   679 nJ
     Write (page)    533 us    7.66 mJ
     Erase (block)   3 ms      43.21 mJ
  18. Key Design Principles for Energy-Efficient Flash Memory
     • Reduce writes to flash memory
     • Efficient garbage collection
         • Recycling block X: 2 writes, gain 14 free pages
         • Recycling block Y: 11 writes, gain 5 free pages
     [Figure: blocks X and Y with invalidated pages and live pages]
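The block X versus block Y comparison can be turned into an energy cost per reclaimed page. The sketch below is illustrative only: the function name `reclaim_cost` is hypothetical, the block size of 16 pages is inferred from the example (2 + 14 = 11 + 5 = 16), and it assumes that copying one live page costs one page read plus one page write. The energy constants are copied as printed on slide 17.

```python
# Per-operation energy in joules, copied as printed on slide 17.
E_READ = 679e-9      # read one page
E_WRITE = 7.66e-3    # write one page
E_ERASE = 43.21e-3   # erase one block


def reclaim_cost(live_pages, pages_per_block,
                 e_read=E_READ, e_write=E_WRITE, e_erase=E_ERASE):
    """Energy spent per free page gained when one block is recycled."""
    gained = pages_per_block - live_pages  # net free pages after copying and erasing
    if gained <= 0:
        raise ValueError("recycling a block with no dead or free pages gains nothing")
    copy_energy = live_pages * (e_read + e_write)  # live data copying (assumed read + write)
    return (copy_energy + e_erase) / gained


cost_x = reclaim_cost(live_pages=2, pages_per_block=16)   # block X: 2 writes, 14 pages gained
cost_y = reclaim_cost(live_pages=11, pages_per_block=16)  # block Y: 11 writes, 5 pages gained
```

Under these assumptions `cost_x` is lower than `cost_y` for any positive energy values, because block X needs fewer copies and yields more free pages.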
  19. Interplay between VM and FM
     • A memory page contains n flash pages
     • At a page fault, the n flash pages of the victim virtual page are written back to back to flash memory
     • Two important observations:
         • Unnecessary writes from replacing a virtual page
         • Intra-page locality
     [Figure: Swap_out() of one memory page issues n writes of flash page size to flash memory]
  20-21. Unnecessary Writes from Replacing a Virtual Page
     In a conventional virtual memory system, the full victim page is written to the secondary storage.
     [Figure: a memory page made of four flash-page-sized parts, one dirty and three clean, is written to flash memory with four writes]
  22. Dirty Ratio
     A victim page often contains a significant amount of unmodified data.

     Application   kword    mozilla   kspread   openoffice   gqview
     Dirty ratio   89.73%   48.49%    88.61%    66.59%       98.86%

     Application   kword+juk   mozilla+juk   kspread+juk   openoffice+juk   gqview+juk
     Dirty ratio   69.41%      40.90%        72.40%        59.31%           97.62%

     Dirty ratio = (number of dirty 512B blocks in a dirty memory page) / (number of 512B blocks in a main memory page)
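The dirty ratio definition can be written as a short function. This is a sketch under stated assumptions: the name `dirty_ratio` and the representation of a page as a list of per-block dirty flags are hypothetical, and the example page is made up, not taken from the measured traces.

```python
def dirty_ratio(dirty_bits):
    """Fraction of the 512B blocks of one dirty memory page that are dirty."""
    if not any(dirty_bits):
        raise ValueError("the dirty ratio is defined for dirty memory pages only")
    return sum(dirty_bits) / len(dirty_bits)


# A 4KB memory page holds eight 512B blocks; here five of them were modified.
example = dirty_ratio([1, 1, 0, 1, 0, 0, 1, 1])
```

For this made-up page the ratio is 5/8, meaning three of the eight block writes in a full-page swap-out would rewrite unmodified data.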
  23. What is Intra-page Locality?
     Flash pages in one main memory page are written to flash memory back to back.
     [Figure: main memory holds virtual pages A, B, C, D. Swap_out(A) writes A0, A1, A2, A3, A4, A5, A6, A7 consecutively. Block X holds A0-A7 and B0-B7; Block Y holds C0-C7 and D0-D7.]
  24. Why is Preserving Intra-Page Locality Important?
     It affects the efficiency of garbage collection.
     [Figure: contents of Block X and Block Y before and after pages A and B are swapped out, shown for two different page placements]
  25. Garbage Collection Threshold vs. Intra-Page Locality
     GCt: garbage collection threshold
     m: # of flash pages in one memory page
     n: # of flash pages in one flash block
     Three cases are compared:
     • GCt mod m = 0
     • GCt mod n ≥ n − m
     • GCt mod m ≠ 0 and GCt mod n < n − m
     [Figure: normalized energy consumption vs. garbage collection threshold, for thresholds from 255 to 287]
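The three threshold conditions can be checked mechanically for any GCt. The sketch below only classifies a threshold; the slide text does not say which case gives the lowest energy, so no such claim is encoded. The function name `gct_cases` is hypothetical, and m = 8, n = 32 are assumed from the 4K memory page, 512B flash page, and 32-page block mentioned elsewhere in the deck.

```python
def gct_cases(gct, m, n):
    """Return the slide conditions that a garbage collection threshold satisfies."""
    cases = []
    if gct % m == 0:
        cases.append("GCt mod m = 0")
    if gct % n >= n - m:
        cases.append("GCt mod n >= n - m")
    if gct % m != 0 and gct % n < n - m:
        cases.append("GCt mod m != 0 and GCt mod n < n - m")
    return cases


M, N = 8, 32  # assumed: 4KB memory page / 512B flash page, 32 pages per block
for threshold in (255, 256, 257, 280):
    print(threshold, gct_cases(threshold, M, N))
```

Note that the first two conditions are not mutually exclusive as written: with these parameters a threshold of 280 satisfies both.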
  26. Proposed Energy-Efficient VM Design
     • Reduce # of writes to flash memory
         • Subpaging
         • HotCache
     • Efficient garbage collection
         • Duplication-aware garbage collection
  27. Subpaging
     • Divide a virtual memory page into a set of subpages at the granularity of the flash page size
     • Each subpage is associated with a dirty bit
     [Figure: a memory page of four subpages with dirty bits 0, 1, 0, 0; only the dirty subpage is written, so the swap-out costs one write to flash]
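The subpaging idea can be sketched as a swap-out routine that consults the per-subpage dirty bits. The function name `swap_out` mirrors the Swap_out() label used earlier in the deck, but its signature, the `flash_write` callback, and the data layout are assumptions for illustration.

```python
def swap_out(subpages, dirty_bits, flash_write):
    """Write back only the dirty subpages of a victim page; return the write count."""
    writes = 0
    for index, (data, dirty) in enumerate(zip(subpages, dirty_bits)):
        if dirty:
            flash_write(index, data)  # one flash page write per dirty subpage
            writes += 1
    return writes


written = []
count = swap_out(
    subpages=["clean0", "dirty1", "clean2", "clean3"],
    dirty_bits=[0, 1, 0, 0],
    flash_write=lambda index, data: written.append((index, data)),
)
```

With the dirty bits from the slide figure (0, 1, 0, 0), the victim page costs one flash write instead of four.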
  28. HotCache
     HotCache management policy:
     • Caching writes only
     • Preserving intra-page locality
     • Capturing hot data
     Layers:
     • FTL layer: address translation table, HotCache management, garbage collection
     • MTD layer: command translation
     • Physical device: HotCache and flash memory

     Address translation table:
     LBA   Physical address (f/s, bank, block, page)
     0     (f, 0, 0, 3)
     1     (f, 0, 1, 2)
     2     (s, -, 2, -)
     …     …
  29. How to Capture Hot Data?
     Three management policies:
     • Two-level LRU (2L)
     • Time frequency (TF): replace the HotCache block with the smallest timestamp * write_counts
     • Time frequency locality (TFL): TF policy with intra-page locality preserved
     [Figure: Two-level LRU with a 1st level list and a 2nd level list, each with a head and a tail]
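The TF replacement rule picks the cached block whose product of timestamp and write count is smallest. A minimal sketch follows; the `CacheBlock` record and the `tf_victim` function are hypothetical names, and the reading of `timestamp` as the time of the most recent write is an assumption, since the slide does not define it.

```python
from dataclasses import dataclass


@dataclass
class CacheBlock:
    lba: int
    timestamp: int    # assumed: time of the most recent write to this block
    write_count: int  # number of writes seen while the block is cached


def tf_victim(blocks):
    """Return the block with the smallest timestamp * write_count score."""
    return min(blocks, key=lambda block: block.timestamp * block.write_count)


cache = [
    CacheBlock(lba=0, timestamp=100, write_count=1),  # score 100
    CacheBlock(lba=1, timestamp=40, write_count=5),   # score 200
    CacheBlock(lba=2, timestamp=90, write_count=3),   # score 270
]
victim = tf_victim(cache)
```

Here the block with LBA 0 is evicted: it was written recently but only once, so its score is lower than that of the older but frequently written block.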
  30. Duplication-Aware Garbage Collection
     Exploit data redundancy between the main memory and flash memory to eliminate unnecessary live page copying during garbage collection.
     [Figure: main memory pages with their dirty bits and the flash memory blocks holding pages A, B, C, D, before and after duplication-aware garbage collection; invalidated pages and free pages are marked]
  31. Duplication-Aware Garbage Collection (cont'd)
     Interface between the SWAP system and the FTL:
     • Read(LBA)
     • Write(LBA, PID, VPN)
     • Swap_clean(LBA)
     • Swap_free(PID, VPN)

     Address Translation Table (ATT), used with garbage collection in the FTL:
     LBA   Physical address (bank, block, page)
     0     (0, 0, 1)
     1     (0, 1, 2)
     2     (1, 2, 5)
     …     …

     Block Allocation Map (BAM):
     Physical address (bank, block, page)   LBA   State     PID   VPN   In_memory
     (0, 0, 1)                              0     Free      0     3     1
     (0, 0, 2)                              9     Invalid   1     8     1
     (0, 0, 3)                              8     Valid     1     7     0
     …                                      …     …
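The role of the In_memory field during garbage collection can be sketched as follows. This is a guess at the control flow, not the implementation from the talk: `BamEntry` and `collect_block` are hypothetical names, and the meaning given to Swap_clean (notifying the swap system that the flash copy of a page was dropped, so the in-memory copy must be written back on eviction) is an assumption, since the slide lists only the call names.

```python
from dataclasses import dataclass


@dataclass
class BamEntry:
    lba: int
    state: str       # "free", "valid", or "invalid"
    in_memory: bool  # True if main memory also holds a copy of this page


def collect_block(entries, copy_live_page, swap_clean):
    """Recycle one block; return how many live pages had to be copied."""
    copies = 0
    for entry in entries:
        if entry.state != "valid":
            continue                   # free and invalid pages need no copying
        if entry.in_memory:
            swap_clean(entry.lba)      # assumed: main memory now holds the only copy
        else:
            copy_live_page(entry.lba)  # regular live data copying
            copies += 1
    return copies


copied, notified = [], []
block = [
    BamEntry(lba=0, state="free", in_memory=True),
    BamEntry(lba=9, state="invalid", in_memory=True),
    BamEntry(lba=8, state="valid", in_memory=False),
    BamEntry(lba=7, state="valid", in_memory=True),
]
copies = collect_block(block, copied.append, notified.append)
```

In this made-up block only LBA 8 is copied; the live page for LBA 7 is skipped because main memory already holds its data.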
  32. Experimental Setup
     • Trace-driven simulation
         • Valgrind captures the memory access trace while the application executes in real time
     • Applications: kword, kspread, mozilla, openoffice, gqview
     • Multiprogramming workloads: kword+juk, kspread+juk, mozilla+juk, openoffice+juk, gqview+juk
     • Configuration
         • Main memory: 16MB, 4K page size
         • Flash memory: 16K block with 512B page, or 128K block with 2KB page

     Application   Description
     kword         word processor
     kspread       spreadsheet application
     mozilla       web browser
     openoffice    popular office suite similar to Microsoft Office
     gqview        image viewer
     juk           MP3 jukebox program
  33. Subpaging
     [Figure: normalized energy consumption (0 to 1) for kword, mozilla, kspread, openoffice, gqview and their +juk variants, with flash page size = 2KB and flash page size = 512B]
     • A smaller flash page size leads to more energy reduction
         • 512B page size: 20% energy reduction on average
         • 2KB page size: 8% energy reduction on average
     • Subpaging saves more energy for multiprogramming workloads
         • Single-program workload: openoffice (14% energy reduction)
         • Multi-program workload: openoffice+juk (31% energy reduction)
  34. HotCache: Hit Rates & Energy Savings

     Replacement policy                    FIFO    LRU     2L      TF      TFL
     Average hit rate, 512KB cache size    0.41%   0.49%   4.72%   5.02%   5%
     Average hit rate, 1MB cache size      3.34%   3.91%   8.62%   10%     9.98%
  35. HotCache: Energy Breakdown
     [Figure: normalized energy consumption (0 to 1) under the 2L, TF, and TFL policies for kword, mozilla, kspread, openoffice, gqview and their +juk variants, broken down into SRAM, read, write energy, and garbage collection]
     TF causes higher overhead per garbage collection because it breaks intra-page locality.
  36. Duplication-Aware Garbage Collection
     [Figure: normalized energy consumption (0 to 1) for kword, mozilla, kspread, openoffice, gqview, their +juk variants, and the average]
     • Up to 50% energy reduction
     • Average energy reduction rate is 24%
  37. HotCache + Subpaging + DA-GC
     [Figure: normalized energy consumption (0 to 1) for kword, mozilla, kspread, openoffice, gqview and their +juk variants]
     • Configuration: 1MB HotCache & 512B flash pages
     • Energy reduction of HotCache + Subpaging + DA-GC ranges from 9.3% to 75%
  38. Conclusion
     • We revisit virtual memory system design with flash memory as the secondary storage
     • Three energy-efficient VM designs
         • Subpaging
         • HotCache management
         • Duplication-aware garbage collection
     • Joint use of Subpaging, the TFL policy, and DA-GC reduces flash memory energy by up to 75%
  39. Ongoing Work
     • SSD in server platform
     • High-throughput multi-bank flash system
     • Data placement in flash memory chips
         • SLC/MLC
         • Reliability issue
     [Figure: flash memory controller with CPU core, SRAM, host interface, and flash interface, connected to the host over ATA or SATA and to SLC and MLC chips over the flash memory bus]
