Adaptive Page Grouping for Energy Efficiency
    in Hybrid PRAM-DRAM Main Memory


           Dong-Jae Shin, Sung Kyu Park,
           Seong Min Kim and Kyu Ho Park

      Korea Advanced Institute of Science & Technology



           Presenter : Dong-Jae Shin
           Date    : 2012/10/24
OUTLINE

1            Introduction


2             Motivation


3   Adaptive Page Grouping System


4            Evaluation


5            Conclusion



1. Introduction

           Energy Consumption in Computer Systems
 • Energy consumption of computer systems is increasing
      – Data centers consume 1.5% of the world's energy, and their consumption
        doubled in 5 years [1]
      – DRAM is a significant factor in the energy consumed
 • Effects
      – Electricity accounts for more than 10% of the operating cost of data centers [1]
      – It continues to increase steadily
 • Challenging issue
      – How can we reduce energy consumption in main memory?
 • Consumed power in memory
      – Static power
      – Dynamic power

   [Figure: Distribution of power usage in a Google data center [2]:
    CPUs 33%, DRAM 30%, Other 22%, Disks 10%, Networking 5%]


                  [1] "Harnessing Green IT: Principles and Practices," IEEE IT Professional, January–February 2008
                  [2] L. Barroso and U. Hölzle. The Datacenter as a Computer: An Introduction to the Design of
                  Warehouse-Scale Machines. Morgan & Claypool, 2009
1. Introduction

           DRAM and Phase Change Memory (PRAM)
 • DRAM (charge memory)
      – Charge capacitor and read
      – Leakage current
      – Need refresh power
      – Limit of capacitor size
      – Good performance, but high static power
 • PRAM (resistive memory)
      – Store value with resistance
      – No refresh power
      – Non-volatile
      – Multi-Level Cell (MLC) [4] available
      – Low static power, but relatively low performance

   [Figures: DRAM cell (charge memory) [3]; PRAM cell (resistive memory) [3]]

                       [3] Architecting Phase Change Memory as a Scalable DRAM Alternative, ISCA 2009
                       [4] Drift-Tolerant Multilevel Phase-Change Memory, IMW 2011
1. Introduction

                      DRAM & PRAM Technology

 • Single chip capacity

   [Figure: single-chip capacity of DRAM and PRAM plotted against design rule
    (100nm down to 20nm) and year (2004 to 2012); labeled capacities range
    from 256Mb to 8Gb]

                       [5] Memory Technology and Solutions Roadmap, SAMSUNG
2. Motivation

                          PRAM as a main memory
             Summarized comparison between DRAM and PRAM [6]
                  Attributes          DRAM          PRAM
                  Non-volatility      No            Yes
                  Read latency        10~60ns       48ns
                  Write latency       10~60ns       40~150ns
                  Read energy         2pJ/bit       2pJ/bit
                  Write energy        2pJ/bit       10pJ/bit
                  Idle power          ~W/GB         ~0.05W/GB
                  Write endurance     10^15         10^8



• Challenging issues of PRAM main memory
    – Endurance problem
         • Lifetime of less than 1 year
    – High write latency
    – High write power
• PRAM alone is not the solution!

                [6] Non-Volatile Memory: Emerging Technologies And Their Impacts on Memory Systems, 2010
2. Motivation

                     Hybrid Main Memory Architecture

 • Motivation
      – To exploit both the high performance of DRAM and the low
        static power of PRAM
 • Necessity for management
      – In the worst case (hot pages in PRAM), the system
          • consumes high dynamic power
          • has high latency
          • wears out PRAM cells rapidly
      – With direct access to PRAM, elaborate management is needed

   [Figure: Hybrid main memory architecture: a CPU (four cores, last-level cache,
    memory controller) accessing DRAM and PRAM in one linear address space]
2. Motivation

                Access Pattern with Spatial Locality

 • Access count and difference of PFN (Page Frame Number)

   [Figures: access count plotted against PFN difference for gzip and mcf]

 • Spatial locality was found across pages
3. GPM System

                       Overall System Architecture

 • Hybrid memory management daemon works periodically
      – Monitoring & Shifting Module
          • Looks up and stores access information of pages for the hot/cold decision
      – Page Grouping Module
          • Adaptive page grouping
      – Migration Module (DRAM ↔ PRAM)
          • Page placement

   [Figure: The kernel's hybrid memory management daemon (Monitoring & Shifting,
    Page Grouping, and Migration modules) accesses the page table and migrates
    hot page groups to DRAM and cold page groups to PRAM; legend: free page,
    hot group, cold group]
3. GPM System

                                       4-1. Store Writing History

 • Dirty bit shifting
      – Stores the write history of each page for the hot/cold decision
      – Uses the available space in a page table entry (PTE)
          • Unlike other works, needs no additional space for storing the history

   Page Table Entry (page table: 512 entries per directory)
      Bit(s)    Field
      63        NX
      62-52     History bits 4, 3, 2, 1 (to store the dirty bit) and remaining available bits
      51-12     Physical-Page Frame Number
      11-9      Used by OS
      8         G
      7         PAT
      6         D (dirty bit)
      5         A
      4         PCD
      3         PWT
      2         U/S
      1         R/W
      0         P

   [Figure: Dirty bit monitoring and bit shifting: each PTE keeps four history
    bits (4th, 3rd, 2nd, 1st) next to its dirty bit; v : bit is 1, v : bit will be 1]
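The dirty bit shifting on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' kernel code: it assumes an x86-64 PTE with the dirty bit at bit 6, assumes the 4-bit history sits at bits 59 to 62, and assumes that each pass moves the 1st bit toward the 4th and records the current dirty bit as the new 1st bit. The names `shift_dirty_bit` and `write_history` are hypothetical.

```python
# Assumed layout: x86-64 PTE with the dirty bit at bit 6. Placing the 4-bit
# write history at bits 59-62 is an assumption made for this sketch.
DIRTY_BIT = 6
HISTORY_SHIFT = 59
HISTORY_WIDTH = 4
HISTORY_FIELD = (1 << HISTORY_WIDTH) - 1  # 0b1111
HISTORY_MASK = HISTORY_FIELD << HISTORY_SHIFT


def write_history(pte: int) -> int:
    """Return the 4-bit history; bit 0 is the 1st (most recent) period."""
    return (pte >> HISTORY_SHIFT) & HISTORY_FIELD


def shift_dirty_bit(pte: int) -> int:
    """One monitoring pass over a single PTE (hypothetical helper)."""
    dirty = (pte >> DIRTY_BIT) & 1
    # Age the history by one period and record the current dirty bit as newest.
    history = ((write_history(pte) << 1) | dirty) & HISTORY_FIELD
    pte = (pte & ~HISTORY_MASK) | (history << HISTORY_SHIFT)
    # Clear the dirty bit so the next period starts from a clean state.
    return pte & ~(1 << DIRTY_BIT)


# Usage: a present page with PFN 0xC0 that was written during this period.
pte = (0xC0 << 12) | (1 << DIRTY_BIT) | 1
pte = shift_dirty_bit(pte)
print(bin(write_history(pte)))
```

Because the history lives in otherwise unused PTE bits, the sketch needs no side table, which is the point the slide makes. A real kernel would also have to invalidate the corresponding TLB entry after clearing the dirty bit.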
3. GPM System

                                        4-2. Adaptive Page Grouping

 • Adaptive page grouping
      – Physically near pages have similar access counts due to the Linux
        buddy system (memory blocks of 4KB, 8KB, 16KB, …)
      – These pages are grouped → Migration ↓, Accuracy ↑
      – Algorithm
          • Step ① : Calculate the difference between Page Frame Numbers (PFN)
          • Step ② : If (diff. < threshold), then add to group

   Example (PFN field, bits 51-12 of each page table entry)
      PFN        ① Diff. of PFN    ② Grouping
      0x000C0                       Group 1
      0x000C1        1              Group 1
      0x000C2        1              Group 1
      0x000C3        1              Group 1
      0x02041     8062              Group 2
      0x02042        1              Group 2
      0x02045        3              Group 2
      0x08892    26701              X (not grouped)
      0x0631B     9591              X (not grouped)
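The two grouping steps can be sketched as below, using the PFNs from the slide's example. This is an illustrative sketch only: the function name `group_pages` and the threshold value of 8 are assumptions (the slide does not state the threshold), and the difference is taken as an absolute value, which matches the 9591 shown between 0x08892 and 0x0631B.

```python
from typing import List


def group_pages(pfns: List[int], threshold: int) -> List[List[int]]:
    """Group consecutive page table entries whose PFNs are close together.

    Step 1: compute the PFN difference to the previous entry.
    Step 2: if the difference is below the threshold, add the page to the
            current group; otherwise start a new group.
    """
    groups: List[List[int]] = []
    for pfn in pfns:
        if groups and abs(pfn - groups[-1][-1]) < threshold:
            groups[-1].append(pfn)
        else:
            groups.append([pfn])
    return groups


# PFNs from the slide, in page table order. The threshold is hypothetical.
pfns = [0x000C0, 0x000C1, 0x000C2, 0x000C3,
        0x02041, 0x02042, 0x02045,
        0x08892, 0x0631B]
for group in group_pages(pfns, threshold=8):
    label = "group" if len(group) > 1 else "not grouped"
    print(label, [hex(p) for p in group])
```

With this threshold the first four pages form one group, the next three form a second group, and the last two pages stay on their own, as marked with X on the slide.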
3. GPM System

                                                 4-3. Hot/Cold Criterion

 • Hot/cold decision for migration
      – Average of hotness in page group
      – Weighted value to exploit temporal locality

   Example page table (v : history bit is 1)
      Group   4th     3rd     2nd     1st     ① Hotness   ② Avg.      ③ Decision
              (w=1)   (w=2)   (w=4)   (w=8)    (page)      hotness
      1       v                       v         9
      1                       v       v        12
      1               v               v        10          11.25       Hot
      1               v       v       v        14
      2       v                                 1
      2       v                                 1           1.33       Cold
      2               v                         2
      3               v       v                 6           6          Warm
      4       v                       v         9           9          Warm

   ④ Criterion
      Characteristic    Criterion
      Hot               Avg. >= Hot threshold
      Warm              Hot threshold > Avg. > Cold threshold
      Cold              Cold threshold >= Avg.
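The weighted hotness and the group decision can be checked with a short sketch that reproduces the slide's numbers. The weights (8, 4, 2, 1 for the 1st to 4th history bits) and the three criteria come from the slide; the threshold values (hot = 10, cold = 2) and the names `page_hotness` and `classify_group` are assumptions chosen only so that the example yields the decisions shown.

```python
from typing import Sequence, Tuple

# Weights for the (1st, 2nd, 3rd, 4th) history bits, as given on the slide.
WEIGHTS = (8, 4, 2, 1)


def page_hotness(history: Sequence[int]) -> int:
    """Weighted sum of one page's history bits, ordered (1st, 2nd, 3rd, 4th)."""
    return sum(w * bit for w, bit in zip(WEIGHTS, history))


def classify_group(histories: Sequence[Sequence[int]],
                   hot_threshold: float,
                   cold_threshold: float) -> Tuple[float, str]:
    """Average the page hotness over a group and apply the slide's criterion."""
    avg = sum(page_hotness(h) for h in histories) / len(histories)
    if avg >= hot_threshold:
        return avg, "Hot"
    if avg <= cold_threshold:
        return avg, "Cold"
    return avg, "Warm"


# History bits from the slide, ordered (1st, 2nd, 3rd, 4th).
group1 = [(1, 0, 0, 1), (1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 0)]
group2 = [(0, 0, 0, 1), (0, 0, 0, 1), (0, 0, 1, 0)]
group3 = [(0, 1, 1, 0)]

# Hypothetical thresholds; the slide does not state the actual values.
for group in (group1, group2, group3):
    avg, decision = classify_group(group, hot_threshold=10, cold_threshold=2)
    print(round(avg, 2), decision)
```

Giving the most recent period the largest weight means that a page written in the last period outranks a page written only in older periods, which is how the weighted value exploits temporal locality.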
3. GPM System

                                     4-4. Migration Policy

 • First allocation from DRAM
      – A newly allocated page is used immediately [7]
 • Kernel memory is allocated to DRAM
      – Kernel memory is frequently used [7]
 • Hot groups to DRAM, and cold groups to PRAM
      – To exploit the advantages of both DRAM and PRAM
 • Warm groups do not migrate
      – To reduce migration

   Example (same page table as in 4-3)
      Avg. hotness    Decision    Action
      11.25           Hot         If the page group is in PRAM, migrate to DRAM
      1.33            Cold        If the page group is in DRAM, migrate to PRAM
      6               Warm        Do not migrate
      9               Warm        Do not migrate

                                     [7] Page Placement in Hybrid Memory Systems, ICS 2011
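The migration rules above amount to a small decision table. The sketch below encodes only what the slide states; the function name `migration_action` and the string labels are hypothetical, and the initial-allocation rules (first allocation and kernel memory in DRAM) are left out.

```python
def migration_action(decision: str, location: str) -> str:
    """Map a group's hot/cold decision and current location to an action.

    decision is "Hot", "Warm", or "Cold"; location is "DRAM" or "PRAM".
    """
    if decision == "Hot" and location == "PRAM":
        return "migrate to DRAM"
    if decision == "Cold" and location == "DRAM":
        return "migrate to PRAM"
    # Warm groups, and groups already in the right memory, stay in place.
    return "do not migrate"


for decision in ("Hot", "Warm", "Cold"):
    for location in ("DRAM", "PRAM"):
        print(decision, location, "->", migration_action(decision, location))
```

Only two of the six combinations trigger a migration, which is what keeps migration traffic low for warm groups.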
4. Evaluation

                           Experiment Environment

 • Implementation
      – Implemented in the Linux kernel as a kernel thread (daemon)
      – Software patch only, without additional hardware
 • Specification
      – CPU : Intel Dual Core 2.4GHz
      – Memory : 1GB DRAM, 1GB PRAM
      – OS : Linux 2.6.31 kernel
 • Power & timing calculation
      – Memory accesses are counted using the Pin tool [8]

                [8] Pin: building customized program analysis tools with dynamic instrumentation, PLDI 2005
4. Evaluation

                            Workload and Parameters
• Workload ( SPEC 2000 + MaaS [9] )
       Workload           Working Set Size
       176.gcc.ref         97MB
       188.ammp.ref        40MB
       172.mgrid.ref       69MB
       183.equake.ref      64MB
       181.mcf.ref        165MB
       164.gzip.ref       193MB
       173.applu.ref      195MB
       MaaS               480MB

• DRAM & PRAM characteristics [3]
       Parameter                DRAM       PRAM
       Power characteristics (per GB)
       Row read power           210mW      78mW
       Row write power          195mW      773mW
       Active power             75mW       25mW
       Standby power            90mW       45mW
       Refresh power            4mW        0mW
       Timing characteristics
       Initial row read         15ns       28ns
       Initial row write        22ns       150ns
       Same row read/write      15ns       15ns

• Compared systems
      – DRAM only (2GB), PRAM only (2GB)
      – Hybrid memory (DRAM 1GB, PRAM 1GB)
           • PDRAM [10], Second chance algorithm [11], Adaptive Page Grouping (APG)

                     [9] Multimedia Matching as a Service: Technical Challenges and Blueprints, ITC-CSCC 2011
                     [10] PDRAM : A Hybrid PRAM and DRAM Main Memory System, DAC 2009
                     [11] Power-Aware Memory Management for Hybrid Main Memory, ICNIT 2011
4. Evaluation

                              Energy Consumption of Memory System

   [Figure: Normalized energy consumption in the memory system (normalized to DRAM)
    for DRAM Only, PRAM Only, PDRAM, S.chance, and APG on mcf, gzip, applu, gcc,
    equake, ammp, mgrid, maas, and the average]

 • Energy was reduced by 24% compared to DRAM only
 • Energy was reduced by 8.5% compared to power-aware management [11]
4. Evaluation

                        Reason 1 : Reduction of Write Count in PRAM

   [Figure: Write count per PRAM page for PDRAM, S.chance, and APG on mcf, gzip,
    applu, gcc, equake, ammp, mgrid, maas, and the average]

 • Write count was effectively reduced by Adaptive Page Grouping
4. Evaluation

                        Reason 2 : Reduction of Migration Energy

   [Figure: Total migration energy (J), split into PRAM->DRAM and DRAM->PRAM,
    for S.chance and APG]

 • On average, migration energy was reduced by 76% compared to
   power-aware management [11]
4. Evaluation

                                   Performance of Memory System

   [Figure: Normalized access time (normalized to DRAM) for DRAM Only, PRAM Only,
    PDRAM, S.chance, and APG on mcf, gzip, applu, gcc, equake, ammp, mgrid, maas,
    and the average]

 • On average, access time increased by only 8% compared to DRAM only
 • Access time was reduced by 38% compared to power-aware management [11]
5. Conclusion

                             Overhead Analysis

 • Overhead factors
      – Algorithm overhead
          • Page table scanning
          • Bit shifting
          • Adaptive page grouping
      – Migration overhead
 • Overhead result (including all 4 factors)
       Workload    Overhead (%)
       mcf         0.63
       gzip        0.52
       applu       0.41
       gcc         0.88
       equake      0.32
       ammp        0.15
       mgrid       0.22
       MaaS        1.23
       average     0.55
 • On average, the overhead is less than 1%
5. Conclusion

                                        Conclusion
 • Designed and implemented Adaptive Page Grouping
     – Dirty bit shifting
         • Stores access information per page without additional space
         • Weights dirty bits to exploit temporal locality
     – Adaptive page grouping
         • Exploits spatial locality across pages
         • Reduces migration operations between DRAM and PRAM
     – Grouped page migration
         • Effectively reduces write operations on PRAM

 • Power saving
     – 24% reduction compared to DRAM only system
     – 8.5% reduction compared to power-aware management [11]
     – With less than 1% performance degradation

                                  Granularity     Locality             Level     Additional H/W
     Adaptive Page Grouping       Page Group      Temporal & Spatial   OS        none

                     [11] Power-Aware Memory Management for Hybrid Main Memory, ICNIT 2011
                                                                                          21 / 22
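
The three mechanisms summarized on this slide can be sketched in a few lines of user-space Python. This is only an illustration under stated assumptions, not the kernel implementation: the 4-bit history with weights 8/4/2/1 follows the earlier slides, while the threshold values, the shift direction (newest bit in the weight-8 slot), and all names (`Page`, `shift_history`, `group_pages`, `classify`) are assumptions made for the example.

```python
from dataclasses import dataclass

HISTORY_BITS = 4           # four dirty-bit history slots per page, as on the slides
GROUP_DIFF_THRESHOLD = 4   # assumption: the slides do not give the PFN threshold
HOT_THRESHOLD = 10.0       # assumption: chosen to be consistent with the slide example
COLD_THRESHOLD = 2.0       # assumption: chosen to be consistent with the slide example


@dataclass
class Page:
    pfn: int          # physical page frame number
    history: int = 0  # bit 3 = newest interval (weight 8), bit 0 = oldest (weight 1)


def shift_history(history: int, dirty: bool) -> int:
    """Age the history by one interval and store the newest dirty bit on top."""
    return (history >> 1) | (int(dirty) << (HISTORY_BITS - 1))


def hotness(page: Page) -> int:
    """Weighted sum of dirty bits; with weights 8/4/2/1 this is the history value."""
    return page.history


def group_pages(pages):
    """Keep neighbouring entries in one group while the PFN difference stays small."""
    groups = []
    for page in pages:
        if groups and abs(page.pfn - groups[-1][-1].pfn) < GROUP_DIFF_THRESHOLD:
            groups[-1].append(page)
        else:
            groups.append([page])
    return groups


def classify(group) -> str:
    """Hot groups belong in DRAM, cold groups in PRAM, warm groups do not migrate."""
    average = sum(hotness(p) for p in group) / len(group)
    if average >= HOT_THRESHOLD:
        return "hot"
    if average <= COLD_THRESHOLD:
        return "cold"
    return "warm"


# PFNs and hotness values taken from the slide examples.
pfns = [0x000C0, 0x000C1, 0x000C2, 0x000C3,
        0x02041, 0x02042, 0x02045, 0x08892, 0x0631B]
groups = group_pages([Page(pfn) for pfn in pfns])
print([len(g) for g in groups])

hot_group = [Page(0, h) for h in (9, 12, 10, 14)]   # average 11.25
cold_group = [Page(0, h) for h in (1, 1, 2)]        # average about 1.33
print(classify(hot_group), classify(cold_group))
```

In this sketch a page with no close neighbour ends up as a group of one, whereas the slides mark such pages as ungrouped. Because whole groups are classified by their average hotness, a single migration decision covers every page in the group, which is where the reduction in migration operations comes from.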
5. Conclusion

                             Future Work

 • Future work

     – Evaluation on mobile systems
         • Power savings extend battery life
     – Combining with wear-leveling schemes
         • Word-level shifting
         • Line shifting
         • Page swapping




                                                        22 / 22
5. Conclusion

                                    References

 •   [1] Harnessing Green IT: Principles and Practices, IEEE IT Professional,
     January–February 2008
 •   [2] L. Barroso and U. Hölzle, The Datacenter as a Computer: An Introduction to the
     Design of Warehouse-Scale Machines, Morgan & Claypool, 2009
 •   [3] Architecting Phase Change Memory as a Scalable DRAM Alternative, ISCA 2009
 •   [4] Drift-Tolerant Multilevel Phase-Change Memory, IMW 2011
 •   [5] Memory Technology and Solutions Roadmap, SAMSUNG
 •   [6] Non-Volatile Memory: Emerging Technologies and Their Impacts on Memory
     Systems, 2010
 •   [7] Page Placement in Hybrid Memory Systems, ICS 2011
 •   [8] Pin: Building Customized Program Analysis Tools with Dynamic
     Instrumentation, PLDI 2005
 •   [9] Multimedia Matching as a Service: Technical Challenges and Blueprints,
     ITC-CSCC 2011
 •   [10] PDRAM: A Hybrid PRAM and DRAM Main Memory System, DAC 2009
 •   [11] Power-Aware Memory Management for Hybrid Main Memory, ICNIT 2011




                                                                                    23 / 22
Thank you

Racs2012 djshin
