A SURVEY ON EXPLORING
MEMORY OPTIMIZATIONS IN
SMARTPHONES
-KARTHIKEYAN RAMKUMAR
ABSTRACT
•

Many memory optimizations have been explored for computer systems and in this survey we explore
their applicability to smartphone hardware.

•

This survey describes memory technologies such as Dynamic RAM (DRAM) and Mobile RAM (M-RAM), energy
management schemes such as Power-Aware Virtual Memory (PAVM), and on-demand mechanisms such as
Immediate Power Down (IPD) and Immediate Self Refresh (ISR).

•

Newly emerging technologies such as Phase Change Memory (PCM) and a hybrid approach consisting
of both Phase Change Memory and Mobile RAM are also surveyed.
INTRODUCTION
•

Additional features and an improved user experience, provided by fast processors, copious
memory, resource-demanding software, and power-hungry hardware, make energy a precious resource.
With hardware continuously improving in performance and price, vendors are able to build systems
with higher-performance, higher-power components to meet users' ever-increasing demands
and compete for customers.

•

However, this results in systems that are over-provisioned with components that provide more
capacity, more throughput, and more processing power than needed for the typical workload, and as a
result, it is becoming more difficult to maintain long battery life in these devices.

•

While a smartphone contains many energy-hungry components, such as the CPU, display, and multiple
radios, the energy consumed by the memory subsystem has been given limited consideration.

•

Therefore, we explore the efficiency of the existing energy management mechanisms on smartphones.
MEMORY TECHNOLOGIES
The memory technologies discussed in this paper include Dynamic RAM (DRAM), which is the most
widely used memory technology in mobile devices and is otherwise referred to as Mobile RAM (M-RAM).
A recent contender for main memory technology is Phase Change Memory (PCM), a type
of non-volatile random-access memory that eliminates idle power due to its non-volatile nature but offers
lower performance than M-RAM. Also described is Power-Aware Virtual Memory (PAVM), an energy
management technique that reduces the energy consumed by the memory in response to workloads
becoming increasingly data-centric. This section describes the various memory technologies and how
they are optimized in smartphones to give better performance.
1. DYNAMIC RAM (DRAM)
•

Dynamic random-access memory (DRAM) is a type of random-access memory that stores each bit of
data in a separate capacitor within an integrated circuit.

•

As applications become increasingly data-centric, we expect main memory to remain a
significant energy consumer, because achieving good overall system performance will increasingly
depend on having higher-performance and larger-capacity DRAM.

•

We use the terminology of Double-Data Rate (DDR) memory simply because DDR is becoming the
most common type of memory used in today's PC and server systems. The approach is not limited to
DDR; it can also be applied to other memory types, e.g., SDR and RDRAM.
1.1 MEMORY TRAFFIC RESHAPING

•

To reshape the memory traffic for our benefit, we must make memory access less random and more
controllable.

•

We use a 4-rank system wherein memory requests are likely to be randomly distributed among the 4
ranks and this creates a large number of small and medium sized idle periods.

•

To elongate idle periods, the concepts of hot and cold ranks are introduced.

•

Idle periods are longer on cold ranks, so Self Refresh can be used more often there, and the most
valuable power-saving opportunities are created on cold ranks.

In the experiments conducted, the average interarrival time was elongated by almost 2 orders of
magnitude on cold ranks.

Figure: An example showing that if memory traffic is left unshaped, power management cannot take full
advantage of deeper power-saving states since most idle periods are too short.
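
The hot/cold rank idea can be illustrated with a minimal sketch. It is not the implementation evaluated in the experiments: the trace, the page-to-rank mapping, and the helper names (pick_hot_pages, rank_of, mean_idle_gap) are assumptions made up for illustration. It only shows that concentrating the most frequently accessed pages on one hot rank lengthens the idle gaps seen by a cold rank.

```python
from collections import Counter

NUM_RANKS = 4  # matches the 4-rank system described above


def pick_hot_pages(accesses, fraction):
    """Return the most frequently accessed `fraction` of distinct pages."""
    counts = Counter(page for _, page in accesses)
    n_hot = max(1, int(len(counts) * fraction))
    return {page for page, _ in counts.most_common(n_hot)}


def rank_of(page, hot_pages=None):
    """Map a page to a rank (hypothetical mapping, for illustration only).

    Unshaped (hot_pages is None): pages are interleaved over all ranks.
    Reshaped: hot pages go to rank 0, all other pages to ranks 1..3.
    """
    if hot_pages is None:
        return page % NUM_RANKS
    if page in hot_pages:
        return 0
    return 1 + page % (NUM_RANKS - 1)


def mean_idle_gap(accesses, rank, hot_pages=None):
    """Average time between consecutive accesses that hit `rank`."""
    times = [t for t, page in accesses if rank_of(page, hot_pages) == rank]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps) if gaps else float("inf")


# Hypothetical trace of (time, page): four pages are touched constantly,
# while one hundred other pages are touched once each.
trace = [(t, t % 4 if t % 10 else 100 + t // 10) for t in range(1000)]

hot = pick_hot_pages(trace, fraction=0.05)  # migrate 5% of pages
unshaped_gap = mean_idle_gap(trace, rank=1)
reshaped_gap = mean_idle_gap(trace, rank=1, hot_pages=hot)
print(f"rank 1 mean idle gap: unshaped={unshaped_gap:.2f}, reshaped={reshaped_gap:.2f}")
```

In this toy trace the reshaped cold rank sees much longer gaps than the unshaped one; the size of the effect on real workloads depends on the access pattern and is not implied by this sketch.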
1.2 EFFECT OF RESHAPING ON MEMORY TRAFFIC
•

To study the effect of memory traffic reshaping in more detail, we compare the results of migrating
1%, 5%, and 10% of pages.

•

Migrating only 1% of pages gives only limited benefits in power reduction. On the other
hand, migrating 10% of pages does not give any additional energy benefit beyond that of migrating
5%. In addition, it incurs a larger performance penalty because more pages have to be migrated.

•

Therefore, migrating 5% of pages gives the best result for the workloads we ran.

•

Solving the problem at its root calls for an alternative main memory design that uses
high-performance, highly parallel memory on hot ranks and low-performance, low-power
memory on cold ranks.

Figure: Effects of actively reshaping memory traffic by migrating 1%, 5%, and 10% of pages for the
low memory-intensive workload (above) and the high memory-intensive workload (below).

As the figure shows, migrating 1% rather than 5% of pages does not reduce the performance
penalty by much.

The results show that an additional 35.63-38.87% of energy can be saved by complementing
existing power management techniques with this technique.
2. PHASE CHANGE MEMORY (PCM)
•

Phase change memory is a type of non-volatile random-access memory and provides a non-volatile
storage mechanism amenable to process scaling.

•

However, to serve as a DRAM alternative, PCM must be architected so that it is feasible as main
memory within general-purpose systems.

•

Drawing on a rigorous survey of PCM device and circuit prototypes published within the last five
years, and comparing against modern DRAM memory subsystems, we examine the following: Buffer
Organization and Partial Writes.
2.1 BUFFER ORGANIZATION
•

We examine PCM buffer organizations that satisfy DRAM imposed area constraints.

•

PCM buffer reorganizations reduce application execution time from 1.6x to 1.2x and memory energy
from 2.2x to 1.0x, relative to DRAM-based systems.

Evaluation:
On optimizing average delay and energy across the workloads, we find four 512B-wide buffers most
effective. Executing on effectively buffered PCM, more than half the benchmarks achieve within 5
percent of their DRAM performance. Although each PCM array write requires 43.1x more energy than a
DRAM array write, these energy costs are mitigated by narrow buffer widths and additional rows, which
reduce the granularity of buffer evictions and expose opportunities for write coalescing, respectively.
2.2 PARTIAL WRITES
•

Partial writes, which track data modifications and write only modified cache lines or words to the PCM
array, are utilized. Using an endurance model to estimate lifetime, we expect write coalescing and
partial writes to deliver an average memory module lifetime of 5.6 years.

•

Scaling improves PCM endurance, extending lifetimes by four orders of magnitude at 32nm.

Evaluation:

•

In a baseline architecture with a single 2048B-wide buffer, average module lifetime is approximately
525 hours.

•

For our memory intensive workloads, we observe 32.8 percent memory bus utilization. Scaling by
application-specific write intensity, we find 6.9 percent of memory bus cycles are utilized by writes.

•

On average, the four 512B-wide buffers coalesce 38.9 percent of writes emerging from the memory
bus, which is 47.0 percent utilized. Writes alone utilize 11.0 percent of the bus. Buffers use partial
writes so that only a fraction of the buffer’s bits is written to the array.
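
A minimal sketch of how partial writes and write coalescing interact in one buffer is given here. It is an illustration under assumptions, not the surveyed design: the class name PartialWriteBuffer, the 4-byte word size, and the per-word dirty tracking are hypothetical; only the 512B buffer width comes from the evaluation above.

```python
class PartialWriteBuffer:
    """One PCM row buffer that tracks which words were modified."""

    def __init__(self, width_bytes=512, word_bytes=4):
        self.width_bytes = width_bytes  # 512B-wide buffer, as in the evaluation
        self.word_bytes = word_bytes    # hypothetical tracking granularity
        self.dirty_words = set()        # indices of modified words
        self.stores = 0                 # stores received from the memory bus

    def store(self, offset):
        """Record a store to the byte at `offset` inside the buffer."""
        if not 0 <= offset < self.width_bytes:
            raise ValueError("offset outside buffer")
        # Repeated stores to the same word coalesce into one dirty entry.
        self.dirty_words.add(offset // self.word_bytes)
        self.stores += 1

    def evict(self):
        """Write back only the dirty words; return bytes written to the array."""
        written = len(self.dirty_words) * self.word_bytes
        self.dirty_words.clear()
        return written


buf = PartialWriteBuffer()
for offset in (0, 1, 4, 0):  # four stores that touch only two distinct words
    buf.store(offset)
bytes_written = buf.evict()
print(f"{buf.stores} stores -> {bytes_written} of {buf.width_bytes} bytes written to the array")
```

Without partial writes the whole 512B buffer would be written to the array on eviction; with dirty tracking only the modified words are, which is what reduces write energy and wear in this sketch.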

PHASE CHANGE MEMORY (PCM)
•

Collectively, these results indicate PCM is a viable DRAM alternative, with architectural solutions
providing competitive performance, comparable energy, and feasible lifetimes.

•

When using PCM as an alternative to M-RAM, we need to note that it consumes more energy to
perform I/O operations, particularly write operations, since the cell state has to be changed.

•

However, PCM consumes significantly less idle power than M-RAM, especially in the low-power state
where the power consumption is reduced to 0. Therefore, we should leverage the tradeoffs between
performance and energy efficiency to apply PCM technology in mobile devices.
3. CHARACTERIZING MOBILE SOFTWARE
The applications selected for this survey are shown in the table.

This table lists 12 popular Android applications selected from the Android market along with their trace
statistics. A T-Mobile G1 smartphone is used to collect the application traces. Each trace consists of
task intervals with the task execution length and the number of memory I/Os.
•

Compared to the CPU speed, human interactions are extremely slow, such that a mobile system is idle and waiting
for user input for the majority of time.

•

A prior study has shown that the human perception threshold is between 50ms and 100ms and that any event
shorter than the perception threshold appears instantaneous to the user.

•

Completing task execution earlier than the perception threshold is meaningless since the user will not notice the
difference and cannot initiate new tasks any sooner. This observation is the key to enabling energy
optimizations without impacting observed application performance.

•

The majority of tasks are very short as more than 90% of all tasks complete within 10ms. Moreover, 95% of all
tasks are shorter than 50ms, indicating that these tasks can be extended to the 50ms perception threshold deadline
without any performance penalty. Similarly, for the remaining 5% of long tasks, any additional extension less than
50ms will not be noticed by the user, avoiding performance degradation.
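
The slack that the perception threshold leaves for each task can be computed as in the sketch below. The function name allowed_extension_ms and the sample task lengths are hypothetical; the 50ms value is the lower end of the 50-100ms range cited above.

```python
PERCEPTION_THRESHOLD_MS = 50.0  # lower bound of the 50-100ms range cited above


def allowed_extension_ms(task_ms, threshold_ms=PERCEPTION_THRESHOLD_MS):
    """Upper bound on how far a task can be stretched without the user noticing.

    Short tasks can be stretched up to the threshold deadline. Long tasks can
    absorb an extra delay that stays below the threshold, so for them the
    returned value is an exclusive bound.
    """
    if task_ms < 0:
        raise ValueError("task length must be non-negative")
    if task_ms < threshold_ms:
        return threshold_ms - task_ms
    return threshold_ms


tasks_ms = [2.0, 8.0, 45.0, 120.0]  # hypothetical task lengths
slack_ms = [allowed_extension_ms(t) for t in tasks_ms]
for task, slack in zip(tasks_ms, slack_ms):
    print(f"task of {task}ms can be extended by up to {slack}ms")
```

An energy manager could spend this slack on slower, lower-power memory states, which is the opportunity the later sections exploit.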
4. MECHANISM COMPARISON
•

M-RAM needs to refresh the storage cells regularly for data retention, therefore consuming non-negligible
power even in the low-power state. PCM is able to completely eliminate idle power due to its
non-volatile nature. We will evaluate the effectiveness of various energy management mechanisms on
M-RAM and PCM under the same execution environment.

•

For this survey, a simulator that models the system configuration of a T-Mobile G1 smartphone is used.
System Configuration of a T-Mobile G1

•

The memory subsystem consists of a memory controller and three 64MB ranks (192MB in total), for
either M-RAM or PCM. The simulator is fed the traces, determines the memory power state, and
conducts task execution under the current CPU and memory state. The memory controller conducts
memory I/O operations and executes power state transitions for each rank based on the energy
management mechanism.
4.1 POWER AWARE VIRTUAL MEMORY (PAVM)
•

In mobile applications when the smartphone is waiting for user input, idle periods are common and
therefore powering down the memory devices during this period can help in reducing the energy
consumption.

•

Power-Aware Virtual Memory (PAVM) is a simple and efficient way to provide energy management. It
keeps the memory devices occupied by the currently running process in the active state while keeping
all other memory devices in a low-power state to save energy. Memory devices used by the newly
scheduled process are powered up during the context switch time to minimize the delays exposed to the
user due to power state transitions.
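
The PAVM policy described above can be sketched in a few lines. This is a simplified illustration, not the actual PAVM code: the class name PavmController, the state labels, and the assumption that the set of ranks used by each process is known at context-switch time are all hypothetical.

```python
class PavmController:
    """Keeps only the ranks of the running process in the active state."""

    def __init__(self, num_ranks=3):  # three ranks, as in the simulated G1
        self.state = ["low_power"] * num_ranks

    def context_switch(self, ranks_of_next_process):
        """Power up the ranks the next process uses; power down all others."""
        for rank in range(len(self.state)):
            if rank in ranks_of_next_process:
                self.state[rank] = "active"
            else:
                self.state[rank] = "low_power"
        return list(self.state)


controller = PavmController()
after_first = controller.context_switch({0, 2})  # process A lives on ranks 0 and 2
after_second = controller.context_switch({1})    # process B lives on rank 1
print(after_first)
print(after_second)
```

Doing the transitions during the context switch is what hides the wake-up latency from the user in this scheme.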
Figure: Memory energy consumption with a standard system (ON) and the PAVM mechanism. The left two
bars for each application show the energy of M-RAM and PCM in the standard system, while the right
two bars show the energy for the PAVM mechanism.
5. ON-DEMAND MECHANISMS
•

Despite PAVM’s benefits to the standard system, it fails to address the energy efficiency of the active
rank accessed during the process execution.

•

Immediate Power Down (IPD) mechanism and Immediate Self Refresh (ISR) mechanism have been
proposed for RAM to provide on-demand power state transitions and improve energy efficiency of
active ranks.

•

As soon as an I/O request arrives at the memory controller, the rank to be accessed is transitioned to the
PRE state, and it is transitioned back to a low-power state immediately after the I/O completes. Each
energy bar is normalized to M-RAM with the PAVM mechanism.
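
The on-demand behaviour described above can be sketched as a small state machine. The class name OnDemandRank, the state labels, and the I/O time are hypothetical, and the sketch assumes the full wake-up latency is exposed on every access; it only models the order of transitions, not energy.

```python
class OnDemandRank:
    """A rank that wakes up only for the duration of each I/O request."""

    def __init__(self, low_power_state, wake_latency_ns):
        # low_power_state: e.g. "power_down" for IPD or "self_refresh" for ISR
        self.low_power_state = low_power_state
        self.wake_latency_ns = wake_latency_ns
        self.history = [low_power_state]  # sequence of states visited
        self.exposed_delay_ns = 0.0       # total wake-up latency paid so far

    def access(self, io_time_ns):
        """Serve one I/O request and return its total service time in ns."""
        self.history.append("PRE")        # wake up when the request arrives
        self.exposed_delay_ns += self.wake_latency_ns
        # ... the I/O itself is performed here while the rank is awake ...
        self.history.append(self.low_power_state)  # go back down right away
        return self.wake_latency_ns + io_time_ns


# 141.5ns is the transition latency quoted in this survey for ISR and PCM OFF;
# the 50ns I/O time is a made-up placeholder.
isr_rank = OnDemandRank("self_refresh", wake_latency_ns=141.5)
service_ns = isr_rank.access(io_time_ns=50.0)
print(service_ns, isr_rank.history)
```

The same class can stand in for IPD by passing a different low-power state and its (shorter) wake-up latency, which this survey does not quantify.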
Figure: Memory energy consumption for on-demand mechanisms, normalized to the PAVM mechanism on
M-RAM.

•

The first bar shows the energy consumption of PCM with the PAVM mechanism and the other bars show the energy
of on-demand mechanisms.

•

The two on-demand mechanisms outperform the PAVM mechanism on PCM: PCM's energy savings from idle
periods cannot offset its inferior I/O efficiency, except for the lightly loaded applications Amazon, Music
and Twidroid.

•

The PCM OFF mechanism completely eliminates the active idle energy, resulting in 44% energy reduction over the
PAVM mechanism on PCM. Compared to the IPD and ISR mechanisms on M-RAM, the PCM OFF mechanism
offers 18% and 22% energy savings respectively.
Figure: The distribution of extended tasks that expose delays for on-demand mechanisms.

• We can observe that the IPD mechanism achieves the best performance with negligible delays exposed.
The ISR and PCM OFF mechanisms, on the other hand, incur more evident degradation due to the
141.5ns transition latency.
• If energy is the only concern, the novel PCM technology with an on-demand mechanism surpasses the
traditional M-RAM. However, when performance is taken into account as well, M-RAM still has the chance
to beat PCM.
• We therefore need an approach that balances energy and performance more efficiently than any standalone
memory technology.
6. HYBRID MEMORY ARCHITECTURE
•

From the previous analysis, we can see that PCM is superior to M-RAM in idle power
consumption, while M-RAM outperforms PCM with faster I/O and lower I/O energy. Therefore, a
hybrid memory consisting of M-RAM and PCM can improve both energy efficiency and
performance.

•

When an application is invoked and its image does not reside in M-RAM, it is loaded into M-RAM
either from secondary storage or PCM, and the corresponding process identifier is put at the head of the
LRU list.

•

When an application is closed, its memory image will stay in M-RAM until it is swapped out.

•

The hybrid approach preserves more than 99% of IPD’s performance and achieves the best energy
efficiency among all mechanisms while maintaining almost full memory performance.
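
The LRU-based placement described above can be sketched as follows. This is an illustration under assumptions rather than the surveyed implementation: the class name HybridMemory, the capacity in MB, and the choice to swap evicted images out to PCM are hypothetical. The end of the OrderedDict plays the role of the head of the LRU list.

```python
from collections import OrderedDict


class HybridMemory:
    """M-RAM holds recently used application images; others live elsewhere."""

    def __init__(self, mram_capacity_mb):
        self.capacity = mram_capacity_mb
        self.mram = OrderedDict()  # pid -> image size; last entry = most recent
        self.pcm = {}              # pid -> image size (assumed swap target)

    def invoke(self, pid, size_mb):
        """Bring an application's image into M-RAM; report where it came from."""
        if size_mb > self.capacity:
            raise ValueError("image larger than M-RAM")
        if pid in self.mram:
            self.mram.move_to_end(pid)  # refresh its LRU position
            return "mram"
        source = "pcm" if pid in self.pcm else "storage"
        self.pcm.pop(pid, None)
        # Swap out least recently used images until the new one fits.
        while sum(self.mram.values()) + size_mb > self.capacity:
            victim, victim_size = self.mram.popitem(last=False)
            self.pcm[victim] = victim_size
        self.mram[pid] = size_mb
        return source

    def close(self, pid):
        """Closing an application leaves its image in M-RAM until swapped out."""
        return pid in self.mram


memory = HybridMemory(mram_capacity_mb=100)
sources = [
    memory.invoke("browser", 60),  # first launch
    memory.invoke("music", 60),    # does not fit next to browser
    memory.invoke("browser", 60),  # comes back from PCM
    memory.invoke("browser", 60),  # already resident
]
print(sources, list(memory.mram), list(memory.pcm))
```

Keeping closed applications resident means a quick relaunch is served from M-RAM at full speed, while rarely used images sit in PCM, where idle power is lowest.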
CONCLUSIONS
•

The PAVM mechanism saves more than 90% energy as compared to the standard system with no
energy management.

•

Additional energy savings are provided by the on-demand mechanisms which offer around 40% more
savings compared to the PAVM mechanism, for both M-RAM and PCM.

•

Energy efficiency can be improved further by a hybrid approach consisting of mixed memory
technologies and mechanisms. This approach provides energy savings of 98% with negligible
performance overheads as compared to the standard system.

Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 

A survey on exploring memory optimizations in smartphones

  • 1. A SURVEY ON EXPLORING MEMORY OPTIMIZATIONS IN SMARTPHONES -KARTHIKEYAN RAMKUMAR
  • 2. ABSTRACT • Many memory optimizations have been explored for computer systems and in this survey we explore their applicability to smartphone hardware. • Memory technologies such as Mobile RAM (M-RAM), Power Aware Virtual Memory (PAVM), Dynamic RAM (DRAM) and On-demand mechanisms such as Immediate Power Down (IPD) mechanism and Immediate Self Refresh (ISR) mechanism are described in this survey. • Newly emerging technologies such as Phase Change Memory (PCM) and a hybrid approach consisting of both Phase Change Memory and Mobile RAM are also surveyed.
  • 3. INTRODUCTION • Additional features and an improved user experience, provided by fast processors, copious memory, resource-demanding software, and power-hungry hardware, make energy a precious resource. With hardware continuously improving in performance and price, vendors are able to build systems with higher-performance, higher-power components in an effort to meet users' ever-increasing demands and compete for customers. • However, this results in systems that are over-provisioned with components that provide more capacity, more throughput, and more processing power than the typical workload needs, and as a result it is becoming more difficult to maintain long battery life in these devices. • While a smartphone contains many energy-hungry components, such as the CPU, display, and multiple radios, the energy consumed by the memory subsystem has been given limited consideration. • Therefore, we explore the efficiency of the existing energy management mechanisms on smartphones.
  • 4. MEMORY TECHNOLOGIES The memory technologies discussed in this paper include Dynamic RAM (DRAM), which is the most widely used memory technology in mobile devices and is otherwise referred to as Mobile RAM (M-RAM). A recent contender for main memory technology is Phase Change Memory (PCM), a type of non-volatile random-access memory that eliminates idle power due to its non-volatile nature but offers lower performance than M-RAM. Another technology described is Power Aware Virtual Memory (PAVM), which reduces the energy consumed by memory as workloads become increasingly data-centric. This section describes the various memory technologies and how they are optimized in smartphones to give better performance.
  • 5. 1. DYNAMIC RAM (DRAM) • Dynamic random-access memory (DRAM) is a type of random-access memory that stores each bit of data in a separate capacitor within an integrated circuit. • As applications become increasingly data-centric, we expect main memory to remain a significant energy consumer, because good overall system performance will increasingly depend on higher-performance and larger-capacity DRAM. • We use the terminology of Double-Data Rate (DDR) memory simply because DDR is becoming the most common type of memory used in today's PC and server systems. The approach is not limited to DDR; it can also be applied to other memory types, e.g., SDR and RDRAM.
  • 6. 1.1 MEMORY TRAFFIC RESHAPING • To reshape the memory traffic for our benefit, we must make memory accesses less random and more controllable. • We use a 4-rank system in which memory requests are likely to be randomly distributed among the 4 ranks, which creates a large number of small and medium-sized idle periods. • To elongate idle periods, the concepts of hot and cold ranks are introduced. • Cold ranks offer the more valuable power-saving opportunities, since Self Refresh can be utilized more often on them.
  • 7. 1.1 MEMORY TRAFFIC RESHAPING In the experiments conducted, the average interarrival time was elongated by almost 2 orders of magnitude on cold ranks. An example shows that if memory traffic is left unshaped, power management cannot take full advantage of deeper power-saving states, since most idle periods are too short.
  • 8. 1.2 EFFECT OF RESHAPING ON MEMORY TRAFFIC • To study the effect of memory traffic reshaping in more detail, we compare the results of migrating 1%, 5%, and 10% of pages. • Migrating only 1% of pages gives only limited benefits in power reduction. On the other hand, migrating 10% of pages does not give any additional energy benefit beyond that of migrating 5%, and it also suffers a larger performance penalty because more pages have to be migrated. • Therefore, migrating 5% of pages gives the best result for the workloads we ran.
  • 9. 1.2 EFFECT OF RESHAPING ON MEMORY TRAFFIC • Solving the problem at its root calls for an alternative main memory design that uses high-performance, highly parallel memory on hot ranks and low-performance, low-power memory on cold ranks. • [Figure: effects of actively reshaping memory traffic by migrating 1%, 5%, and 10% of pages for the low memory-intensive workload (above) and the high memory-intensive workload (below).] As the figure shows, migrating 1% as opposed to 5% of pages does not give much benefit in reducing the performance penalty. Results show that an additional 35.63-38.87% of energy can be saved by complementing existing power management techniques with this technique.
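The page-migration idea from slides 6 to 9 can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how pages are chosen, so the sketch assumes the most frequently accessed pages are moved onto a single hot rank. The names `pick_pages_to_migrate` and `reshape`, and the per-page access counts, are hypothetical.

```python
def pick_pages_to_migrate(access_counts, fraction=0.05):
    """Return the most frequently accessed pages, limited to `fraction` of all pages.

    Assumption: "hotness" is measured by a simple per-page access count.
    """
    n = max(1, round(len(access_counts) * fraction))
    # Sort by descending access count; break ties by page number for determinism.
    ranked = sorted(access_counts, key=lambda page: (-access_counts[page], page))
    return ranked[:n]


def reshape(page_to_rank, access_counts, hot_rank=0, fraction=0.05):
    """Move the selected hot pages onto `hot_rank`; all other pages stay in place."""
    hot_pages = set(pick_pages_to_migrate(access_counts, fraction))
    return {
        page: (hot_rank if page in hot_pages else rank)
        for page, rank in page_to_rank.items()
    }


if __name__ == "__main__":
    # 100 pages spread round-robin over 4 ranks; page i is accessed i times.
    counts = {page: page for page in range(100)}
    placement = {page: page % 4 for page in range(100)}
    print(pick_pages_to_migrate(counts))   # the 5% hottest pages
    print(reshape(placement, counts)[99])  # rank of the hottest page after migration
```

Concentrating the hottest pages on one rank is what lengthens idle periods on the remaining (cold) ranks; the `fraction` parameter corresponds to the 1%, 5%, and 10% settings compared above.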
  • 10. 2. PHASE CHANGE MEMORY (PCM) • Phase change memory is a type of non-volatile random-access memory and provides a non-volatile storage mechanism amenable to process scaling. • However, to serve as a DRAM alternative, PCM must be architected for feasibility as main memory in general-purpose systems. • Drawing on a rigorous survey of PCM device and circuit prototypes published within the last five years, and comparing against modern DRAM memory subsystems, we examine the following: Buffer Organization and Partial Writes.
  • 11. 2.1 BUFFER ORGANIZATION • We examine PCM buffer organizations that satisfy DRAM-imposed area constraints. • PCM buffer reorganizations reduce application execution time from 1.6x to 1.2x and memory energy from 2.2x to 1.0x, relative to DRAM-based systems. Evaluation: Optimizing average delay and energy across the workloads, we find four 512B-wide buffers most effective. Executing on effectively buffered PCM, more than half the benchmarks achieve within 5 percent of their DRAM performance. Although each PCM array write requires 43.1x more energy than a DRAM array write, these energy costs are mitigated by narrow buffer widths and additional rows, which reduce the granularity of buffer evictions and expose opportunities for write coalescing, respectively.
  • 12. 2.2 PARTIAL WRITES • Partial writes, which track data modifications and write only modified cache lines or words to the PCM array, are utilized. Using an endurance model to estimate lifetime, we expect write coalescing and partial writes to deliver an average memory module lifetime of 5.6 years. • Scaling improves PCM endurance, extending lifetimes by four orders of magnitude at 32nm. Evaluation: • In a baseline architecture with a single 2048B-wide buffer, average module lifetime is approximately 525 hours. • For our memory-intensive workloads, we observe 32.8 percent memory bus utilization. Scaling by application-specific write intensity, we find 6.9 percent of memory bus cycles are utilized by writes. • On average, the four 512B-wide buffers coalesce 38.9 percent of writes emerging from the memory bus, which is 47.0 percent utilized. Writes alone utilize 11.0 percent of the bus. Buffers use partial writes so that only a fraction of the buffer's bits is written to the array.
  • 13. PHASE CHANGE MEMORY (PCM) • Collectively, these results indicate PCM is a viable DRAM alternative, with architectural solutions providing competitive performance, comparable energy, and feasible lifetimes. • When using PCM as an alternative to M-RAM, note that it consumes more energy to perform I/O operations, particularly write operations, since the cell state has to be changed. • However, PCM consumes significantly less idle power than M-RAM, especially in the low-power state, where the power consumption is reduced to 0. Therefore, we should leverage the tradeoffs between performance and energy efficiency to apply PCM technology in mobile devices.
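To make write coalescing and partial writes concrete, here is a minimal sketch of a row buffer that tracks which words were modified and writes only those back to the array on eviction. The class `PartialWriteBuffer`, its word-level granularity, and the list standing in for the PCM array are illustrative assumptions, not the design evaluated in the slides.

```python
class PartialWriteBuffer:
    """Toy row buffer: remembers which words changed since the row was loaded."""

    def __init__(self, array_row):
        # Load a copy of the row; nothing is dirty yet.
        self.words = list(array_row)
        self.dirty = [False] * len(self.words)

    def write(self, index, value):
        # Repeated writes to the same word coalesce in the buffer; a write
        # that does not change the stored value does not mark the word dirty.
        if self.words[index] != value:
            self.words[index] = value
            self.dirty[index] = True

    def evict(self, array_row):
        """Write back only the dirty words; return how many array writes occurred."""
        written = 0
        for i, is_dirty in enumerate(self.dirty):
            if is_dirty:
                array_row[i] = self.words[i]
                written += 1
        self.dirty = [False] * len(self.words)
        return written


if __name__ == "__main__":
    row = [0] * 8
    buf = PartialWriteBuffer(row)
    buf.write(2, 7)
    buf.write(2, 9)   # coalesces with the previous write to word 2
    buf.write(5, 1)
    buf.write(6, 0)   # same value as stored, so nothing to write back
    print(buf.evict(row), row)
```

Four buffer writes result in only two array writes here, which is the effect that reduces PCM write energy and wear.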
  • 14. 3. CHARACTERIZING MOBILE SOFTWARE The applications selected for this survey are shown in the table, which lists 12 popular Android applications selected from the Android Market along with their trace statistics. A T-Mobile G1 smartphone is used to collect the application traces. Each trace consists of task intervals with the task execution length and the number of memory I/Os.
  • 15. 3. CHARACTERIZING MOBILE SOFTWARE • Compared to the CPU speed, human interactions are extremely slow, so a mobile system is idle and waiting for user input for the majority of the time. • Prior studies have shown that the human perception threshold is between 50ms and 100ms, and any event shorter than the perception threshold appears instantaneous to the user. • Completing task execution earlier than the perception threshold is meaningless, since the user will not notice this amount of time and cannot initiate new tasks any sooner. This observation is the key to enabling energy optimizations without impacting observed application performance. • The majority of tasks are very short: more than 90% of all tasks complete within 10ms. Moreover, 95% of all tasks are shorter than 50ms, indicating that these tasks can be extended to the 50ms perception threshold deadline without any performance penalty. Similarly, for the remaining 5% of long tasks, any additional extension of less than 50ms will not be noticed by the user, avoiding performance degradation.
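The slack that the perception threshold leaves for each task can be written down directly. This is a small sketch of the rule stated above, using the 50ms lower bound of the threshold; the function name `allowed_extension_ms` is hypothetical.

```python
PERCEPTION_THRESHOLD_MS = 50.0  # lower bound of the 50-100ms range cited above


def allowed_extension_ms(task_length_ms, threshold_ms=PERCEPTION_THRESHOLD_MS):
    """Upper bound on how long a task can be stretched without the user noticing.

    Short tasks can be extended up to the threshold deadline. For long tasks,
    any extra delay must stay below one threshold, so the returned value is an
    exclusive bound in that case.
    """
    if task_length_ms < threshold_ms:
        return threshold_ms - task_length_ms
    return threshold_ms


if __name__ == "__main__":
    for length in (2.0, 10.0, 49.0, 120.0):
        print(length, allowed_extension_ms(length))
```

For example, a 10ms task has 40ms of slack, which an energy management mechanism can spend on power state transitions.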
  • 16. 4. MECHANISM COMPARISON • M-RAM needs to refresh its storage cells regularly for data retention, and therefore consumes non-negligible power even in the low-power state. PCM is able to completely eliminate idle power due to its non-volatile nature. We evaluate the effectiveness of various energy management mechanisms on M-RAM and PCM under the same execution environment. • For this survey, a simulator that models the system configuration of a T-Mobile G1 smartphone is used. [System configuration of a T-Mobile G1] • The memory subsystem consists of a memory controller and three 64MB ranks (192MB in total), for either M-RAM or PCM. The simulator is fed with the traces, determines the memory power state, and conducts task execution under the current CPU and memory state. The memory controller conducts memory I/O operations and executes power state transitions for each rank based on the energy management mechanism.
  • 17. 4.1 POWER AWARE VIRTUAL MEMORY (PAVM) • In mobile applications, idle periods are common while the smartphone is waiting for user input, so powering down the memory devices during these periods can help reduce energy consumption. • Power-Aware Virtual Memory (PAVM) is a simple and efficient way to provide energy management. It keeps the memory devices occupied by the currently running process in the active state while keeping all other memory devices in a low-power state to save energy. Memory devices used by the newly scheduled process are powered up during the context switch time to minimize the delays exposed to the user due to power state transitions. [Figure: memory energy consumption with a standard system (ON) and the PAVM mechanism. The left two bars for each application show the energy of M-RAM and PCM in the standard system, while the right two bars show the energy for the PAVM mechanism.]
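The PAVM policy described above amounts to recomputing rank power states at each context switch. The following is a minimal sketch, assuming the three-rank configuration from slide 16; the function name and the state labels `"active"` and `"low_power"` are placeholders chosen for illustration.

```python
def pavm_context_switch(all_ranks, ranks_of_next_process):
    """Return the power state of every rank after switching to the next process.

    Ranks holding the newly scheduled process's pages are powered up;
    every other rank is kept in a low-power state.
    """
    return {
        rank: ("active" if rank in ranks_of_next_process else "low_power")
        for rank in all_ranks
    }


if __name__ == "__main__":
    ranks = [0, 1, 2]  # three ranks, as in the simulated G1 configuration
    print(pavm_context_switch(ranks, {1}))
    print(pavm_context_switch(ranks, {0, 2}))
```

Because the transition happens during the context switch, its latency is hidden from the user, but the active rank stays fully powered for the whole time slice.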
  • 18. 5. ON-DEMAND MECHANISMS • Despite PAVM's benefits over the standard system, it fails to address the energy efficiency of the active rank accessed during process execution. • The Immediate Power Down (IPD) and Immediate Self Refresh (ISR) mechanisms have been proposed for RAM to provide on-demand power state transitions and improve the energy efficiency of active ranks. • As soon as an I/O request arrives at the memory controller, the rank to be accessed is transitioned to the PRE state, and it is transitioned back to a low-power state immediately after the I/O completes. Each energy bar is normalized to M-RAM with the PAVM mechanism.
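The on-demand behaviour described above can be sketched as a tiny state machine per rank: wake to the PRE state when an I/O arrives, then drop straight back to the low-power state. The class `OnDemandRank` and the state labels `"power_down"` and `"self_refresh"` are illustrative assumptions; transition latencies and energy costs are not modelled.

```python
class OnDemandRank:
    """Toy model of one rank under an on-demand mechanism (IPD- or ISR-style)."""

    def __init__(self, low_power_state="power_down"):
        self.low_power_state = low_power_state
        self.state = low_power_state
        self.history = [self.state]  # sequence of states, for inspection

    def _set(self, state):
        self.state = state
        self.history.append(state)

    def serve_io(self, do_io):
        """Wake the rank, perform the I/O, then return to low power immediately."""
        self._set("PRE")
        try:
            return do_io()
        finally:
            self._set(self.low_power_state)


if __name__ == "__main__":
    rank = OnDemandRank(low_power_state="self_refresh")
    rank.serve_io(lambda: "read ok")
    print(rank.history)
```

Choosing which low-power state to return to is the difference between the two mechanisms: a deeper state saves more energy but costs more time to leave.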
  • 19. 5. ON-DEMAND MECHANISMS [Figure: memory energy consumption for on-demand mechanisms, normalized to the PAVM mechanism on M-RAM.] • The first bar shows the energy consumption of PCM with the PAVM mechanism and the other bars show the energy of the on-demand mechanisms. • The two on-demand mechanisms on M-RAM outperform the PAVM mechanism on PCM, because PCM's energy savings from idle periods cannot offset its inferior I/O efficiency, except for the lightly loaded applications Amazon, Music, and Twidroid. • The PCM OFF mechanism completely eliminates the active idle energy, resulting in a 44% energy reduction over the PAVM mechanism on PCM. Compared to the IPD and ISR mechanisms on M-RAM, the PCM OFF mechanism offers 18% and 22% energy savings, respectively.
  • 20. 5. ON-DEMAND MECHANISMS [Figure: distribution of extended tasks that expose delays for on-demand mechanisms.] • We can observe that the IPD mechanism achieves the best performance, with negligible delays exposed. The ISR and PCM OFF mechanisms, on the other hand, incur more evident degradation due to the long 141.5ns transition latency. • If energy is the only concern, the novel PCM technology with an on-demand mechanism surpasses traditional M-RAM. However, taking performance into account as well, M-RAM still has the chance to beat PCM. • We therefore need an approach that balances energy and performance more efficiently than any standalone memory technology.
  • 21. 6. HYBRID MEMORY ARCHITECTURE • From the previous analysis, we can see that PCM is superior to M-RAM in its lower idle power consumption, while M-RAM outperforms PCM in I/O speed and I/O energy. Therefore, a hybrid memory consisting of M-RAM and PCM can improve both energy efficiency and performance. • When an application is invoked and its image does not reside in M-RAM, it is loaded into M-RAM either from secondary storage or from PCM, and the corresponding process identifier is put at the head of the LRU list. • When an application is closed, its memory image stays in M-RAM until it is swapped out. • The hybrid approach preserves more than 99% of IPD's performance and achieves the best energy efficiency among all mechanisms while maintaining almost full memory performance.
  • 22. CONCLUSIONS • The PAVM mechanism saves more than 90% of energy compared to the standard system with no energy management. • Additional energy savings are provided by the on-demand mechanisms, which offer around 40% more savings compared to the PAVM mechanism, for both M-RAM and PCM. • Energy efficiency can be improved further by a hybrid approach consisting of mixed memory technologies and mechanisms; this approach provides energy savings of 98% with negligible performance overhead compared to the standard system.
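The LRU bookkeeping of the hybrid architecture can be sketched as follows. This is a simplified illustration under several assumptions that are not in the slides: M-RAM holds a fixed number of equally sized application images, the image at the tail of the LRU list is the one swapped out, and swapped-out images are placed in PCM. The class and method names are hypothetical.

```python
from collections import OrderedDict


class HybridMemory:
    """Toy model: application images live in M-RAM, managed by an LRU list."""

    def __init__(self, mram_slots):
        self.mram_slots = mram_slots
        self.mram = OrderedDict()  # first key = head of the LRU list (most recent)
        self.pcm = set()           # images swapped out of M-RAM (assumption)

    def invoke(self, pid):
        """Bring the application's image into M-RAM and report where it came from."""
        if pid in self.mram:
            self.mram.move_to_end(pid, last=False)  # move to head of the list
            return "mram"
        source = "pcm" if pid in self.pcm else "storage"
        self.pcm.discard(pid)
        if len(self.mram) >= self.mram_slots:
            victim, _ = self.mram.popitem(last=True)  # evict from the tail
            self.pcm.add(victim)
        self.mram[pid] = True
        self.mram.move_to_end(pid, last=False)
        return source


if __name__ == "__main__":
    mem = HybridMemory(mram_slots=2)
    for app in ("browser", "music", "browser", "maps", "music"):
        print(app, "loaded from", mem.invoke(app))
    print("M-RAM LRU order:", list(mem.mram))
```

Recently used applications are served from fast M-RAM, while older images wait in PCM, where they cost no idle power.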

Editor's Notes

  1. As smartphones are becoming widely used and an indispensable part of our lives, limiting their energy consumption is critical. With battery technology improving at a much slower pace than hardware technology thereby making the gap between energy supply and demand increasingly larger, energy efficiency has become one of the most important factors in designing smartphones.