Leveling to the Last Mile: Near-zero-cost Bit Level Wear Leveling for PCM-based Main Memory
Speaker: Po-Chuan, Chen
Table of contents
• Abstract
• Introduction
• Related work: existing wear leveling
• Bit-level wear leveling: necessity & challenges
• Intra-line flipping (ILF)
• Basic scheme of ILF
• Flipping frequency
• Flipping granularity
• Case study: combining start-gap and enhanced ILF
• Evaluation
• Conclusion
Abstract
• Phase change memory (PCM)
PCM offers non-volatility, scalability and near-zero leakage power.
• Drawback
The comparatively poor endurance of PCM largely limits its adoption.
• Solution
The paper proposes a near-zero-cost bit-level wear leveling strategy to improve PCM endurance.
Introduction
• To fully exploit the potential of PCM, many techniques have been
proposed for endurance enhancement. They fall into two categories:
▪ Reducing write counts in order to prolong PCM lifetime
▪ Wear leveling, which migrates writes from heavily written regions to
less written regions to relieve the wear-out risk
Contribution
▪ A bit-level wear leveling design.
▪ A near-zero-cost intra-line flipping (ILF) scheme for PCM endurance
enhancement, extended to an enhanced, dynamic ILF scheme.
▪ An evaluation of the proposed schemes in combination with
existing coarser-grained wear leveling approaches.
Wear leveling can be conducted at various granularities
▪ Segment level
balances writes across segments
▪ Page level
spreads writes evenly across pages
▪ Line level
balances write counts across memory lines
What has not been done before
Few works consider wear leveling at the bit level, since it is impractical to
precisely record the write counts of individual bits.
However, bit-level wear leveling is necessary because program characteristics
cause a significant write imbalance within memory lines.
In this work
• A near-zero-cost bit-level wear leveling scheme, intra-line flipping (ILF).
▪ It periodically flips the bit mapping in a memory line so as to swap the
writes on hot and cold bits.
▪ It does so in a regular manner, without any counter or
address mapping table.
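The flipping idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes that a flip mirrors the whole line, that a single status bit (`flip_bit`) records the direction, and it ignores the write cost of re-storing data at each flip. The names `physical_index` and `simulate` are hypothetical.

```python
LINE_BITS = 512  # 64-byte memory line, as in the setup slide


def physical_index(logical_index, flip_bit, line_bits=LINE_BITS):
    """Map a logical bit position to the physical cell that stores it."""
    if flip_bit:
        return line_bits - 1 - logical_index  # mirrored storage direction
    return logical_index


def simulate(writes_per_flip, periods, line_bits=8):
    """Count cell writes when only logical bit 0 (a hot LSB) ever changes."""
    wear = [0] * line_bits
    flip_bit = 0
    for _ in range(periods):
        for _ in range(writes_per_flip):
            wear[physical_index(0, flip_bit, line_bits)] += 1
        flip_bit ^= 1  # periodic flip: no per-bit counter, no mapping table
    return wear


wear = simulate(writes_per_flip=100, periods=4)
print(wear)  # the hot bit's writes are split between the two end cells
```

Only one bit of state is needed to locate any logical bit, which is why no address mapping table is required in this sketch.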
Existing related works (Categories)
Existing related works (Strategies)
Row shifting
• It conducts bit-level wear leveling and is the work most closely related to this paper.
▪ A counter recording the write count of each line is maintained.
▪ Data are shifted periodically based on the write count information.
▪ The shift location is also recorded for each line.
• In contrast, the proposed technique does not record
detailed write information.
Bit-level wear leveling: necessity & challenges
To examine the write distribution within lines, we
record the bit flip rate at different bit positions for
SPEC2006 benchmarks.
We quantify the potential endurance
improvement as Max(counterArray) /
Ave(counterArray).
[Figure: potential endurance improvement per benchmark. Labels recoverable from the slide: database searching, chess playing, C++ program library. Values shown: 286 %, 227 %, 280 %, 824 %, 417 %.]
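The Max(counterArray) / Ave(counterArray) metric can be computed as below. This is a sketch under the assumption that a line wears out when its most-written bit does, so perfectly even writes would stretch lifetime by the ratio of the hottest count to the average. The function name and the sample counts are hypothetical, not data from the paper.

```python
def potential_improvement(counter_array):
    """Return Max(counterArray) / Ave(counterArray) for per-bit write counts."""
    if not counter_array or sum(counter_array) == 0:
        raise ValueError("counter_array must contain at least one write")
    average = sum(counter_array) / len(counter_array)
    return max(counter_array) / average


counts = [40, 20, 10, 10]  # hypothetical per-bit write counts
ratio = potential_improvement(counts)
print(f"{ratio:.2f}x ({ratio:.0%})")
```

A ratio of 1 means the writes are already perfectly balanced; larger values mean more headroom for bit-level wear leveling.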
Prohibitive overhead
• If a start-gap-like scheme from previous studies were applied within a line,
it would need two 9-bit counters plus one empty bit for each 512-bit memory line,
imposing an overhead of 9 × 2 + 1 = 19 bits per line.
• The balancing scheme also needs to take application-specific
write patterns into consideration to achieve an endurance benefit.
How to solve it?
The challenge is to develop a wear leveling scheme that is effective for
applications with different write patterns and needs no counter or
address mapping table.
This paper achieves that goal through a cost-efficient intra-line
flipping (ILF) scheme.
Basic scheme
• ILF is motivated by the fact that, in most programs, the least significant bits
are programmed more frequently than the most significant bits.
Two important characteristics
Two key issues arise in designing the ILF scheme: flipping frequency and flipping
granularity.
▪ Flipping frequency defines how often the storage direction is flipped.
▪ Flipping granularity defines the unit size for flipping.
Flipping frequency
When ILF is used on its own, a counter can be used to record the
desired flipping frequency and trigger flipping periodically.
A more efficient approach is to combine bit-level wear leveling with
coarser-grained wear leveling strategies.
Flip_bit
The status bit, flip_bit, is updated accordingly.
▪ ILF eliminates the overhead of maintaining counters, because the flipping
frequency is the same as the data migration frequency in random page
swap.
▪ The same strategy can be applied when ILF is combined with other
coarser-grained wear leveling techniques.
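The coupling between data migration and flipping might look like the sketch below. It assumes the flip is piggybacked on the write that the migration performs anyway, and that a flip mirrors the whole line. The `Line` class and `migrate` function are hypothetical names for illustration.

```python
class Line:
    """A memory line whose data is stored in normal or flipped direction."""

    def __init__(self, bits):
        self.cells = list(bits)  # physical cells
        self.flip_bit = 0        # status bit: 0 = normal, 1 = flipped

    def read(self):
        """Return the logical data regardless of storage direction."""
        return self.cells[::-1] if self.flip_bit else list(self.cells)

    def store(self, logical_bits, flip_bit):
        """Write logical data in the direction given by flip_bit."""
        bits = list(logical_bits)
        self.flip_bit = flip_bit
        self.cells = bits[::-1] if flip_bit else bits


def migrate(src, dst):
    """Move src's data to dst, flipping the storage direction on the way."""
    dst.store(src.read(), src.flip_bit ^ 1)


a = Line([1, 0, 0, 0])
b = Line([0, 0, 0, 0])
migrate(a, b)
print(b.cells, b.flip_bit, b.read())
```

Because the migration itself triggers the flip, no separate counter decides when to flip in this sketch.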
Flipping granularity
• Application-specific ILF:
the best flipping granularity for each application
• Uniform ILF:
the most effective flipping granularity for all benchmarks on average
• Enhanced ILF:
almost all the benchmarks benefit from enhanced ILF when compared
against the uniform ILF
Application-specific ILF
Media applications usually process data in 8-byte units, and thus can
adopt a 64-bit flipping granularity to even out the writes within every
64 bits.
Computation-intensive applications can choose the granularity that
effectively swaps the least significant and the most significant bits to
even out the writes within one line.
Uniform ILF
We varied the granularity among {8, 16, 32, 64, 128, 256, 512} bits,
collected the resulting average memory lifetime, and found that
a granularity of 128 bits is generally the best across the tested benchmarks.
With this uniform flipping granularity, runtime flipping can be
conducted in a regular and effective manner, with no need for
application-specific information.
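A flip at a fixed granularity can be sketched as follows. The slides do not spell out the exact permutation, so this assumes that flipping mirrors the bit order inside each granularity-sized unit of the line; `flip_line` is a hypothetical helper.

```python
def flip_line(bits, granularity):
    """Reverse the bit order inside each `granularity`-bit unit of a line."""
    if granularity <= 0 or len(bits) % granularity != 0:
        raise ValueError("granularity must evenly divide the line size")
    flipped = []
    for start in range(0, len(bits), granularity):
        flipped.extend(reversed(bits[start:start + granularity]))
    return flipped


line = list(range(512))          # label each bit with its logical position
flipped = flip_line(line, 128)   # uniform ILF granularity from the slide
print(flipped[:3], flipped[127], flipped[128])
```

Under this assumption the operation is its own inverse, so applying `flip_line` twice with the same granularity restores the original layout.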
Enhanced ILF
The flipping granularity can be iteratively chosen from the set {8, 16,
32, 64, 128, 256, 512} bits, and a three-bit vector, flip_bits, can be
maintained to record the flip status.
As the scheme is furthermore application independent and
input-variance tolerant, it is applied in the rest of this paper.
In this way, flipping is periodically triggered by data migrations in
coarser-grained wear leveling, and the writes within memory lines can
be evenly distributed.
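One way the three-bit state could drive the iteration is sketched below. The encoding is an assumption, not taken from the paper: state 0 means unflipped, states 1 to 7 select one of the seven granularities, and each data migration advances the state by one. Flipping is assumed to mirror each unit, and all function names are hypothetical.

```python
GRANULARITIES = [8, 16, 32, 64, 128, 256, 512]  # set listed on the slide


def mirror_units(bits, granularity):
    """Reverse the bit order inside each granularity-sized unit."""
    out = []
    for start in range(0, len(bits), granularity):
        out.extend(reversed(bits[start:start + granularity]))
    return out


def encode(logical_bits, flip_bits):
    """Lay out data for a 3-bit state (assumed: 0 = unflipped, 1..7 = index)."""
    if not 0 <= flip_bits <= 7:
        raise ValueError("flip_bits must fit in three bits")
    if flip_bits == 0:
        return list(logical_bits)
    return mirror_units(logical_bits, GRANULARITIES[flip_bits - 1])


decode = encode  # mirroring is its own inverse


def next_state(flip_bits):
    """Advance to the next state on each data migration."""
    return (flip_bits + 1) % 8


logical = list(range(512))
state = 0
for _ in range(8):  # eight migrations walk through all eight states
    state = next_state(state)
    stored = encode(logical, state)
    assert decode(stored, state) == logical
print(state)
```

Eight states fit exactly in three bits, which matches the size of the flip_bits vector on the slide; how the paper actually assigns the states is not stated here.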
Case study (start-gap + enhanced ILF)
• Coarser-grained wear leveling: start-gap (memory line level)
• Bit-level wear leveling: enhanced ILF
Setup
Our experiments adopt all the settings suggested in the original work, including:
Name                                                Setup
Segment size                                        1 MB
Page size                                           4 KB
Line size                                           64 bytes
Swapping frequency (segment swap)                   2 × 10^6 writes for segment swap
Swapping frequency (random page swap)               once per 512 writes
Swapping frequency (start-gap line wear leveling)   once per 100 writes
Evaluation
The experiments compare three setups:
▪ Start-gap only (line wear leveling scheme)
▪ Start-gap + row shifting
▪ Start-gap + intra-line flipping
[Figure: lifetime enhancement with ILF: 199.2 % (start-gap), 39.7 % (random page swap), 66.8 % (segment swap)]
Comparison with row shifting
Coarser-grained scheme    Row shifting    Intra-line flipping
Start-gap                 124.8 %         199.2 %
Random page swap          23.0 %          39.7 %
Segment swap              55.8 %          66.8 %
Storage overhead
• Row shifting:
8-bit counter + 6-bit shifting location register per memory line
(8 + 6) / 512 = 2.73 %
• ILF:
3-bit flip_bits (per 4 KB page, i.e. 4096 × 8 bits)
3 / (4096 × 8) = 0.01 %
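The storage figures can be checked with a few lines of arithmetic. This sketch assumes 512-bit lines and reads the ILF figure as three flip bits per 4 KB page (4096 × 8 bits), which is the interpretation that reproduces the 0.01 % shown on the slide. The 19-bit quasi start-gap figure comes from the earlier overhead slide.

```python
LINE_BITS = 512          # 64-byte memory line
PAGE_BITS = 4096 * 8     # 4 KB page, assumed unit for the ILF flip bits

row_shifting = (8 + 6) / LINE_BITS        # counter + shift location register
quasi_start_gap = (9 * 2 + 1) / LINE_BITS  # two 9-bit counters + one spare bit
ilf = 3 / PAGE_BITS                        # three-bit flip_bits vector

print(f"row shifting:    {row_shifting:.2%}")
print(f"quasi start-gap: {quasi_start_gap:.2%}")
print(f"ILF:             {ilf:.4%}")
```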
Hardware overhead
• Row shifting:
512-bit output + 6-bit control signal
512 × (64-to-1 mux)
• ILF:
512-bit output + 3-bit control signal
512 × (8-to-1 mux)
The performance and power overhead of the ILF
flippers is 1/9 of that of the shifters used in row shifting.
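One possible reading of the 1/9 figure is a simple mux count. The slides do not give the derivation, so this sketch assumes each n-to-1 mux is built as a tree of n - 1 two-input muxes and that cost scales with that count. The helper name is hypothetical.

```python
def two_input_muxes(n_inputs):
    """2-to-1 muxes in a tree implementing an n-to-1 mux (n a power of two)."""
    return n_inputs - 1


OUTPUT_BITS = 512
shifter = OUTPUT_BITS * two_input_muxes(64)  # row shifting: 6 control bits
flipper = OUTPUT_BITS * two_input_muxes(8)   # ILF: 3 control bits

print(shifter, flipper, flipper / shifter)
```

Under this assumption the flipper needs 7 two-input muxes per output bit against 63 for the shifter, a ratio of exactly 1/9.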
Conclusion
• Intra-line flipping (ILF) for PCM wear leveling
• Combined with a coarser-grained wear leveling strategy
(start-gap line-level wear leveling)
• 34 % higher endurance
• Low storage, performance and energy overhead

DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 

Leveling to the Last Mile: Near-zero-cost Bit Level Wear Leveling for PCM-based Main Memory.pptx

  • 1. Leveling to the Last Mile: Near-zero- cost Bit Level Wear Leveling for PCM-based Main Memory Speaker: Po-Chuan, Chen
  • 2. Table of contents • Abstract • Introduction • Related work : Existing wear leveling • Bit-level wear leveling : necessary & challenges • Intra – line flipping (ILF) • Basic scheme of ILF • Flipping frequency • Flipping granularity • Case study : combining start-gap and enhanced ILF • Evaluation • Conclusion
  • 3. Abstract • Phase change memory (PCM) characteristics of non-volatility, scalability and near-zero leakage power. • Drawbacks the comparatively poor endurance of PCM largely limits its adoption. • Solution propose a near-zero-cost bit-level wear leveling strategy to improve PCM endurance
  • 4. Introduction • To fully exploit PCM's write potential, many techniques have been proposed for endurance enhancement; they fall into two categories: ◦ Reducing write counts in order to extend PCM lifetime ◦ Wear leveling, which migrates writes from heavily written regions to less written regions to relieve the wear-out risk
  • 5. Contribution ◦ A bit-level wear leveling design ◦ A near-zero-cost intra-line flipping (ILF) scheme for PCM endurance enhancement, extended to an enhanced, dynamic ILF scheme ◦ An evaluation of the proposed schemes in combination with existing coarser-grained wear leveling approaches
  • 6. Wear leveling can be conducted at various granularities ◦ Segment level: balances writes across segments ◦ Page level: spreads writes evenly across pages ◦ Line level: balances write counts across memory lines
  • 7. What has not been done before Few works consider wear leveling at the bit level, since it is impractical to precisely record the write counts of individual bits. However, bit-level wear leveling is highly necessary because program characteristics cause significant write imbalance within memory lines.
  • 8. In this work • A near-zero-cost bit-level wear leveling scheme, intra-line flipping (ILF). ◦ It periodically flips the bit mapping in a memory line so as to swap the writes on hot and cold bits ◦ It solves this problem in a regular manner, without any counter or address mapping table.
  • 9. Table of contents • Abstract • Introduction • Related work : Existing wear leveling • Bit-level wear leveling : necessary & challenges • Intra – line flipping (ILF) • Basic scheme of ILF • Flipping frequency • Flipping granularity • Case study : combining start-gap and enhanced ILF • Evaluation • Conclusion
  • 10. Existing related works (Categories)
  • 11. Existing related works (Strategies)
  • 12. Row shifting • It conducts bit-level wear leveling and is the work most closely related to this paper. ◦ A counter recording the write count of each line is maintained. ◦ Data are shifted periodically based on the write count information. ◦ The shift location is also recorded for each line. • In contrast, the proposed technique does not record detailed write information.
  • 13. Table of contents • Abstract • Introduction • Related work : Existing wear leveling • Bit-level wear leveling : necessary & challenges • Intra – line flipping (ILF) • Basic scheme of ILF • Flipping frequency • Flipping granularity • Case study : combining start-gap and enhanced ILF • Evaluation • Conclusion
  • 14. Bit-level wear leveling : necessary & challenges To examine the write distribution within lines, we record the bit flip rate at different bit positions for SPEC2006 benchmarks. We quantify the potential endurance improvement as Max(counterArray) / Ave(counterArray). [Figure: potential endurance improvement per benchmark. Recoverable labels: database searching, chess playing, C++ program library. Recoverable values: 286%, 227%, 280%, 824%, 417%.]
  • 15. Prohibitive overhead • In previous studies, if quasi start-gap is applied, it needs two counters as well as one empty bit for each 512-bit memory line, imposing an overhead of 9*2+1=19 bits. • The balancing scheme must also take application-specific write patterns into consideration to achieve an endurance benefit.
  • 16. How to solve it? The challenge is to develop a wear leveling scheme that is effective for applications with different write patterns, without needing any counter or address mapping table. In this paper, we achieve this goal through a cost-efficient intra-line flipping (ILF) scheme.
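The metric on slide 14, Max(counterArray) / Ave(counterArray), can be illustrated with a minimal Python sketch. The function name `potential_improvement` and the sample counts are hypothetical and not taken from the slides; the only thing assumed from the source is the ratio itself.

```python
def potential_improvement(counter_array):
    """Return Max(counterArray) / Ave(counterArray) for per-bit write counts."""
    if not counter_array:
        raise ValueError("counter_array must not be empty")
    average = sum(counter_array) / len(counter_array)
    if average == 0:
        raise ValueError("no writes recorded")
    return max(counter_array) / average


# Hypothetical 8-bit line in which a few positions flip far more often.
counts = [1, 1, 2, 2, 4, 6, 8, 8]
print(f"max/avg = {potential_improvement(counts):.2f}")  # max/avg = 2.00
```

A ratio of 2.0 means the hottest bit is written twice as often as the average bit, so a perfectly even distribution of the same writes would halve the wear on the worst cell.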
  • 17. Table of contents • Abstract • Introduction • Related work : Existing wear leveling • Bit-level wear leveling : necessary & challenges • Intra – line flipping (ILF) • Basic scheme of ILF • Flipping frequency • Flipping granularity • Case study : combining start-gap and enhanced ILF • Evaluation • Conclusion
  • 18. Basic scheme • ILF is motivated by the fact that in most programs the least significant bits are programmed more frequently than the most significant bits.
  • 19. Two important characteristics There are two key issues in designing the ILF scheme: flipping frequency and flipping granularity. ◦ Flipping frequency defines how often the storage direction is flipped. ◦ Flipping granularity defines the unit size for flipping.
  • 20. Flipping frequency When ILF is used exclusively, a counter can be used to record the desired flipping frequency and trigger flipping periodically. A more efficient usage is to combine bit-level wear leveling with coarser-grained wear leveling strategies.
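The basic idea of slides 18 to 20 can be sketched in Python as a toy model. This is an illustration under stated assumptions, not the authors' hardware design: a line is modeled as a list of cells, "flipping" is modeled as reversing the storage order, only the logical least significant bit changes on each write, and the cost of rewriting the line at flip time is ignored. The names `map_bits` and `simulate` are hypothetical.

```python
def map_bits(bits, flip_bit):
    """Map a logical layout to the physical one: reverse the order when flip_bit is set.

    Reversal is its own inverse, so the same function maps physical back to logical.
    """
    return bits[::-1] if flip_bit else list(bits)


def simulate(num_bits, writes, flip_period):
    """Count per-cell writes when only the logical bit at index 0 changes per write.

    flip_period = 0 disables flipping (assumed convention for this sketch).
    """
    wear = [0] * num_bits
    flip_bit = 0
    for n in range(writes):
        if flip_period and n > 0 and n % flip_period == 0:
            flip_bit ^= 1  # storage direction is flipped
        touched = [1] + [0] * (num_bits - 1)  # the logical hot bit
        for pos, hit in enumerate(map_bits(touched, flip_bit)):
            wear[pos] += hit
    return wear


print(simulate(8, 100, 0))   # [100, 0, 0, 0, 0, 0, 0, 0]
print(simulate(8, 100, 10))  # [50, 0, 0, 0, 0, 0, 0, 50]
```

With flipping enabled, the hot logical bit alternates between the two ends of the line, so the worst-case cell sees half as many writes while only a single status bit is kept.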
  • 21. Table of contents • Abstract • Introduction • Related work : Existing wear leveling • Bit-level wear leveling : necessary & challenges • Intra – line flipping (ILF) • Basic scheme of ILF • Flipping frequency • Flipping granularity • Case study : combining start-gap and enhanced ILF • Evaluation • Conclusion
  • 22. Flip_bit The status bit, flip_bit, is updated accordingly. ◦ ILF eliminates the overhead of maintaining counters, and the flipping frequency is the same as the data migration frequency in random page swap. ◦ The same strategy can be applied when ILF is combined with other coarser-grained wear leveling techniques.
  • 23. Flipping granularity • Application specific ILF (The best flipping granularity) • Uniform ILF : the most effective flipping granularity for all benchmarks on average • Enhanced ILF : almost all the benchmarks benefit from enhanced ILF when compared against the uniform ILF
  • 24. Application specific ILF Media applications usually process data in 8-byte units, and thus can adopt a 64-bit flipping granularity to even out the writes within every 64 bits. Computation-intensive applications can choose the granularity that effectively flips the least significant and the most significant bits to even out the writes within one line.
  • 25. Flipping granularity • Application specific ILF (The best flipping granularity) • Uniform ILF : the most effective flipping granularity for all benchmarks on average • Enhanced ILF : almost all the benchmarks benefit from enhanced ILF when compared against the uniform ILF
  • 26. Uniform ILF We varied the granularity among {8, 16, 32, 64, 128, 256, 512} bits, collected the resulting average memory lifetime, and found that a granularity of 128 bits is generally the best across all tested benchmarks. With this uniform flipping granularity, runtime flipping can be conducted in a regular and effective manner, with no need for application-specific information.
  • 27. Flipping granularity • Application specific ILF (The best flipping granularity) • Uniform ILF : the most effective flipping granularity for all benchmarks on average • Enhanced ILF : almost all the benchmarks benefit from enhanced ILF when compared against the uniform ILF
  • 28. Enhanced ILF The flipping granularity can be iteratively chosen from the set of {8, 16, 32, 64, 128, 256, 512} bits, and a three-bit vector, flip_bits, can be maintained to record the flip status. As the scheme is furthermore application independent and input-variance tolerant, it is applied in the rest of this paper.
  • 29. In this way, ◦ flipping is periodically triggered by data migrations in coarser-grained wear leveling, and ◦ the writes within memory lines can be evenly distributed.
  • 30. Case study (start-gap + enhanced ILF) • Coarser-grained wear leveling: start-gap (memory line level) • Bit-level wear leveling: enhanced ILF
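The enhanced ILF of slide 28 can be sketched as an index mapping in Python. The slides only say that the granularity is iteratively chosen from {8, ..., 512} bits and that a three-bit vector records the flip status; the encoding below is an assumption made for illustration (state 0 means no flip, states 1 to 7 select the seven granularities in order, and the state advances by one on each data migration). The names `physical_index` and `next_flip_bits` are hypothetical.

```python
GRANULARITIES = [8, 16, 32, 64, 128, 256, 512]  # bits, from slide 28
LINE_BITS = 512


def physical_index(logical_index, flip_bits):
    """Map a logical bit position to a physical one.

    Assumed encoding: flip_bits == 0 means no flip; k in 1..7 means the bit
    order is reversed inside every chunk of GRANULARITIES[k - 1] bits.
    """
    if flip_bits == 0:
        return logical_index
    g = GRANULARITIES[flip_bits - 1]
    offset = logical_index % g
    return logical_index - offset + (g - 1 - offset)


def next_flip_bits(flip_bits):
    """Advance to the next state on a data migration (assumed round-robin order)."""
    return (flip_bits + 1) % 8


# Physical position of logical bit 0 as the three-bit state cycles.
state, positions = 0, []
for _ in range(8):
    positions.append(physical_index(0, state))
    state = next_flip_bits(state)
print(positions)  # [0, 7, 15, 31, 63, 127, 255, 511]
```

Under this assumed encoding each output bit selects among eight candidate sources, and every state is a permutation of the 512 bit positions that is its own inverse, so reads can reuse the same mapping.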
  • 31. Table of contents • Abstract • Introduction • Related work : Existing wear leveling • Bit-level wear leveling : necessary & challenges • Intra – line flipping (ILF) • Basic scheme of ILF • Flipping frequency • Flipping granularity • Case study : combining start-gap and enhanced ILF • Evaluation • Conclusion
  • 32. Setup Our experiments adopt all the settings suggested in the original work, including: segment size: 1 MB; page size: 4 KB; line size: 64 bytes; swapping frequency (segment swap): once per 2 × 10^6 writes; swapping frequency (random page swap): once per 512 writes; swapping frequency (start-gap line wear leveling): once per 100 writes
  • 33. Evaluation Comparing three different setups in the experiments ◦ Start-gap only (line wear leveling scheme) ◦ Start-gap + row shifting ◦ Start-gap + intra-line flipping
  • 34. [Figure: lifetime enhancement with ILF] 199.2 % enhancement (start-gap), 39.7 % enhancement (random page swap), 66.8 % enhancement (segment swap)
  • 35. Compare with row shifting (lifetime enhancement, row shifting vs. intra-line flipping) • Start-gap: 124.8 % vs. 199.2 % • Random page swap: 23.0 % vs. 39.7 % • Segment swap: 55.8 % vs. 66.8 %
  • 36. Overhead • Row shifting: 8-bit counter + 6-bit shifting location register for each 512-bit memory line: (8 + 6) / 512 = 2.73 % • ILF: 3 flip bits: 3 / (4096 × 8) = 0.01 %
  • 37. Overhead • Row shifting: 512-bit output + 6-bit control signal, i.e. 512 64-to-1 muxes • ILF: 512-bit output + 3-bit control signal, i.e. 512 8-to-1 muxes The performance and power overhead of the ILF flippers is 1/9 of that of the shifters used in row shifting.
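The overhead figures on slides 36 and 37 can be checked with a few lines of Python. One assumption is made and should be read as such: the three flip bits are taken to be amortized over one 4 KB page (4096 × 8 bits), since that is the reading that reproduces the 0.01 % figure on the slide. The constant names are hypothetical.

```python
import math

LINE_BITS = 512        # one 64-byte memory line
PAGE_BITS = 4096 * 8   # assumed: flip bits amortized over a 4 KB page

row_shifting_storage = (8 + 6) / LINE_BITS  # 8-bit counter + 6-bit shift location
ilf_storage = 3 / PAGE_BITS                 # three flip bits

print(f"row shifting: {row_shifting_storage:.2%}")  # row shifting: 2.73%
print(f"ILF:          {ilf_storage:.2%}")           # ILF:          0.01%

# Control-signal widths follow from the mux fan-in on slide 37.
print(int(math.log2(64)), int(math.log2(8)))  # 6 3
```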
  • 38. Conclusion • Intra-line flipping (PCM wear leveling) • Combined with a coarser-grained wear leveling strategy (start-gap line-level wear leveling) • 34 % higher endurance • Low storage, performance and energy overhead