Addressing Process Scaling Challenges in Server Memory
Eric Caward, Business Development Manager
Linley Spring Processor Conference: April 11, 2018
©2018 Micron Technology, Inc. All rights reserved. Information, products, and/or specifications are subject
to change without notice. All information is provided on an “AS IS” basis without warranties of any kind.
Statements regarding products, including regarding their features, availability, functionality, or
compatibility, are provided for informational purposes only and do not modify the warranty, if any,
applicable to any product. Drawings may not be to scale. Micron, the Micron logo, and all other Micron
trademarks are the property of Micron Technology, Inc. All other trademarks are the property of their
respective owners.
Explosive Cloud Growth
[Chart: DRAM TAM in BGB by calendar year, CY16 to CY22 (axis 0 to 8). 2016-2022 CAGR: DRAM market 22%, DRAM cloud 32%.]
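To put the two CAGR figures in perspective, the short sketch below compounds them over the 2016-2022 window. It assumes six annual compounding periods (2016 to 2022) and a starting value of 1.0; the function name `growth_factor` is only illustrative and does not come from the slides.

```python
def growth_factor(cagr_percent: float, years: int) -> float:
    """Total growth multiple after compounding `cagr_percent` for `years` years."""
    return (1.0 + cagr_percent / 100.0) ** years

# Assumption: 2016 -> 2022 is treated as six annual compounding periods.
YEARS = 2022 - 2016

market_factor = growth_factor(22.0, YEARS)  # overall DRAM market, 22% CAGR
cloud_factor = growth_factor(32.0, YEARS)   # DRAM cloud segment, 32% CAGR

print(f"DRAM market grows about {market_factor:.2f}x over {YEARS} years")
print(f"DRAM cloud grows about {cloud_factor:.2f}x over {YEARS} years")
```

Under these assumptions the overall market grows a little over threefold while the cloud segment grows a little over fivefold.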
Delivering Data Center Access to More Bits
• Test/Debug (low volume)
• Design Qualification (ramping volume)
• Full Production (high volume)
[Figure: 0%-100% scale across the three stages above; labels: Mature Baseline, Traditional Supply, Additional Supply, Reliable Bits w/ECC.]
“Good” bits vs. “Bad” bits
Scope of Good Bits vs. Bad Bits
Understanding Error Types
• Undetected vs. Detected
• Soft vs. Hard
• Uncorrectable vs. Correctable
lexicon, noun [lek-si-kon, -kuh n]: the vocabulary of a particular language, field, social class, person, etc.
DRAM Cell Scaling
Physics is the Root Cause of Refresh-Related Single Bits
[Figure: DRAM cell scaling from 50nm in 2009 (100%) to 1Ynm in 2019 (10%).]
[Chart: Variable Retention Time, Effective Internal Refresh (Ext tREF=64ms); vertical axis 0 to 64, horizontal axis 2009 to 2019.]
Predominant Module Field Failures are Correctable
Tens of Millions of Modules Shipped
• < 0.05% were confirmed fails
• ~ 0.005% were estimated uncorrectable
  − Includes post-production damage, the most predominant of failure mechanisms
[Chart legend: Pass, Correctable, Uncorrectable]
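As a rough illustration of what these percentages mean in module counts, the sketch below applies them to a hypothetical batch of 10 million modules. The batch size is an assumption chosen for easy arithmetic (the slide says only that tens of millions shipped), and `modules_at` is an illustrative helper name.

```python
from fractions import Fraction

# Assumption: an illustrative batch size, not a figure from the slides.
SHIPPED = 10_000_000

def modules_at(percent: str, shipped: int = SHIPPED) -> Fraction:
    """Number of modules corresponding to `percent` of `shipped` (exact arithmetic)."""
    return Fraction(percent) / 100 * shipped

confirmed_fail_bound = modules_at("0.05")    # slide: fewer than 0.05% confirmed fails
uncorrectable_estimate = modules_at("0.005") # slide: about 0.005% estimated uncorrectable

print(f"Confirmed fails: fewer than {int(confirmed_fail_bound):,} per {SHIPPED:,} modules")
print(f"Estimated uncorrectable: about {int(uncorrectable_estimate):,} per {SHIPPED:,} modules")
```

On this hypothetical batch, the slide's figures correspond to fewer than 5,000 confirmed fails and roughly 500 estimated uncorrectable modules.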
Robust Error Correction in Today’s Data Center
ECC: Single Bit
• 100% Correctable
• x8 Components
• Standard Config
• Lockstep Config
ECC: Multi Bits
• x4 Components
• Able to correct errors in different halves of burst

Beat  Example Data (72 bits)
0     00 FF00FF01 FF00FF00
1     55 AA55AA55 AA55AA55
2     CC 33CC33CC 33CC33CC
3     99 66996699 66996699
4     88 77887788 77887788
5     44 BB44BB44 BB44BB44
6     DD 22DD22DD 22DD22DD
7     11 EE11EE11 EE11EE11

Beat  Example Data (72 bits)
0     00 FF00FE00 FF00FF00
1     55 AA55AB55 AA55AA55
2     CC 33CC32CC 33CC33CC
3     99 66996799 66996699
4     88 77887788 778877F8
5     44 BB44BB44 BB44BBF4
6     DD 22DD22DD 22DD22FD
7     11 EE11EE11 EE11EEF1

Beat  Example Data (72 bits)
0     00 FF00FF00 FF00FF00
1     55 AA05AA55 AA55AA55
2     CC 330C33CC 33CC33CC
3     99 66996699 66996699
4     88 77887788 77887788
5     44 BB5BBB44 BB44BB44
6     DD 22DD22DD 22DD22DD
7     11 EE11EE11 EE11EE11
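The three example tables above are easier to compare once the flipped bits are made explicit. The sketch below XORs each 72-bit row against an expected pattern and lists the differing bit positions. Two things are assumptions rather than slide content: the expected pattern is inferred from the rows that repeat unchanged across the tables, and bit 0 is taken as the least significant bit of the rightmost hex digit, which need not match any device pin mapping. The names `EXPECTED`, `EXAMPLES`, and `flipped_bits` are illustrative. This only locates errors; it is not an ECC decoder.

```python
# Expected 72-bit word per beat, in hex (8 check bits + 64 data bits).
# Assumption: inferred from the rows that repeat unchanged in the tables.
EXPECTED = [
    "00 FF00FF00 FF00FF00",
    "55 AA55AA55 AA55AA55",
    "CC 33CC33CC 33CC33CC",
    "99 66996699 66996699",
    "88 77887788 77887788",
    "44 BB44BB44 BB44BB44",
    "DD 22DD22DD 22DD22DD",
    "11 EE11EE11 EE11EE11",
]

def with_rows(changes):
    """Copy EXPECTED and overwrite the beats listed in `changes`."""
    rows = list(EXPECTED)
    for beat, row in changes.items():
        rows[beat] = row
    return rows

# The three tables, in the order they appear above.
EXAMPLES = {
    "example_1": with_rows({0: "00 FF00FF01 FF00FF00"}),
    "example_2": with_rows({
        0: "00 FF00FE00 FF00FF00",
        1: "55 AA55AB55 AA55AA55",
        2: "CC 33CC32CC 33CC33CC",
        3: "99 66996799 66996699",
        4: "88 77887788 778877F8",
        5: "44 BB44BB44 BB44BBF4",
        6: "DD 22DD22DD 22DD22FD",
        7: "11 EE11EE11 EE11EEF1",
    }),
    "example_3": with_rows({
        1: "55 AA05AA55 AA55AA55",
        2: "CC 330C33CC 33CC33CC",
        5: "44 BB5BBB44 BB44BB44",
    }),
}

def to_int(row: str) -> int:
    """Parse a spaced hex row into a single 72-bit integer."""
    return int(row.replace(" ", ""), 16)

def flipped_bits(observed, expected=EXPECTED):
    """Return {beat: [bit positions that differ]}; bit 0 is the least significant."""
    result = {}
    for beat, (obs, exp) in enumerate(zip(observed, expected)):
        diff = to_int(obs) ^ to_int(exp)
        if diff:
            result[beat] = [bit for bit in range(72) if (diff >> bit) & 1]
    return result

for name, rows in EXAMPLES.items():
    print(name, flipped_bits(rows))
```

Under these assumptions, the first table differs from the pattern in a single bit of beat 0. The second differs in one bit position across beats 0 to 3 and within one 4-bit group across beats 4 to 7, which lines up with the bullet about errors in different halves of the burst. The third differs only within one 8-bit group, in beats 1, 2, and 5.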
Correctable Errors Do Not Foretell Uncorrectable Errors
SETUP
[Chart: Number of Correctable Errors, by days tested (day 1 to day 241). Axes: Number of Errors Per Day (0 to 3,000,000) and Total Number of Errors (0 to 600,000,000). Series: System Correctable fails, Cumulative Fails, Log. (System Correctable fails). Labeled total: 565,113,525.]
• 0 Uncorrectable Events Detected
Note: Application tests found only 1,170 errors (0.00021% of errors found)
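The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes that 565,113,525 is the cumulative count of correctable errors over the full 241 days shown on the chart axis; the variable names are illustrative.

```python
TOTAL_CORRECTABLE = 565_113_525  # labeled total on the chart
APP_TEST_ERRORS = 1_170          # errors found by application tests (from the note)
DAYS_TESTED = 241                # assumption: last day shown on the chart axis

# Share of all logged errors that the application tests caught, in percent.
app_share_percent = APP_TEST_ERRORS / TOTAL_CORRECTABLE * 100

# Average correctable errors per day over the test period.
avg_per_day = TOTAL_CORRECTABLE / DAYS_TESTED

print(f"Application tests caught {app_share_percent:.5f}% of logged errors")
print(f"Average correctable errors per day: {avg_per_day:,.0f}")
```

The share works out to about 0.00021%, matching the note, and the daily average of roughly 2.3 million falls inside the 0 to 3,000,000 range of the per-day axis.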
No Appreciable Performance Penalty
SETUP
Results
Context Switch Performance Cost
[Chart: lmbench lat_ctx (size=64k ovr=2.68). Horizontal axis: Number of Processes (2 to 14). Vertical axis: Performance Cost @ 1/s (0.00000% to 0.00100%).]
• Context Switch: 0.00028% @ 1/s
Single Bit Performance Cost
[Chart: Monitor Data. Series: Total data bus B/W % and iMC:N0_H1_C1[4]-Errs. Axes: Bandwidth % Normalized (0% to 60%) and Errors (0 to 400,000).]
• Error Correction: 0.0000319% @ 1/s
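One way to read the "@ 1/s" figures is as the share of each second consumed when one such event occurs per second. That reading is an assumption, not something the slides state. Under it, each percentage converts directly to a time per event, as the sketch below shows; the function name `cost_to_seconds` is illustrative.

```python
def cost_to_seconds(percent_at_1_per_s: float) -> float:
    """Time per event in seconds, assuming the cost is a share of one second."""
    return percent_at_1_per_s / 100.0 * 1.0

CONTEXT_SWITCH_PCT = 0.00028     # slide: Context Switch 0.00028% @ 1/s
ERROR_CORRECTION_PCT = 0.0000319 # slide: Error Correction 0.0000319% @ 1/s

ctx_seconds = cost_to_seconds(CONTEXT_SWITCH_PCT)
ecc_seconds = cost_to_seconds(ERROR_CORRECTION_PCT)
ratio = CONTEXT_SWITCH_PCT / ERROR_CORRECTION_PCT

print(f"Context switch: about {ctx_seconds * 1e6:.2f} microseconds per event")
print(f"Error correction: about {ecc_seconds * 1e9:.0f} nanoseconds per event")
print(f"A context switch costs about {ratio:.1f}x a single-bit correction")
```

Read this way, a single-bit correction costs on the order of a few hundred nanoseconds, roughly one ninth of the cost of a context switch.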
Achieve Form, Fit, Function,
Performance, and Reliability
Leverage System Level ECC to Enable More Bits Today
www.micron.com/scale
