Memory Expansion with CXL Ready Systems and Devices
Presenter:
Ravi Kiran Gummaluri
Micron Technology
Agenda
• Memory demand and scaling challenges
• CXL memory expansion
• Capacity expansion solutions
• Database performance analysis on AMD Platform
• Bandwidth expansion solutions
• AI inference performance analysis on Intel Platform
• Conclusions and Next steps
Memory Demand and Scaling challenges
• Memory demand in data center applications is growing (~26% year over year).
• Memory latency is improving only about 1.1x every two years.
• Processor speed has been doubling every two years.
• DRAM is not scaling: memory capacity is doubling only every four years.
• Data center TCO is increasing: memory is ~50% of the overall server cost.
How do we meet growing memory bandwidth and capacity requirements while reducing TCO?
Figure 1: Growing memory usage. Source: https://www.statista.com/statistics/871513/worldwide-data-created/
Figure 2: Memory wall.
Figure 3: Memory capacity vs. CPU cores. Source: Based on capacity and core counts from publicly available AMD and Intel datasheets, and public statements.
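To make the scaling gap concrete, the growth rates quoted on this slide can be compounded over a fixed horizon. The sketch below is only illustrative arithmetic: the 10-year horizon and the helper name `growth_factor` are assumptions, not part of the presentation.

```python
def growth_factor(rate_per_period: float, period_years: float, years: float) -> float:
    """Compound growth after `years`, given a multiplier applied once per period."""
    return rate_per_period ** (years / period_years)


YEARS = 10  # illustrative horizon (assumption, not from the slides)

cpu = growth_factor(2.0, 2.0, YEARS)       # processor speed: 2x every two years
latency = growth_factor(1.1, 2.0, YEARS)   # memory latency: 1.1x every two years
capacity = growth_factor(2.0, 4.0, YEARS)  # memory capacity: 2x every four years

print(f"processor speed   : {cpu:.1f}x")
print(f"memory latency    : {latency:.2f}x")
print(f"memory capacity   : {capacity:.1f}x")
print(f"speed/latency gap : {cpu / latency:.1f}x")
```

Under these rates, processor speed compounds far faster than either memory latency or memory capacity, which is the "memory wall" the slide refers to.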
CXL Memory Expansion
• CXL Memory Expansion
  • Cache-line granular access semantics.
  • CXL memory appears to the system as a CPU-less NUMA node (not dependent on CPU architecture).
  • Hot-pluggable memory.
  • Works with various form factors: E1.S, E3.S, E5.S, add-on card, etc.
  • Interoperable with various memory types (DDR4, DDR5, LPDDR5, NVM, ...).
• CXL Memory Capacity Expansion
  • CXL direct-attached memory tiering
    1. Application transparent
      • OS managed
      • User-space library
    2. Application managed
      • Application aware (e.g., libnuma)
      • Modified (e.g., libmemkind)
  • CXL switch / fabric-attached memory tiering
    • Adds another memory tier to the system, with higher latency.
• CXL Memory Bandwidth Expansion
  • CXL heterogeneous interleave solutions
    1. Hardware-based interleave
    2. Software and hardware heterogeneous interleave
    3. Software-based NUMA interleave
Figure: Memory hierarchy
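Because CXL memory shows up as a CPU-less NUMA node, an application-managed tiering scheme first needs to find such nodes. The following is a minimal sketch, assuming a Linux system that exposes NUMA nodes under `/sys/devices/system/node` with a per-node `cpulist` file; the function name `cpuless_nodes` is hypothetical. Note that a CPU-less node is not necessarily CXL memory, so this only narrows the candidates.

```python
import os
import re


def cpuless_nodes(sysfs_root: str = "/sys/devices/system/node") -> list[int]:
    """Return IDs of NUMA nodes whose cpulist is empty (memory-only nodes)."""
    nodes = []
    if not os.path.isdir(sysfs_root):
        return nodes  # no NUMA sysfs tree available on this system
    for entry in os.listdir(sysfs_root):
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue  # skip non-node entries such as "online" or "possible"
        cpulist_path = os.path.join(sysfs_root, entry, "cpulist")
        try:
            with open(cpulist_path) as f:
                cpulist = f.read().strip()
        except OSError:
            continue
        if cpulist == "":
            nodes.append(int(match.group(1)))
    return sorted(nodes)


if __name__ == "__main__":
    print("CPU-less NUMA nodes:", cpuless_nodes())
```

The resulting node IDs could then be passed to a NUMA-aware allocator or to a tool such as numactl.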
Micron Memory Expansion on AMD Platform
System configuration:
TPC-H: DRAM vs. Tiered Memory (DRAM + CXL)
CXL can provide better performance for capacity-intensive workloads.
HW Heterogeneous Interleave
• The system address map is interleaved between local DRAM and CXL memory.
• Pros
  • Easy to configure.
• Cons
  • Kernel/OS cannot manage memory allocations.
    ⎻ Affects kernel memory.
    ⎻ Hides the NUMA topology from the OS.
  • Fixed configuration: not scalable for all workloads.
  • CXL memory module (CMM) capacity is restricted to align with local DRAM capacity.
Figure: HW heterogeneous interleave
HW + SW Heterogeneous Interleave
• HW: Supports associating DRAM channels with different NUMA domains.
• SW: Interleaves 4 (local) : 1 (CXL) NUMA domains using numactl.
• NPS4: Each socket is partitioned into 4 NUMA domains. Each NUMA domain has 3 memory channels.
• Pros
  • NUMA topology is enabled.
  • Kernel/OS can manage the memory allocations.
  • Overcomes the capacity limitations imposed by the HW interleave solution.
• Cons
  • Fixed configuration: not scalable for all workloads.
Figure: HW + SW 4:1 interleave
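As a rough illustration of the 4:1 setup, the sketch below builds a `numactl --interleave` command line that spreads a workload's pages round-robin across four local NUMA domains and one CXL domain. The node numbering (0-3 for the NPS4 DRAM domains of one socket, 4 for the CXL node), the helper name `interleave_command`, and the workload binary are assumptions for illustration; actual node IDs depend on the platform.

```python
import shlex


def interleave_command(local_nodes, cxl_nodes, workload):
    """Build a numactl argv that page-interleaves a workload across the given nodes."""
    nodes = ",".join(str(n) for n in [*local_nodes, *cxl_nodes])
    return ["numactl", f"--interleave={nodes}", *workload]


# Assumed numbering: nodes 0-3 are local DRAM domains (NPS4), node 4 is CXL memory.
argv = interleave_command(
    local_nodes=[0, 1, 2, 3],
    cxl_nodes=[4],
    workload=["./my_workload"],  # hypothetical binary
)
print(shlex.join(argv))  # numactl --interleave=0,1,2,3,4 ./my_workload
```

With five nodes in the interleave set, four of every five pages land on local DRAM and one on CXL memory, which gives the 4:1 ratio. The list could be run with `subprocess.run(argv, check=True)` on a system where numactl is installed.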
SW Heterogeneous Interleave
• Memory allocations are performed according to per-node weights.
• Pros
  • Scalable: not a fixed configuration.
    o Applications can configure different weights according to their bandwidth requirements.
    o This applies only when explicitly enabled for a job.
  • NUMA topology is enabled.
  • Kernel/OS can manage the memory allocations.
  • Overcomes the capacity limitations imposed by the HW interleave solution.
• Cons
  • A CXL switch / fabric-attached memory tier cannot take advantage of this configuration.
Figure: SW interleave with weights. An application on socket 0 requests 100 pages; 80 pages are allocated from local DRAM and 20 pages from CXL memory (NUMA nodes 0 and 1).
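The 100-page example in the figure can be reproduced with a small simulation of weighted round-robin placement. This is a sketch of the general idea only, not the kernel's implementation; the function name `weighted_interleave`, the node labels, and the 4:1 weights are assumptions chosen to match the 80/20 split.

```python
from collections import Counter


def weighted_interleave(num_pages, weights):
    """Assign pages to nodes in weighted round-robin order.

    `weights` maps a node label to the number of consecutive pages it
    receives in each round.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    placement = []
    while len(placement) < num_pages:
        for node, weight in weights.items():
            for _ in range(weight):
                if len(placement) == num_pages:
                    return placement
                placement.append(node)
    return placement


# Assumed weights: 4 pages on local DRAM for every 1 page on CXL memory.
pages = weighted_interleave(100, {"local_dram": 4, "cxl": 1})
counts = Counter(pages)
print(counts["local_dram"], counts["cxl"])  # 80 20
```

Changing the weights per job, for example in proportion to the bandwidth each node can deliver, is what makes this approach more flexible than the fixed HW configurations.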
LLM Performance Optimization with Micron’s CXL Memory SW interleaving
CXL can provide better performance for bandwidth-intensive workloads.
Conclusion / Next Steps
Conclusions:
• CXL memory expansion can address growing memory bandwidth and capacity requirements.
• CXL memory can expand bandwidth through SW interleaving between DDR and CXL memory. Bandwidth-sensitive workloads, such as AI and HPC, benefit from this.
• CXL memory, when introduced as tiered memory, can increase memory capacity and reduce the latency impact of storage media. Capacity-sensitive workloads, such as database and data analytics applications, can benefit from this.
Next Steps:
• Application-aware, optimized page allocation algorithms can further improve system performance by exploiting the various memory tiers and media characteristics.
• CXL memory pooling and fabric-attached memory can help define additional memory tiers to reduce system TCO.
Introducing Micron CZ120 CXL Memory Module
Delivering Capacity, Bandwidth, Flexibility
• 128GB / 256GB: Up to 2TB incremental server capacity [1], supporting CXL 2.0
• 36GB/s: Memory bandwidth per module using PCIe® Gen5 x8; up to 34% increased server memory bandwidth [2]
• E3.S 2T x8: Industry-standard form factor for broad deployment
[1] By adding 8x256GB CZ120s; system limitations apply.
[2] Memory Latency Checker bandwidth compared to a 12-channel 4800MT/s RDIMM server.
Thank You!

Q1 Memory Fabric Forum: Memory expansion with CXL-Ready Systems and Devices

  • 1. Memory Expansion with CXL Ready Systems and Devices. Presenter: Ravi Kiran Gummaluri, Micron Technology
  • 2. Agenda
      - Memory demand and scaling challenges
      - CXL memory expansion
      - Capacity expansion solutions
      - Database performance analysis on AMD platform
      - Bandwidth expansion solutions
      - AI inference performance analysis on Intel platform
      - Conclusions and next steps
  • 3. Memory Demand and Scaling Challenges
      - Demand for memory in data center applications is growing (~26% YoY).
      - Memory latency is improving only 1.1x every two years.
      - Processor speed has been doubling every two years.
      - DRAM is not scaling: memory capacity is doubling every four years.
      - Increased TCO for data centers: memory is ~50% of the overall server cost.
      - How do we meet increased memory bandwidth and capacity requirements and reduce TCO?
      - Figure 1: Growing memory usage. Source: https://www.statista.com/statistics/871513/worldwide-data-created/
      - Figure 2: Memory wall.
      - Figure 3: Memory capacity vs. CPU cores. Source: Based on capacity and core counts from publicly available AMD and Intel datasheets, and public statements.
  • 4. CXL Memory Expansion
      - CXL memory expansion
          - Cache-line granular access semantics.
          - CXL memory appears to the system as a CPU-less NUMA node (not dependent on CPU architecture).
          - Hot-pluggable memory.
          - Works with various form factors: E1.S, E3.S, E5.S, add-on card, etc.
          - Interoperable with various memory types (DDR4, DDR5, LPDDR5, NVM, ...).
      - CXL memory capacity expansion
          - CXL direct-attached memory tiering
              1. Application transparent: OS managed, or user-space library.
              2. Application managed: application aware (e.g., libnuma), or modified (e.g., libmemkind).
          - CXL switch / fabric-attached memory tiering: another memory tier added to the system, with higher latencies.
      - CXL memory bandwidth expansion
          - CXL heterogeneous interleave solutions
              1. Hardware-based interleave.
              2. Software and hardware heterogeneous interleave.
              3. Software-based NUMA interleave.
      - Figure: Memory hierarchy
  • 5. Micron Memory Expansion on AMD Platform: System Configuration
  • 6. TPC-H: DRAM vs. Tiered Memory (DRAM + CXL). CXL can provide better performance for capacity-intensive workloads.
  • 7. HW Heterogeneous Interleave
      - The system address map is interleaved between local DRAM and CXL memory.
      - Pros
          - Easy to configure.
      - Cons
          - Kernel/OS cannot manage memory allocations: this affects kernel memory and hides the NUMA topology from the OS.
          - Fixed configuration: not scalable for all workloads.
          - CMM capacity is restricted to align with local DRAM capacity.
      - Figure: HW heterogeneous interleave
  • 8. HW + SW Heterogeneous Interleave
      - HW: supports associating DRAM channels with different NUMA domains.
      - SW: interleaves 4 (local) : 1 (CXL) NUMA domains using numactl.
      - NPS4: each socket is partitioned into 4 NUMA domains, and each NUMA domain has 3 memory channels.
      - Pros
          - NUMA topology is enabled.
          - Kernel/OS can manage the memory allocations.
          - Overcomes the capacity limitations imposed by the HW interleave solution.
      - Cons
          - Fixed configuration: not scalable for all workloads.
      - Figure: HW + SW 4:1 interleave
  • 9. SW Heterogeneous Interleave
      - Memory allocations are performed according to per-node weights.
      - Pros
          - Scalable: not a fixed configuration.
              - The application can configure different weights according to its bandwidth requirements.
              - This applies only when explicitly enabled for a job.
          - NUMA topology is enabled.
          - Kernel/OS can manage the memory allocations.
          - Overcomes the capacity limitations imposed by the HW interleave solution.
      - Cons
          - A CXL switch / fabric-attached memory tier cannot take advantage of this configuration.
      - Figure: SW interleave with weights. An application on Socket 0 requests 100 pages: 80 pages go to Node 0 (local DRAM) and 20 pages go to Node 1 (CXL memory).
  • 10. LLM Performance Optimization with Micron’s CXL Memory SW Interleaving. CXL can provide better performance for bandwidth-intensive workloads.
  • 11. Conclusions / Next Steps
      - Conclusions
          - CXL memory expansion can address increased memory bandwidth and capacity requirements.
          - CXL memory can expand bandwidth through SW interleaving between DDR and CXL memory. Bandwidth-sensitive workloads, such as AI and HPC, benefit from this.
          - CXL memory, when introduced as tiered memory, can increase memory capacity and reduce the latency impact of storage media. Capacity-sensitive workloads, such as database and data analytics applications, can benefit from this.
      - Next steps
          - Application-aware, optimized page allocation algorithms can further improve system performance by exploiting the various memory tiers and media characteristics.
          - CXL memory pooling and fabric-attached memory can further help define memory tiers that reduce system TCO.
  • 12. Introducing Micron CZ120 CXL Memory Module: Delivering Capacity, Bandwidth, Flexibility
      - 128GB / 256GB: up to 2TB incremental server capacity (note 1), supporting CXL 2.0.
      - 36GB/s: memory bandwidth per module using PCIe® Gen5 x8; up to 34% increased server memory bandwidth (note 2).
      - E3.S 2T x8: industry-standard form factor for broad deployment.
      - Note 1: By adding 8x256GB CZ120s; system limitations apply.
      - Note 2: Memory Latency Checker bandwidth compared to a 12-channel 4800MT/s RDIMM server.
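
The two software interleave schemes on slides 8 and 9 can be illustrated with a small page-placement model. This is a simplified sketch under stated assumptions, not the kernel's allocator: the node numbering (nodes 0-3 as local DRAM and node 4 as CXL for slide 8; node 0 as DRAM and node 1 as CXL for slide 9), the weights 4 and 1, and both function names are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import cycle


def round_robin_interleave(num_pages, nodes):
    """Place pages on nodes one at a time in round-robin order."""
    placement = Counter()
    node_cycle = cycle(nodes)
    for _ in range(num_pages):
        placement[next(node_cycle)] += 1
    return dict(placement)


def weighted_interleave(num_pages, weights):
    """Place pages according to per-node integer weights.

    `weights` maps node id -> positive integer weight. On each pass over
    the nodes, up to `weight` consecutive pages land on a node before the
    model moves on to the next node.
    """
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty map of positive integers")
    placement = {node: 0 for node in weights}
    remaining = num_pages
    while remaining > 0:
        for node, weight in weights.items():
            take = min(weight, remaining)
            placement[node] += take
            remaining -= take
            if remaining == 0:
                break
    return placement


# Slide 8 (assumed numbering): four local DRAM domains (NPS4) plus one CXL domain.
hw_sw = round_robin_interleave(100, nodes=[0, 1, 2, 3, 4])
local_pages = sum(hw_sw[n] for n in (0, 1, 2, 3))
print(local_pages, hw_sw[4])  # 80 20

# Slide 9 (assumed weights): DRAM node 0 weighted 4, CXL node 1 weighted 1.
sw = weighted_interleave(100, weights={0: 4, 1: 1})
print(sw)  # {0: 80, 1: 20}
```

Both cases end up with the 80/20 split shown in the slide 9 figure, but only the weighted model lets the ratio change per job without repartitioning NUMA domains. On a real system the slide 8 arrangement would correspond to something like `numactl --interleave=0-4 <command>`, with the actual node IDs depending on the platform.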

Editor's Notes

  1. Compute Express Link™ (CXL™) is an industry-supported Cache-Coherent Interconnect for Processors, Memory Expansion and Accelerators.