Big Memory Software for HPC
Dr. Charles Fan
CEO, MemVerge
Future of Infrastructure
• Multi-Cloud
• Memory-Centric
• Software-Composed
On-Prem Data Centers
Private Clouds
Today’s Computer
Apps Run in DRAM
[Diagram labels: App, CPU, DRAM, I/O, Storage]
DRAM
Pros
• Fast
Cons
• Low Capacity
• High Cost
• Volatile
Storage
Pros
• High-Capacity
• Low-Cost
• Non-Volatile
Cons
• Slow
Data Has Become Big & Fast
Capital Markets
3D Animation
Oil & Gas
Big Data Analytics
Virtual Servers
AI/ML Inference
Demanding Memory-Centric Infrastructure
[Chart: WW Real-Time Data Share, 2015-2024, IDC. Series: real-time data (PB) and share of real-time data within the Global Datasphere (%).]
The Rise of Big Memory Computing
Apps Run in DRAM and PMEM
[Diagram labels: App, CPU, Big Memory Software, DRAM]
DRAM + PMEM
Pros
• Fast
• High-Capacity
• Low-Cost
• Non-Volatile
[Chart: Byte-Addressable PMEM Revenue, IDC ($M), 2019-2023. Callouts: $2.6B; 248% CAGR 2019-2023.]
Memory Machine™: World’s First Big Memory Software
Memory Machine™ Platform
DRAM
Bigger Memory at Lower Cost without Performance Compromise
• Up to 9TB memory/2-way server
• 30-50% Memory Cost Savings
• DRAM-Performance
Persistence On-demand
• ZeroIO™ In-Memory Snapshot
• Fast Crash Recovery
• Thin-Clones
No Application Change!
Big Memory Software Impacts HPC
Availability, Performance, Agility
Inference with Large Model and Feature Embeddings
• Motivation
o Large model and embedding table size: model size at the GB level, embedding table size at the TB level
o Multiple models on a single server
o Online inference service: real time and low latency, returning results in tens of ms
• Ideal solution
o Put models and embedding tables into DRAM
• Limitations
o High TCO
o Limited DRAM space
o Volatile
Inference on Memory Machine
• Our solution
o Models and embedding tables in DRAM + PMEM
• Benefit
o Big memory can hold all embedding tables on one server
o Read performance similar to DRAM, well suited to read-heavy scenarios such as online inference
o Data persistence on PMEM
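The idea of reading embedding rows directly from a byte-addressable, persistent medium can be sketched as follows. This is only an illustration: an ordinary memory-mapped temporary file stands in for PMEM, the row layout (little-endian float32, rows stored back to back) and the names `write_table` and `read_row` are assumptions, and none of it reflects how Memory Machine works internally. The deck states that no application change is needed, so an application would not have to do this itself.

```python
import mmap
import os
import struct
import tempfile

DIM = 4              # embedding vector dimension (illustrative; the deck's test uses 64)
ROW_BYTES = DIM * 4  # assumption: each element is a 4-byte float32


def write_table(path, rows):
    """Write a dense float32 embedding table to a file, one row after another."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{DIM}f", *row))


def read_row(mapped, index):
    """Read one embedding vector by byte offset, without loading the whole table."""
    offset = index * ROW_BYTES
    return list(struct.unpack_from(f"<{DIM}f", mapped, offset))


def demo():
    # Row r holds the values r*DIM, r*DIM+1, ... so lookups are easy to check.
    rows = [[float(r * DIM + c) for c in range(DIM)] for r in range(8)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_table(path, rows)
        with open(path, "rb") as f:
            # A plain file mapping stands in for a PMEM-backed region here.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                return read_row(mapped, 5)
    finally:
        os.remove(path)


print(demo())
```

The table survives process exit because it lives in the file, and a lookup touches only the bytes of the requested row, which is the property the slide relies on for read-heavy inference.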
Example 1: Facebook’s DLRM
• Deep learning recommendation model for personalization and recommendation systems
• Consists of dense and sparse features
o Dense feature: a vector of floating-point values
o Sparse feature: a list of sparse indices into embedding tables
• Open source: https://github.com/facebookresearch/dlrm
M. Naumov, et al. Deep Learning Recommendation Model for Personalization and Recommendation Systems, 2019. https://arxiv.org/abs/1906.00091
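A minimal sketch of how dense and sparse features turn into a model input, in plain Python. It is heavily simplified: the MLPs and the feature-interaction step of DLRM are omitted, the pooled embeddings are simply concatenated with the dense vector, and sum pooling is an assumption made here. The names `make_table`, `pooled_lookup`, and `build_input` are hypothetical and do not come from the DLRM code base.

```python
import random

EMBEDDING_DIM = 4  # illustrative; the deck's evaluation uses 64
NUM_ROWS = 10      # rows per embedding table (illustrative)


def make_table(num_rows, dim, seed):
    """Create one embedding table filled with random floats."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_rows)]


def pooled_lookup(table, indices):
    """Sum the embedding rows selected by one sparse feature's indices."""
    pooled = [0.0] * len(table[0])
    for i in indices:
        for d, value in enumerate(table[i]):
            pooled[d] += value
    return pooled


def build_input(dense, sparse, tables):
    """Concatenate the dense vector with one pooled embedding per sparse feature."""
    features = list(dense)
    for table, indices in zip(tables, sparse):
        features.extend(pooled_lookup(table, indices))
    return features


tables = [make_table(NUM_ROWS, EMBEDDING_DIM, seed) for seed in range(2)]
dense = [0.5, 1.5, 2.5]   # dense feature: a vector of floating-point values
sparse = [[1, 3], [7]]    # sparse features: index lists, one list per table
vector = build_input(dense, sparse, tables)
print(len(vector))
```

Each sparse feature costs a handful of random row reads into a potentially very large table, which is why memory capacity and read latency dominate this workload.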
Evaluation Setup
• Hardware
o Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (112 cores)
o 192GB DRAM, 1.5TB PMEM, 400GB NVMe SSD
• Software
o RHEL 8.2
o Memory Machine v1.0
o Latest DLRM framework
• Test cases: model + embedding
o In-memory data size: 26GB/52GB/104GB/192GB
o Features: 100 sparse features (100 embedding tables, embedding vector dimension 64), 512 dense features
o Measured inference time for one batch of 20480 records (Criteo dataset)
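To get a feel for what these in-memory sizes imply, the rough arithmetic below estimates rows per embedding table. It rests on assumptions the slides do not state: embeddings are float32, the 100 tables are equally sized, "G" means GiB, and the embedding tables account for essentially all of the in-memory data. The function name `rows_per_table` is hypothetical.

```python
NUM_TABLES = 100      # from the setup slide
EMBEDDING_DIM = 64    # from the setup slide
BYTES_PER_VALUE = 4   # assumption: float32 embeddings
GIB = 1024 ** 3       # assumption: sizes are binary gigabytes


def rows_per_table(total_gib):
    """Estimate rows per table if the tables alone fill the given memory size."""
    bytes_per_row = EMBEDDING_DIM * BYTES_PER_VALUE
    total_rows = (total_gib * GIB) // bytes_per_row
    return total_rows // NUM_TABLES


for size in (26, 52, 104, 192):
    print(f"{size} GiB -> about {rows_per_table(size):,} rows per table")
```

Under these assumptions the smallest case works out to roughly a million rows per table and the largest to roughly eight million.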
Example 1: DLRM Inference Performance
[Chart: inference time (ms) by in-memory data size (26GB, 52GB, 104GB, 192GB) for All DRAM, DRAM+PMEM, and DRAM+NVMe. Data labels in extraction order: 3592, 4965, 8429, 174721; 5487, 6961, 7740, 8556; 180778, 187072, 199472, 203846.]
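The chart's data labels can be turned into per-record latency and a slowdown relative to All DRAM with a few lines of arithmetic. The mapping of labels to series below is an assumption: it takes the twelve extracted labels in order as All DRAM, then DRAM+PMEM, then DRAM+NVMe, each across 26GB to 192GB. If the original chart assigns them differently, the table needs adjusting. The batch size of 20480 records comes from the setup slide.

```python
BATCH_RECORDS = 20480          # from the setup slide
SIZES_GB = (26, 52, 104, 192)  # in-memory data sizes on the chart

# Assumption: data labels map to series in extraction order.
TIMES_MS = {
    "All DRAM": (3592, 4965, 8429, 174721),
    "DRAM+PMEM": (5487, 6961, 7740, 8556),
    "DRAM+NVMe": (180778, 187072, 199472, 203846),
}


def per_record_ms(batch_ms):
    """Average time per record for one batch."""
    return batch_ms / BATCH_RECORDS


def ratio_to_dram(series, size_index):
    """Batch time of a series divided by the All DRAM time at the same size."""
    return TIMES_MS[series][size_index] / TIMES_MS["All DRAM"][size_index]


for i, size in enumerate(SIZES_GB):
    pmem = TIMES_MS["DRAM+PMEM"][i]
    print(f"{size}GB: DRAM+PMEM {per_record_ms(pmem):.3f} ms/record, "
          f"{ratio_to_dram('DRAM+PMEM', i):.2f}x All DRAM")
```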
Example 2: Image Recognition Performance
[Chart: TPS at 1, 2, 4, 8, 16, 32, 64, 128, 256 (x-axis unlabeled in the source), Mongo+DRAM vs. Memory Engine.]
[Chart: latency (us) over the same x-axis values, Mongo+DRAM vs. Memory Engine.]
Persistent Memory for Instant Model Rollback/Recovery
• How to improve the fault tolerance of new model publishing?
o Pushing a new model into production is risky
o If it fails, revert to the last workable version ASAP
o Rollback/model reloading takes time (for large models) due to slow I/O
• Leveraging PMEM’s persistence
o Take a snapshot of the model serving application
o Restore a snapshot without reloading from disk or remote storage
o Snapshot can be published to many serving nodes via memory-to-memory snapshot replication
• Solution
o Instantaneous snapshot without interrupting online inference
o Instantaneous rollback without loading and publishing time
o Snapshot, rollback, and recovery complete within 1 second
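The snapshot-then-rollback workflow can be sketched with a toy model server. This is an in-process illustration only: `ModelServer` and its methods are hypothetical, and a deep copy stands in for the snapshot, whereas the deck describes snapshots of the whole serving application taken in persistent memory.

```python
import copy


class ModelServer:
    """Toy model server that keeps named snapshots of its weights in memory."""

    def __init__(self, weights):
        self.weights = weights
        self._snapshots = {}

    def snapshot(self, name):
        # A deep copy stands in for an in-memory snapshot of the serving state.
        self._snapshots[name] = copy.deepcopy(self.weights)

    def publish(self, new_weights):
        # Push a new model version into service.
        self.weights = new_weights

    def rollback(self, name):
        # Restore from memory; nothing is reloaded from disk or remote storage.
        self.weights = copy.deepcopy(self._snapshots[name])

    def predict(self, x):
        # Linear model: dot product of weights and input, plus bias.
        return sum(w * v for w, v in zip(self.weights["w"], x)) + self.weights["b"]


server = ModelServer({"w": [1.0, 2.0], "b": 0.5})
server.snapshot("v1")                             # snapshot the known-good version
before = server.predict([1.0, 1.0])
server.publish({"w": [100.0, -100.0], "b": 0.0})  # a bad release
server.rollback("v1")                             # revert without reloading
after = server.predict([1.0, 1.0])
print(before, after)
```

The point of the pattern is that the known-good state is already resident, so reverting costs a pointer swap or memory copy rather than a reload over slow I/O.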
Summary
• Memory Machine provides
o Larger and cheaper heterogeneous memory for faster inference
o Persistent memory for instant model snapshot and recovery
o No application change is needed
• Humans reason entirely from memory
o So will machine learning in the era of Big Memory
Big Memory Software Will Be a $10B+ Market
[Diagram: Compute / Memory / Performance Storage / Capacity Storage, shown alongside Compute / Big Memory / Capacity Storage]

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Big Memory Software for HPC Accelerates AI Workloads

  • 1. Big Memory Software for HPC. Dr. Charles Fan, CEO, MemVerge
  • 2. Future of Infrastructure
      • Multi-Cloud
      • Memory-Centric
      • Software-Composed
      • [Diagram labels: On-Prem Data Centers, Private Clouds]
  • 3. Today’s Computer
      • [Diagram labels: App, CPU, DRAM, Storage, I/O] Apps run in DRAM
      • DRAM
          o Pros: Fast
          o Cons: Low Capacity, High Cost, Volatile
      • Storage
          o Pros: High-Capacity, Low-Cost, Non-Volatile
          o Cons: Slow
  • 4. Data Has Become Big & Fast, Demanding Memory-Centric Infrastructure
      • Workloads: Capital Markets, 3D Animation, Oil & Gas, Big Data Analytics, Virtual Servers, AI/ML Inference
      • [Chart: WW Real-Time Data Share, 2015-2024, IDC. Series: real-time data (PB); share of real-time data within the Global DataSphere (%)]
  • 5. The Rise of Big Memory Computing
      • [Diagram labels: App, CPU, DRAM + PMEM, Big Memory Software] Apps run in DRAM and PMEM
      • DRAM + PMEM pros: Fast, High-Capacity, Low-Cost, Non-Volatile
      • [Chart: Byte-Addressable PMEM Revenue, IDC ($M), 2019-2023. Callouts: $2.6B; 248% CAGR 2019-2023]
  • 6. Memory Machine™: World’s First Big Memory Software
      • [Diagram labels: Memory Machine™ Platform, DRAM]
      • Bigger Memory at Lower Cost without Performance Compromise
          o Up to 9TB memory/2-way server
          o 30-50% Memory Cost Savings
          o DRAM-Performance
      • Persistence On-demand
          o ZeroIO™ In-Memory Snapshot
          o Fast Crash Recovery
          o Thin-Clones
      • No Application Change!
  • 7. Big Memory Software Impacts HPC: Availability, Performance, Agility
  • 8. Inference with Large Model and Feature Embeddings
      • Motivation
          o Large model and embedding table size: model size at the GB level, embedding table size at the TB level
          o Multiple models on a single server
          o Online inference service must be real time and low latency: return results in tens of ms
      • Ideal solution
          o Put models and embedding tables into DRAM
      • Limitations
          o High TCO
          o Limited DRAM space
          o Volatile
  • 9. Inference on Memory Machine
      • Our solution
          o Models and embedding tables in DRAM + PMEM
      • Benefit
          o Big memory can hold all embedding tables on one server
          o Read performance similar to DRAM, well suited to read-heavy scenarios such as online inference
          o Data persistence on PMEM
  • 10. Example 1: Facebook’s DLRM
      • Deep learning recommendation model for personalization and recommendation systems
      • Consists of dense and sparse features
          o Dense feature: a vector of floating-point values
          o Sparse feature: a list of sparse indices into embedding tables
      • Open source: https://github.com/facebookresearch/dlrm
      • M. Naumov, et al. Deep Learning Recommendation Model for Personalization and Recommendation Systems, 2019. https://arxiv.org/abs/1906.00091
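
The dense/sparse split described on this slide can be illustrated with a small sketch. This is not the DLRM code from the repository above: the table sizes, the helper names (`make_table`, `pooled_lookup`, `build_model_input`), and the sum-pooling plus concatenation step are assumptions chosen for illustration, and the real model's MLPs and feature-interaction layers are omitted.

```python
import random


def make_table(num_rows, dim, seed):
    """Build a toy embedding table: num_rows vectors of length dim."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]


def pooled_lookup(table, indices):
    """Sum the embedding rows selected by one sparse feature's index list."""
    pooled = [0.0] * len(table[0])
    for idx in indices:
        for j, value in enumerate(table[idx]):
            pooled[j] += value
    return pooled


def build_model_input(dense, sparse, tables):
    """Concatenate the dense vector with one pooled embedding per table."""
    features = list(dense)
    for table, indices in zip(tables, sparse):
        features.extend(pooled_lookup(table, indices))
    return features


# Hypothetical toy sizes: 3 embedding tables, 10 rows each, dimension 4.
tables = [make_table(num_rows=10, dim=4, seed=s) for s in range(3)]
dense = [0.5, 1.5]                  # dense feature: floating-point vector
sparse = [[0, 3], [7], [2, 2, 9]]   # sparse features: indices into each table
features = build_model_input(dense, sparse, tables)
print(len(features))
```

Every sparse index triggers a random read into an embedding table, which is why the size and read latency of the memory holding those tables dominate inference cost.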
  • 11. Evaluation Setup
      • Hardware
          o Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (112 cores)
          o 192 GB DRAM, 1.5 TB PMEM, 400 GB NVMe SSD
      • Software
          o RHEL 8.2
          o Memory Machine v1.0
          o Latest DLRM framework
      • Test cases: model + embedding
          o In-memory data size: 26 GB / 52 GB / 104 GB / 192 GB
          o Features: 100 sparse features (100 embedding tables, embedding vector dimension 64), 512 dense features
          o Measured: inference time for 20480 records in one batch (Criteo dataset)
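
To get a feel for these data sizes, the following back-of-the-envelope sketch estimates how many rows each embedding table would need. It assumes float32 values (4 bytes), equally sized tables, decimal gigabytes, and that the whole in-memory footprint is embedding data; none of these assumptions is stated on the slide.

```python
NUM_TABLES = 100      # from the slide: 100 embedding tables
EMBEDDING_DIM = 64    # from the slide: embedding vector dimension
BYTES_PER_VALUE = 4   # assumption: float32 embeddings


def rows_per_table(total_bytes):
    """Rows per table if total_bytes is split evenly across all tables."""
    bytes_per_row = EMBEDDING_DIM * BYTES_PER_VALUE
    return total_bytes // (NUM_TABLES * bytes_per_row)


for size_gb in (26, 52, 104, 192):
    rows = rows_per_table(size_gb * 10**9)  # assumption: 1 GB = 10^9 bytes
    print(f"{size_gb} GB -> about {rows:,} rows per table")
```

Under these assumptions the smallest test case already needs roughly a million rows per table.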
  • 12. Example 1: DLRM Inference Performance
      • [Chart: Inference Time (ms) by in-memory data size, 26GB / 52GB / 104GB / 192GB]
          o All DRAM: 3592 / 4965 / 8429 / 174721
          o DRAM+PMEM: 5487 / 6961 / 7740 / 8556
          o DRAM+NVMe: 180778 / 187072 / 199472 / 203846
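
The chart values can be turned into relative slowdowns with a short calculation. The sketch below assumes the twelve numbers in the transcript are listed series by series (All DRAM, then DRAM+PMEM, then DRAM+NVMe), each in order of increasing data size; that grouping is inferred from the flattened text rather than confirmed by the slide.

```python
sizes_gb = [26, 52, 104, 192]

# Inference time in ms; the series grouping is an assumption (see above).
inference_ms = {
    "All DRAM": [3592, 4965, 8429, 174721],
    "DRAM+PMEM": [5487, 6961, 7740, 8556],
    "DRAM+NVMe": [180778, 187072, 199472, 203846],
}


def ratio_to_baseline(series, baseline):
    """Per-size ratio of one configuration's time to another's."""
    return [s / b for s, b in zip(inference_ms[series], inference_ms[baseline])]


pmem_vs_dram = ratio_to_baseline("DRAM+PMEM", "All DRAM")
nvme_vs_pmem = ratio_to_baseline("DRAM+NVMe", "DRAM+PMEM")

for size, p, n in zip(sizes_gb, pmem_vs_dram, nvme_vs_pmem):
    print(f"{size} GB: PMEM/DRAM = {p:.2f}x, NVMe/PMEM = {n:.2f}x")
```

Read this way, DRAM+PMEM stays within the same order of magnitude across all sizes, while the all-DRAM configuration slows sharply at 192 GB, the size that matches the machine's installed DRAM.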
  • 13. Example 2: Image Recognition Performance
      • [Chart: TPS, Mongo+DRAM vs. Memory Engine, x-axis values 1 to 256 in powers of two]
      • [Chart: Latency (µs), Mongo+DRAM vs. Memory Engine, x-axis values 1 to 256 in powers of two]
  • 14. Persistent Memory for Instant Model Rollback/Recovery
      • How can the fault tolerance of new model publishing be improved?
          o Pushing a new model into production is risky
          o If it fails, revert to the last workable version ASAP
          o Rollback/model reloading takes time (for large models) due to slow I/O
      • Leveraging PMEM’s persistence
          o Take a snapshot of the model serving application
          o Restore a snapshot without reloading from disk or remote storage
          o A snapshot can be published to many serving nodes via memory-to-memory snapshot replication
      • Solution
          o Instantaneous snapshot without interrupting online inference
          o Instantaneous rollback without loading and publishing time
          o Snapshot, rollback, and recovery complete within 1 second
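
The snapshot-then-rollback flow on this slide can be sketched in a few lines. This is a toy illustration only, not Memory Machine's API: `Model` and `ModelServer` are hypothetical names, the state lives on the ordinary process heap rather than in PMEM, and nothing here survives a crash. It only shows why rollback can be a reference swap instead of a reload from disk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Immutable toy model, so a snapshot can safely share it by reference."""
    version: str
    weight: float
    bias: float


class ModelServer:
    """Hypothetical serving process with in-memory snapshots."""

    def __init__(self, model):
        self._active = model
        self._snapshots = {}

    def predict(self, x):
        return self._active.weight * x + self._active.bias

    def snapshot(self, name):
        # Keep a reference to the current model; no data is copied or written.
        self._snapshots[name] = self._active

    def publish(self, new_model):
        self._active = new_model

    def rollback(self, name):
        # Restore by swapping a reference, with no reload from storage.
        self._active = self._snapshots[name]


server = ModelServer(Model("v1", weight=2.0, bias=1.0))
server.snapshot("stable")
server.publish(Model("v2", weight=0.0, bias=0.0))  # a bad release
bad_result = server.predict(3.0)
server.rollback("stable")
good_result = server.predict(3.0)
print(bad_result, good_result)
```

Because the model object is immutable, taking the snapshot does not pause `predict`, which mirrors the slide's goal of snapshotting without interrupting online inference.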
  • 15. Summary
      • Memory Machine provides
          o Larger and cheaper heterogeneous memory for faster inference
          o Persistent memory for instant model snapshot and recovery
          o No application change needed
      • Humans reason entirely from memory
          o So will machine learning in the era of Big Memory
  • 16. Big Memory Software Will Be a $10B+ Market
      • [Diagram labels: Compute, Memory, Performance Storage, Capacity Storage; Compute, Big Memory, Capacity Storage]