Big Memory Software for HPC
Dr. Charles Fan
CEO, MemVerge
• Multi-Cloud
• Memory-Centric
• Software-Composed
Future of Infrastructure
On-Prem Data Centers
Private Clouds
Today’s Computer
DRAM
Storage
App
CPU
DRAM
Pros
• Fast
Cons
• Low Capacity
• High Cost
• Volatile
Storage
Pros
• High-Capacity
• Low-Cost
• Non-Volatile
Cons
• Slow
Apps Run in DRAM
I/O
Data Has Become Big & Fast
Capital Markets
3D Animation
Oil & Gas
Big Data Analytics
Virtual Servers
AI/ML Inference
Demanding Memory-Centric Infrastructure
[Chart: WW Real-Time Data Share, 2015-2024, IDC. Axes: real-time data (PB) and share of real-time data within the Global DataSphere (%), by year.]
The Rise of Big Memory Computing
App
CPU
DRAM + PMEM
Pros
• Fast
• High-Capacity
• Low-Cost
• Non-Volatile
DRAM
Apps Run in DRAM and PMEM
Big Memory Software
[Chart: Byte-Addressable PMEM Revenue, IDC ($M), 2019-2023. Callouts: $2.6B; 248% CAGR 2019-2023.]
Memory Machine™: World’s First
Big Memory Software
Memory Machine™ Platform
DRAM
Bigger Memory at Lower Cost
without Performance Compromise
• Up to 9TB memory/2-way server
• 30-50% Memory Cost Savings
• DRAM-Performance
Persistence On-demand
• ZeroIO™ In-Memory Snapshot
• Fast Crash Recovery
• Thin-Clones
No Application Change!
Big Memory Software Impacts HPC
Availability, Performance, Agility
• Motivation
• Large model and embedding table size
• Model size to GB level, embedding table size to TB level
• Multiple models on single server
• Online inference service: real time and low latency
• Return results in tens of ms
• Ideal solution
• Put models and embedding tables into DRAM
• Limitations
• High TCO
• Limited DRAM space
• Volatile
Inference with Large Model and Feature Embeddings
• Our solution
• Models and embedding tables in DRAM + PMEM
• Benefit
• Big memory can include all embedding tables on one server
• Read performance similar to DRAM, well suited to read-heavy scenarios such as online inference
• Data persistence on PMEM
Inference on Memory Machine
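The slides state that Memory Machine needs no application change, so the following is not how the product works. It is only a minimal sketch of the general idea of reading embedding rows straight from byte-addressable memory, using a memory-mapped file as a stand-in. The file path, the float32 element type, and the toy table contents are all assumptions made for illustration.

```python
import mmap
import os
import struct
import tempfile

DIM = 64             # embedding vector dimension used in the evaluation slides
ROW_BYTES = DIM * 4  # assumption: float32 values

def write_table(path, num_rows):
    """Write a toy table in which every value in row r equals float(r)."""
    with open(path, "wb") as f:
        for r in range(num_rows):
            f.write(struct.pack(f"<{DIM}f", *([float(r)] * DIM)))

def read_row(mapped, row):
    """Decode one embedding row directly from the mapping, without read()."""
    return struct.unpack_from(f"<{DIM}f", mapped, row * ROW_BYTES)

# Hypothetical location; on a PMEM system this could be a file on a DAX mount.
path = os.path.join(tempfile.mkdtemp(), "embedding_table.bin")
write_table(path, num_rows=1000)

with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    vec = read_row(mapped, 42)
    mapped.close()

print(vec[0], len(vec))
```

Because a lookup touches only the bytes of the requested row, the access pattern is read-heavy and random, which is the case the slide describes as a good fit for PMEM.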
Example 1: Facebook’s DLRM
• Deep learning recommendation model for personalization and recommendation systems
• Consists of dense and sparse features
• Dense feature: a vector of floating-point values
• Sparse feature: a list of sparse indices into embedding tables
• Open source:
https://github.com/facebookresearch/dlrm
M. Naumov, et al. Deep Learning Recommendation Model for Personalization and
Recommendation Systems, 2019 https://arxiv.org/abs/1906.00091
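To make the dense/sparse distinction concrete, here is a minimal pure-Python sketch of how sparse indices select rows from embedding tables. It is a simplification and not the DLRM code: the real model also passes dense features through an MLP and applies a feature-interaction step, both omitted here. The helper names, table sizes, and sum pooling are assumptions chosen for illustration.

```python
import random

def make_table(num_rows, dim, seed):
    """Build a toy embedding table of num_rows vectors with dim values each."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]

def pooled_lookup(table, indices):
    """Sum the embedding rows selected by one sparse feature's indices."""
    pooled = [0.0] * len(table[0])
    for i in indices:
        for d, value in enumerate(table[i]):
            pooled[d] += value
    return pooled

def build_model_input(dense, sparse, tables):
    """Concatenate the dense vector with one pooled vector per sparse feature."""
    features = list(dense)
    for table, indices in zip(tables, sparse):
        features.extend(pooled_lookup(table, indices))
    return features

tables = [make_table(num_rows=10, dim=4, seed=s) for s in range(3)]
dense = [0.5, -1.2]                # dense feature: floating-point values
sparse = [[1, 7], [3], [0, 2, 9]]  # sparse features: indices into each table
x = build_model_input(dense, sparse, tables)
print(len(x))  # 2 dense values + 3 tables * 4 dimensions = 14
```

The lookups are pure reads against large tables, which is why embedding-table capacity, rather than compute, tends to dominate the memory footprint of this kind of model.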
Evaluation Setup
• Hardware:
• Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (112 cores)
• 192 GB DRAM, 1.5TB PMEM, 400GB NVMe SSD
• Software
• RHEL 8.2
• Memory Machine v1.0
• Latest DLRM framework
• Testing cases: model + embedding
• In memory data size 26G/52G/104G/192G
• Features: 100 sparse features (100 embedding tables, embedding vector
dimension is 64), 512 dense features
• Measuring inference time for 20480 records in one batch (Criteo Dataset)
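A rough sizing sketch for the setup above, showing how table count, vector dimension, and row count combine into an in-memory footprint. The slides give 100 tables and dimension 64 but not the element type or the rows per table, so float32 values, equally sized tables, "G" meaning GiB, and the whole data size being embeddings are all assumptions here.

```python
BYTES_PER_FLOAT32 = 4  # assumption: the slides do not state the element type
GIB = 1024 ** 3        # assumption: "G" in the slides is read as GiB

def embedding_bytes(num_tables, rows_per_table, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Total bytes held by num_tables equally sized embedding tables."""
    return num_tables * rows_per_table * dim * bytes_per_value

def rows_for_budget(total_bytes, num_tables, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Rows per table that fit in total_bytes, rounded down."""
    return total_bytes // (num_tables * dim * bytes_per_value)

for size_gib in (26, 52, 104, 192):
    rows = rows_for_budget(size_gib * GIB, num_tables=100, dim=64)
    print(f"{size_gib} GiB -> about {rows:,} rows per table")
```

Under these assumptions each row costs 256 bytes per table, so the footprint grows linearly with the number of rows per table.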
Example 1: DLRM Inference Performance
Inference Time (ms)
Data size   All DRAM   DRAM+PMEM   DRAM+NVMe
26GB        3592       5487        180778
52GB        4965       6961        187072
104GB       8429       7740        199472
192GB       174721     8556        203846
Example 2: Image Recognition Performance
[Charts: TPS and Latency (us), Mongo+DRAM vs. Memory Engine, at x-axis values 1, 2, 4, 8, 16, 32, 64, 128, 256.]
• How to improve the fault tolerance of new model publishing?
• Pushing new model into production is risky
• If failed, revert to last workable version ASAP
• Rollback/Model reloading takes time (for large models) due to slow I/O
• Leveraging PMEM’s persistence
• Take a snapshot of the model serving application
• Restore a snapshot without reloading from disk or remote storage
• Snapshot can be published to many serving nodes via memory-to-memory snapshot replication
• Solution
• Instantaneous snapshot without interrupting online inference
• Instantaneous rollback without loading and publishing time
• Snapshot, rollback, and recovery are within 1 second
Persistent Memory for Instant Model Rollback/Recovery
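The snapshot-then-rollback workflow described above can be sketched as a toy model server that keeps named copies of its state in memory. This is not Memory Machine's mechanism: a deep copy stands in for a memory snapshot, and the class, method names, and NaN health check are hypothetical. The point is only that rollback restores state already held in memory instead of reloading it from disk or remote storage.

```python
import copy
import math

class ModelServer:
    """Toy model server that keeps named in-memory snapshots of its weights."""

    def __init__(self, weights):
        self.weights = weights
        self._snapshots = {}

    def snapshot(self, name):
        # A deep copy stands in for a memory snapshot of the serving process.
        self._snapshots[name] = copy.deepcopy(self.weights)

    def publish(self, new_weights):
        self.weights = new_weights

    def rollback(self, name):
        # Restore from memory; nothing is reloaded from disk or remote storage.
        self.weights = copy.deepcopy(self._snapshots[name])

    def predict(self, x):
        return sum(w * v for w, v in zip(self.weights["w"], x)) + self.weights["b"]

server = ModelServer({"w": [1.0, 2.0], "b": 0.5})
server.snapshot("v1")                                  # last workable version
server.publish({"w": [float("nan"), 2.0], "b": 0.5})   # a bad release
if math.isnan(server.predict([1.0, 1.0])):             # hypothetical health check
    server.rollback("v1")
print(server.predict([1.0, 1.0]))  # 3.5
```

Taking the snapshot before publishing is the design choice that matters: the revert path then never depends on slow I/O.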
• Memory Machine provides
o Larger and cheaper heterogeneous memory for faster inference
o Persistent memory for instant model snapshot and recovery
o No application change is needed
• Humans reason entirely from memory
o So will machine learning in the era of Big Memory
Summary
Big Memory Software Will Be a $10B+ Market
Compute
Memory
Performance Storage
Capacity Storage
Compute
Big Memory
Capacity Storage

More Related Content

What's hot

Sizing Splunk SmartStore - Spend Less and Get More Out of Splunk
Sizing Splunk SmartStore - Spend Less and Get More Out of SplunkSizing Splunk SmartStore - Spend Less and Get More Out of Splunk
Sizing Splunk SmartStore - Spend Less and Get More Out of Splunk
Paula Koziol
 
2018 bsc power9 and power ai
2018   bsc power9 and power ai 2018   bsc power9 and power ai
2018 bsc power9 and power ai
Ganesan Narayanasamy
 
InTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AIInTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AI
InTTrust S.A.
 
Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture
Ganesan Narayanasamy
 
Enable greater data reduction, storage performance, and manageability with De...
Enable greater data reduction, storage performance, and manageability with De...Enable greater data reduction, storage performance, and manageability with De...
Enable greater data reduction, storage performance, and manageability with De...
Principled Technologies
 
Storwize SVC presentation February 2017
Storwize SVC presentation February 2017Storwize SVC presentation February 2017
Storwize SVC presentation February 2017
Joe Krotz
 
Emc World Keynote Slootman
Emc World Keynote SlootmanEmc World Keynote Slootman
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Romeo Kienzler
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Rebekah Rodriguez
 
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
EMC
 
Solaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and TuningSolaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and Tuning
Adrian Cockcroft
 
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA Taiwan
 
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,..."Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
Edge AI and Vision Alliance
 
White Paper - CEVA-XM4 Intelligent Vision Processor
White Paper - CEVA-XM4 Intelligent Vision ProcessorWhite Paper - CEVA-XM4 Intelligent Vision Processor
White Paper - CEVA-XM4 Intelligent Vision Processor
CEVA, Inc.
 
IBM Power Systems: Designed for Data
IBM Power Systems: Designed for DataIBM Power Systems: Designed for Data
IBM Power Systems: Designed for Data
IBM Power Systems
 
Fast Scalable Easy Machine Learning with OpenPOWER, GPUs and Docker
Fast Scalable Easy Machine Learning with OpenPOWER, GPUs and DockerFast Scalable Easy Machine Learning with OpenPOWER, GPUs and Docker
Fast Scalable Easy Machine Learning with OpenPOWER, GPUs and Docker
Indrajit Poddar
 
Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs
Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs  Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs
Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs
Indrajit Poddar
 
Get improved performance and new features from Dell EMC PowerEdge servers wit...
Get improved performance and new features from Dell EMC PowerEdge servers wit...Get improved performance and new features from Dell EMC PowerEdge servers wit...
Get improved performance and new features from Dell EMC PowerEdge servers wit...
Principled Technologies
 
LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...
LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...
LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...
The Linux Foundation
 
BSC LMS DDL
BSC LMS DDL BSC LMS DDL
BSC LMS DDL
Ganesan Narayanasamy
 

What's hot (20)

Sizing Splunk SmartStore - Spend Less and Get More Out of Splunk
Sizing Splunk SmartStore - Spend Less and Get More Out of SplunkSizing Splunk SmartStore - Spend Less and Get More Out of Splunk
Sizing Splunk SmartStore - Spend Less and Get More Out of Splunk
 
2018 bsc power9 and power ai
2018   bsc power9 and power ai 2018   bsc power9 and power ai
2018 bsc power9 and power ai
 
InTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AIInTech Event | Cognitive Infrastructure for Enterprise AI
InTech Event | Cognitive Infrastructure for Enterprise AI
 
Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture Workload Transformation and Innovations in POWER Architecture
Workload Transformation and Innovations in POWER Architecture
 
Enable greater data reduction, storage performance, and manageability with De...
Enable greater data reduction, storage performance, and manageability with De...Enable greater data reduction, storage performance, and manageability with De...
Enable greater data reduction, storage performance, and manageability with De...
 
Storwize SVC presentation February 2017
Storwize SVC presentation February 2017Storwize SVC presentation February 2017
Storwize SVC presentation February 2017
 
Emc World Keynote Slootman
Emc World Keynote SlootmanEmc World Keynote Slootman
Emc World Keynote Slootman
 
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
 
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
Building Hadoop-as-a-Service with Pivotal Hadoop Distribution, Serengeti, & I...
 
Solaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and TuningSolaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and Tuning
 
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
 
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,..."Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
 
White Paper - CEVA-XM4 Intelligent Vision Processor
White Paper - CEVA-XM4 Intelligent Vision ProcessorWhite Paper - CEVA-XM4 Intelligent Vision Processor
White Paper - CEVA-XM4 Intelligent Vision Processor
 
IBM Power Systems: Designed for Data
IBM Power Systems: Designed for DataIBM Power Systems: Designed for Data
IBM Power Systems: Designed for Data
 
Fast Scalable Easy Machine Learning with OpenPOWER, GPUs and Docker
Fast Scalable Easy Machine Learning with OpenPOWER, GPUs and DockerFast Scalable Easy Machine Learning with OpenPOWER, GPUs and Docker
Fast Scalable Easy Machine Learning with OpenPOWER, GPUs and Docker
 
Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs
Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs  Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs
Build FAST Deep Learning Apps with Docker on OpenPOWER and GPUs
 
Get improved performance and new features from Dell EMC PowerEdge servers wit...
Get improved performance and new features from Dell EMC PowerEdge servers wit...Get improved performance and new features from Dell EMC PowerEdge servers wit...
Get improved performance and new features from Dell EMC PowerEdge servers wit...
 
LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...
LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...
LF Collab Summit 2015: ARM Servers for the Next Generation Date Center and Cl...
 
BSC LMS DDL
BSC LMS DDL BSC LMS DDL
BSC LMS DDL
 

Similar to Big Memory for HPC

Live Data: For When Data is Greater than Memory
Live Data: For When Data is Greater than MemoryLive Data: For When Data is Greater than Memory
Live Data: For When Data is Greater than Memory
MemVerge
 
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflowsCloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Yong Feng
 
Trends in DNN compression
Trends in DNN compressionTrends in DNN compression
Trends in DNN compression
Kaushalya Madhawa
 
How AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changesHow AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changes
Danny Sabour
 
MemVerge - The Dawn of Big Memory
MemVerge - The Dawn of Big MemoryMemVerge - The Dawn of Big Memory
MemVerge - The Dawn of Big Memory
Memory Fabric Forum
 
Key Note Session IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...
Key Note Session  IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...Key Note Session  IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...
Key Note Session IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...
Surekha Parekh
 
A Time Traveller's Guide to DB2: Technology Themes for 2014 and Beyond
A Time Traveller's Guide to DB2: Technology Themes for 2014 and BeyondA Time Traveller's Guide to DB2: Technology Themes for 2014 and Beyond
A Time Traveller's Guide to DB2: Technology Themes for 2014 and Beyond
Laura Hood
 
The Pandemic Changes Everything, the Need for Speed and Resiliency
The Pandemic Changes Everything, the Need for Speed and ResiliencyThe Pandemic Changes Everything, the Need for Speed and Resiliency
The Pandemic Changes Everything, the Need for Speed and Resiliency
Alluxio, Inc.
 
DDR4 Compliance Testing. Its time has come!
DDR4 Compliance Testing.  Its time has come!DDR4 Compliance Testing.  Its time has come!
DDR4 Compliance Testing. Its time has come!
Barbara Aichinger
 
S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3
Tony Pearson
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructure
solarisyourep
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructure
xKinAnx
 
Sql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su AzureSql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su Azure
Marco Obinu
 
Software Defined Agility for IBM FlashSystem V9000
Software Defined Agility for IBM FlashSystem V9000Software Defined Agility for IBM FlashSystem V9000
Software Defined Agility for IBM FlashSystem V9000
Catalogic Software
 
Has Your Data Gone Rogue?
Has Your Data Gone Rogue?Has Your Data Gone Rogue?
Has Your Data Gone Rogue?
Tony Pearson
 
Spectrum Scale final
Spectrum Scale finalSpectrum Scale final
Spectrum Scale final
Joe Krotz
 
Solving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute finalSolving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute final
Avere Systems
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
DataWorks Summit/Hadoop Summit
 
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Indrajit Poddar
 
Univa Presentation at DAC 2020
Univa Presentation at DAC 2020 Univa Presentation at DAC 2020
Univa Presentation at DAC 2020
Univa, an Altair Company
 

Similar to Big Memory for HPC (20)

Live Data: For When Data is Greater than Memory
Live Data: For When Data is Greater than MemoryLive Data: For When Data is Greater than Memory
Live Data: For When Data is Greater than Memory
 
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflowsCloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
 
Trends in DNN compression
Trends in DNN compressionTrends in DNN compression
Trends in DNN compression
 
How AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changesHow AI and ML are driving Memory Architecture changes
How AI and ML are driving Memory Architecture changes
 
MemVerge - The Dawn of Big Memory
MemVerge - The Dawn of Big MemoryMemVerge - The Dawn of Big Memory
MemVerge - The Dawn of Big Memory
 
Key Note Session IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...
Key Note Session  IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...Key Note Session  IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...
Key Note Session IDUG DB2 Seminar, 16th April London - Julian Stuhler .Trito...
 
A Time Traveller's Guide to DB2: Technology Themes for 2014 and Beyond
A Time Traveller's Guide to DB2: Technology Themes for 2014 and BeyondA Time Traveller's Guide to DB2: Technology Themes for 2014 and Beyond
A Time Traveller's Guide to DB2: Technology Themes for 2014 and Beyond
 
The Pandemic Changes Everything, the Need for Speed and Resiliency
The Pandemic Changes Everything, the Need for Speed and ResiliencyThe Pandemic Changes Everything, the Need for Speed and Resiliency
The Pandemic Changes Everything, the Need for Speed and Resiliency
 
DDR4 Compliance Testing. Its time has come!
DDR4 Compliance Testing.  Its time has come!DDR4 Compliance Testing.  Its time has come!
DDR4 Compliance Testing. Its time has come!
 
S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3S de0882 new-generation-tiering-edge2015-v3
S de0882 new-generation-tiering-edge2015-v3
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructure
 
Presentation architecting a cloud infrastructure
Presentation   architecting a cloud infrastructurePresentation   architecting a cloud infrastructure
Presentation architecting a cloud infrastructure
 
Sql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su AzureSql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su Azure
 
Software Defined Agility for IBM FlashSystem V9000
Software Defined Agility for IBM FlashSystem V9000Software Defined Agility for IBM FlashSystem V9000
Software Defined Agility for IBM FlashSystem V9000
 
Has Your Data Gone Rogue?
Has Your Data Gone Rogue?Has Your Data Gone Rogue?
Has Your Data Gone Rogue?
 
Spectrum Scale final
Spectrum Scale finalSpectrum Scale final
Spectrum Scale final
 
Solving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute finalSolving enterprise challenges through scale out storage & big compute final
Solving enterprise challenges through scale out storage & big compute final
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
 
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
Optimizing Hortonworks Apache Spark machine learning workloads for contempora...
 
Univa Presentation at DAC 2020
Univa Presentation at DAC 2020 Univa Presentation at DAC 2020
Univa Presentation at DAC 2020
 

More from MemVerge

Analytical Biosciences Accelerates Single Cell Sequencing with Big Memory
Analytical Biosciences Accelerates Single Cell Sequencing with Big MemoryAnalytical Biosciences Accelerates Single Cell Sequencing with Big Memory
Analytical Biosciences Accelerates Single Cell Sequencing with Big Memory
MemVerge
 
Checkpointing the Uncheckpointable
Checkpointing the UncheckpointableCheckpointing the Uncheckpointable
Checkpointing the Uncheckpointable
MemVerge
 
HPC Market Update and Observations on Big Memory
HPC Market Update and Observations on Big MemoryHPC Market Update and Observations on Big Memory
HPC Market Update and Observations on Big Memory
MemVerge
 
Impact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPCImpact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPC
MemVerge
 
Tech Talk: Moneyball - Hitting real-time apps out of the park with Big Memory
Tech Talk: Moneyball - Hitting real-time apps out of the park with Big MemoryTech Talk: Moneyball - Hitting real-time apps out of the park with Big Memory
Tech Talk: Moneyball - Hitting real-time apps out of the park with Big Memory
MemVerge
 
MemVerge Company Overview
MemVerge Company OverviewMemVerge Company Overview
MemVerge Company Overview
MemVerge
 
IDC Technology Spotlight: Big Memory Computing Emerges to Better Enable Dat...
IDC Technology Spotlight:   Big Memory Computing Emerges to Better Enable Dat...IDC Technology Spotlight:   Big Memory Computing Emerges to Better Enable Dat...
IDC Technology Spotlight: Big Memory Computing Emerges to Better Enable Dat...
MemVerge
 
Big Memory Webcast
Big Memory WebcastBig Memory Webcast
Big Memory Webcast
MemVerge
 

More from MemVerge (8)

Analytical Biosciences Accelerates Single Cell Sequencing with Big Memory
Analytical Biosciences Accelerates Single Cell Sequencing with Big MemoryAnalytical Biosciences Accelerates Single Cell Sequencing with Big Memory
Analytical Biosciences Accelerates Single Cell Sequencing with Big Memory
 
Checkpointing the Uncheckpointable
Checkpointing the UncheckpointableCheckpointing the Uncheckpointable
Checkpointing the Uncheckpointable
 
HPC Market Update and Observations on Big Memory
HPC Market Update and Observations on Big MemoryHPC Market Update and Observations on Big Memory
HPC Market Update and Observations on Big Memory
 
Impact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPCImpact of Intel Optane Technology on HPC
Impact of Intel Optane Technology on HPC
 
Tech Talk: Moneyball - Hitting real-time apps out of the park with Big Memory
Tech Talk: Moneyball - Hitting real-time apps out of the park with Big MemoryTech Talk: Moneyball - Hitting real-time apps out of the park with Big Memory
Tech Talk: Moneyball - Hitting real-time apps out of the park with Big Memory
 
MemVerge Company Overview
MemVerge Company OverviewMemVerge Company Overview
MemVerge Company Overview
 
IDC Technology Spotlight: Big Memory Computing Emerges to Better Enable Dat...
IDC Technology Spotlight:   Big Memory Computing Emerges to Better Enable Dat...IDC Technology Spotlight:   Big Memory Computing Emerges to Better Enable Dat...
IDC Technology Spotlight: Big Memory Computing Emerges to Better Enable Dat...
 
Big Memory Webcast
Big Memory WebcastBig Memory Webcast
Big Memory Webcast
 

Recently uploaded

Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
DianaGray10
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
DanBrown980551
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin..."$10 thousand per minute of downtime: architecture, queues, streaming and fin...
"$10 thousand per minute of downtime: architecture, queues, streaming and fin...
Fwdays
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
leebarnesutopia
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
UiPathCommunity
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
Neo4j
 
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
manji sharman06
 
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptxAI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
Sunil Jagani
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
Fwdays
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Neo4j
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
DianaGray10
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
christinelarrosa
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving
 
MySQL InnoDB Storage Engine: Deep Dive - Mydbops
MySQL InnoDB Storage Engine: Deep Dive - MydbopsMySQL InnoDB Storage Engine: Deep Dive - Mydbops
MySQL InnoDB Storage Engine: Deep Dive - Mydbops
Mydbops
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 


Big Memory for HPC

  • 1. Big Memory Software for HPC Dr. Charles Fan CEO, MemVerge
  • 2. • Multi-Cloud • Memory-Centric • Software-Composed Future of Infrastructure On-Prem Data Centers Private Clouds
  • 3. Today’s Computer DRAM Storage App CPU DRAM Pros • Fast Cons • Low Capacity • High Cost • Volatile Storage Pros • High-Capacity • Low-Cost • Non-Volatile Cons • Slow Apps Run in DRAM I/O
  • 4. Data Has Become Big & Fast Capital Markets 3D Animation Oil & Gas Big Data Analytics Virtual Servers AI/ML Inference Demanding Memory-Centric Infrastructure [Chart: WW Real-Time Data Share, 2015-2024, IDC; series: Real-time data (PB), Share of real-time data with Global Datasphere (%)]
  • 5. The Rise of Big Memory Computing App CPU DRAM + PMEM Pros • Fast • High-Capacity • Low-Cost • Non-Volatile DRAM Apps Run in DRAM and PMEM Big Memory Software [Chart: Byte-Addressable PMEM Revenue, IDC ($M), 2019-2023; $2.6B; 248% CAGR 2019-2023]
  • 6. Memory Machine™: World’s First Big Memory Software Memory Machine™ Platform DRAM Bigger Memory at Lower Cost without Performance Compromise • Up to 9TB memory/2-way server • 30-50% Memory Cost Savings • DRAM-Performance Persistence On-demand • ZeroIO™ In-Memory Snapshot • Fast Crash Recovery • Thin-Clones No Application Change!
  • 7. Big Memory Software Impacts HPC Availability Performance Agility
  • 8. • Motivation • Large model and embedding table size • Model size to GB level, embedding table size to TB level • Multiple models on single server • Online inference service: real time and low latency • Return results in tens of ms • Ideal solution • Put models and embedding tables into DRAM • Limitations • High TCO • Limited DRAM space • Volatile Inference with Large Model and Feature Embeddings
  • 9. • Our solution • Models and embedding tables in DRAM + PMEM • Benefit • Big memory can include all embedding tables on one server • Similar read performance as DRAM, very suitable for read-heavy scenario such as online inference • Data persistence on PMEM Inference on Memory Machine
  • 10. Example 1: Facebook’s DLRM • Deep learning recommendation model for personalization and recommendation systems • Consists of dense and sparse features • Dense feature: a vector of floating-point values • Sparse feature: a list of sparse indices into embedding tables • Open source: https://github.com/facebookresearch/dlrm M. Naumov, et al. Deep Learning Recommendation Model for Personalization and Recommendation Systems, 2019 https://arxiv.org/abs/1906.00091
  • 11. Evaluation Setup • Hardware: • Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (112 cores) • 192 GB DRAM, 1.5TB PMEM, 400GB NVMe SSD • Software • RHEL 8.2 • Memory Machine v1.0 • Latest DLRM framework • Testing cases: model + embedding • In-memory data size: 26G/52G/104G/192G • Features: 100 sparse features (100 embedding tables, embedding vector dimension is 64), 512 dense features • Measuring inference time for 20480 records in one batch (Criteo Dataset)
  • 12. Example 1: DLRM Inference Performance Inference Time (ms) at 26GB / 52GB / 104GB / 192GB: All DRAM 3592 / 4965 / 8429 / 174721; DRAM+PMEM 5487 / 6961 / 7740 / 8556; DRAM+NVMe 180778 / 187072 / 199472 / 203846
  • 13. Example 2: Image Recognition Performance [Charts: TPS, Mongo+DRAM vs. Memory Engine; Latency (us), Mongo+DRAM vs. Memory Engine; x-axis values 1 to 256 in powers of two]
  • 14. • How to improve the fault tolerance of new model publishing? • Pushing new model into production is risky • If failed, revert to last workable version ASAP • Rollback/Model reloading takes time (for large models) due to slow I/O • Leveraging PMEM’s persistence • Take a snapshot of the model serving application • Restore a snapshot without reloading from disk or remote storage • Snapshot can be published to many serving nodes via memory-to-memory snapshot replication • Solution • Instantaneous snapshot without interrupting online inference • Instantaneous rollback without loading and publishing time • Snapshot, rollback, and recovery are within 1 second Persistent Memory for Instant Model Rollback/Recovery
  • 15. • Memory Machine provides o Larger and cheaper heterogeneous memory for faster inference o Persistent memory for instant model snapshot and recovery o No application change is needed • Humans reason entirely from memory o So will machine learning in the era of Big Memory Summary
  • 16. Big Memory Software Will Be a $10B+ Market [Diagram labels: Compute, Memory, Performance Storage, Capacity Storage; Compute, Big Memory, Capacity Storage]
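
Slide 10 describes a DLRM input as dense features (a vector of floating-point values) plus sparse features (lists of indices into embedding tables), and slide 11 sizes the test by the number of tables and the embedding dimension. The sketch below illustrates only that lookup pattern and a rough footprint estimate. It is not the DLRM code or anything from Memory Machine: the function names, the sum pooling, the 4-byte values, the equal table sizes, and the plain concatenation are all assumptions for illustration, and the real model adds MLPs and feature interaction on top.

```python
from typing import List, Sequence

Table = Sequence[Sequence[float]]  # one embedding table: rows of equal length


def embedding_bytes(num_tables: int, rows_per_table: int, dim: int,
                    bytes_per_value: int = 4) -> int:
    """Rough memory footprint of equally sized embedding tables.

    rows_per_table and bytes_per_value are hypothetical inputs; the slides
    give only the table count (100) and the vector dimension (64).
    """
    return num_tables * rows_per_table * dim * bytes_per_value


def lookup_sum(table: Table, indices: Sequence[int]) -> List[float]:
    """Gather the rows named by one sparse feature and sum them (assumed pooling)."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for i in indices:
        row = table[i]
        for d in range(dim):
            pooled[d] += row[d]
    return pooled


def build_input(dense: Sequence[float], tables: Sequence[Table],
                sparse: Sequence[Sequence[int]]) -> List[float]:
    """Concatenate dense values with one pooled vector per sparse feature."""
    out = list(dense)
    for table, indices in zip(tables, sparse):
        out.extend(lookup_sum(table, indices))
    return out


if __name__ == "__main__":
    # Two tiny tables with 3-dimensional vectors, one dense value.
    t0 = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
    t1 = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
    print(build_input([0.5], [t0, t1], [[0, 2], [1]]))
    print(embedding_bytes(num_tables=100, rows_per_table=1_000_000, dim=64))
```

Each lookup is a random read into a large table with no writes, which is the read-heavy access pattern slide 9 points to when placing embedding tables in DRAM + PMEM.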