At the Virtual HPC User Forum Special Event, the CEO of MemVerge introduces the company and provides an overview of Big Memory Computing and the Memory Machine software.
4. Data Has Become Big & Fast
Capital Markets
3D Animation
Oil & Gas
Big Data Analytics
Virtual Servers
AI/ML Inference
Demanding Memory-Centric Infrastructure
[Chart: WW Real-Time Data Share, 2015-2024, IDC — real-time data volume (PB) and the share of real-time data within the Global Datasphere (%), both rising from 2015 through 2024.]
5. The Rise of Big Memory Computing
Apps run in DRAM and PMEM, managed by Big Memory Software (App → CPU → DRAM + PMEM).
Pros:
• Fast
• High-Capacity
• Low-Cost
• Non-Volatile
[Chart: Byte-Addressable PMEM Revenue, IDC ($M) — growing from 2019 to $2.6B in 2023, a 248% CAGR over 2019-2023.]
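A quick sanity check on the forecast's arithmetic. The implied 2019 base is our own back-calculation from the slide's $2.6B and 248% CAGR figures, not a number IDC states:

```python
# Sanity-check the slide's numbers: $2.6B in 2023 at a 248% CAGR from 2019
# implies a 2019 byte-addressable PMEM market of roughly $18M.
cagr = 2.48            # 248% compound annual growth rate
rev_2023 = 2.6e9       # $2.6B, 2023 forecast
years = 4              # 2019 -> 2023
rev_2019 = rev_2023 / (1 + cagr) ** years
print(f"implied 2019 revenue: ${rev_2019 / 1e6:.1f}M")  # -> $17.7M
```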
6. Memory Machine™: World’s First
Big Memory Software
Memory Machine™ Platform
DRAM
Bigger Memory at Lower Cost
without Performance Compromise
• Up to 9TB memory/2-way server
• 30-50% Memory Cost Savings
• DRAM-level performance
Persistence On-demand
• ZeroIO™ In-Memory Snapshot
• Fast Crash Recovery
• Thin-Clones
No Application Change!
8. Inference with Large Models and Feature Embeddings
• Motivation
o Large model and embedding table sizes: models reach the GB level, embedding tables the TB level
o Multiple models served on a single server
o Online inference is a real-time, low-latency service: results must return within tens of ms
• Ideal solution
o Put the models and embedding tables in DRAM
• Limitations
o High TCO
o Limited DRAM capacity
o DRAM is volatile
9. Inference on Memory Machine
• Our solution
o Keep models and embedding tables in DRAM + PMEM
• Benefits
o Big memory can hold all the embedding tables on one server
o Read performance similar to DRAM, well suited to read-heavy scenarios such as online inference
o Data persistence on PMEM
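Memory Machine itself requires no application change, but the mechanics of the scheme above can be sketched generically: PMEM is typically exposed through a DAX filesystem, so a memory-mapped file stands in for an embedding table that lives on persistent memory. The file path and sizes here are illustrative assumptions (a temp file substitutes for a PMEM mount so the sketch runs anywhere):

```python
import os
import tempfile
import numpy as np

# On real hardware this file would sit on a DAX-mounted PMEM filesystem;
# a temp file substitutes here so the sketch is runnable anywhere.
path = os.path.join(tempfile.gettempdir(), "emb_table.dat")
n_rows, dim = 100_000, 64  # 64 matches the embedding width in the evaluation

# Publish: write the table once; flush() persists it (on PMEM it would
# survive a process restart without any reload).
table = np.memmap(path, dtype=np.float32, mode="w+", shape=(n_rows, dim))
table[:] = np.random.rand(n_rows, dim).astype(np.float32)
table.flush()

# Serve: map read-only; each lookup is a plain memory load, not a read() call.
serving = np.memmap(path, dtype=np.float32, mode="r", shape=(n_rows, dim))
sparse_indices = [3, 17, 42]          # indices from a sparse feature
vectors = serving[sparse_indices]     # (3, 64) gather from mapped memory
```

Because the gathers are ordinary loads, the read-heavy inference path sees near-DRAM latency while the table itself is persistent and larger than DRAM allows.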
10. Example 1: Facebook’s DLRM
• Deep learning recommendation model for personalization and recommendation systems
• Consists of dense and sparse features
• Dense feature: a vector of floating-point values
• Sparse feature: a list of sparse indices into embedding tables
• Open source: https://github.com/facebookresearch/dlrm
M. Naumov et al., "Deep Learning Recommendation Model for Personalization and Recommendation Systems," 2019. https://arxiv.org/abs/1906.00091
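A structure-only NumPy sketch of the DLRM shape described above: a bottom MLP for dense features, per-table embedding lookups for sparse features, pairwise dot-product interactions, and a top layer producing a click probability. Table counts, sizes, and weights here are toy values with untrained random parameters; the real model lives in the repo linked above:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64  # embedding vector width, matching the evaluation setup

# Three toy embedding tables (a real DLRM uses many more, far larger ones).
tables = [rng.standard_normal((1000, dim)).astype(np.float32) for _ in range(3)]
# Untrained weights, scaled so the final sigmoid stays numerically tame.
w_bot = (rng.standard_normal((512, dim)) / np.sqrt(512)).astype(np.float32)
w_top = (rng.standard_normal(dim + 6) / np.sqrt(dim + 6)).astype(np.float32)

def dlrm_forward(dense, sparse_idx):
    """Structure-only DLRM forward: bottom MLP, embedding lookups,
    pairwise feature interactions, top layer -> click probability."""
    d = np.maximum(dense @ w_bot, 0.0)          # bottom MLP (single ReLU layer)
    embs = [tables[i][j] for i, j in enumerate(sparse_idx)]  # sparse lookups
    feats = np.stack([d] + embs)                # (4, dim)
    # Interactions: dot products between every pair of feature vectors.
    inter = (feats @ feats.T)[np.triu_indices(len(feats), k=1)]  # (6,)
    z = np.concatenate([d, inter])              # (dim + 6,)
    return 1.0 / (1.0 + np.exp(-(z @ w_top)))   # sigmoid

p = dlrm_forward(rng.standard_normal(512).astype(np.float32), [5, 42, 7])
```

The embedding lookups (`tables[i][j]`) are exactly the accesses that dominate memory footprint at TB scale, which is why they are the natural candidates for PMEM placement.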
11. Evaluation Setup
• Hardware:
• Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (112 cores)
• 192 GB DRAM, 1.5TB PMEM, 400GB NVMe SSD
• Software
• RHEL 8.2
• Memory Machine v1.0
• Latest DLRM framework
• Test cases: model + embeddings
o In-memory data sizes: 26GB / 52GB / 104GB / 192GB
o Features: 100 sparse features (100 embedding tables, embedding vector dimension 64), 512 dense features
o Measured: inference time for 20,480 records in one batch (Criteo dataset)
12. Example 1: DLRM Inference Performance
Inference time (ms) for one 20,480-record batch, by in-memory data size:

Data size    All DRAM    DRAM+PMEM    DRAM+NVMe
26GB            3,592        5,487      180,778
52GB            4,965        6,961      187,072
104GB           8,429        7,740      199,472
192GB         174,721        8,556      203,846
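The ratios are the interesting part of these results. The series pairing below is our reading of the chart's bar labels; the takeaway it shows is that DRAM+PMEM stays within roughly 2x of all-DRAM while data fits in DRAM, and becomes ~20x faster once the working set exceeds the server's 192GB of DRAM:

```python
# Inference time (ms) per 20,480-record batch, as read from this slide's chart.
sizes     = ["26GB", "52GB", "104GB", "192GB"]
dram      = [3592, 4965, 8429, 174721]
dram_pmem = [5487, 6961, 7740, 8556]
dram_nvme = [180778, 187072, 199472, 203846]

for s, d, p, n in zip(sizes, dram, dram_pmem, dram_nvme):
    print(f"{s}: DRAM+PMEM = {p / d:.2f}x DRAM time, "
          f"DRAM+NVMe = {n / p:.1f}x DRAM+PMEM time")
```

The 192GB all-DRAM case presumably spills past physical DRAM into swap, which is why it collapses to NVMe-like times while DRAM+PMEM degrades only gracefully.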
14. Persistent Memory for Instant Model Rollback/Recovery
• How do we improve the fault tolerance of publishing a new model?
o Pushing a new model into production is risky
o If it fails, revert to the last workable version ASAP
o Rollback/model reloading takes time for large models due to slow I/O
• Leveraging PMEM's persistence
o Take a snapshot of the model-serving application
o Restore a snapshot without reloading from disk or remote storage
o Snapshots can be published to many serving nodes via memory-to-memory snapshot replication
• Solution
o Instantaneous snapshots without interrupting online inference
o Instantaneous rollback without loading and publishing time
o Snapshot, rollback, and recovery each complete within 1 second
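Memory Machine's ZeroIO snapshots are application-transparent and implemented in PMEM; the class below is only a toy illustration of the rollback *semantics* (names and structure are our own): a snapshot captured in memory can be restored without touching disk or remote storage, so revert time does not scale with model size the way a reload does.

```python
import copy
import numpy as np

class ModelStore:
    """Toy stand-in for in-memory snapshot/rollback of serving state.
    A real PMEM snapshot is copy-on-write and near-instant; we deep-copy
    here purely to keep the sketch simple and self-contained."""

    def __init__(self, weights):
        self.weights = weights
        self._snapshots = {}

    def snapshot(self, tag):
        # Capture current serving state under a version tag.
        self._snapshots[tag] = copy.deepcopy(self.weights)

    def rollback(self, tag):
        # Restore from memory: no disk read, no model re-publish.
        self.weights = copy.deepcopy(self._snapshots[tag])

store = ModelStore({"emb": np.zeros(4)})
store.snapshot("v1")                   # before publishing a new model
store.weights["emb"] = np.ones(4)      # risky new version goes live
store.rollback("v1")                   # bad rollout: instant revert
```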
15. Summary
• Memory Machine provides
o Larger and cheaper heterogeneous memory for faster inference
o Persistent memory for instant model snapshot and recovery
o No application change needed
• Humans reason entirely from memory
o So will machine learning in the era of Big Memory
16. Big Memory Software Will Be a $10B+ Market
Today's tiers: Compute → Memory → Performance Storage → Capacity Storage
With Big Memory: Compute → Big Memory → Capacity Storage