SlideShare a Scribd company logo
1 of 44
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 1
NTT Confidential
Challenges for Implementing PMEM Aware
Application with PMDK
Yoshimi Ichiyanagi
NTT Software Innovation Center
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 2
NTT Confidential
Outline
1. Introduction
2. Background and Motivation
3. How to use PMEM
4. Challenges for implementing PMEM aware
applications
5. Challenges for performance evaluation to get
valid results
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 3
NTT Confidential
Outline
1. Introduction
2. Background and Motivation
3. How to use PMEM
4. Challenges for implementing PMEM aware
applications
5. Challenges for performance evaluation to get
valid results
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 4
NTT Confidential
Introduction
 Software Innovation Center is part of NTT Laboratories
 Japanese telecommunication company
 Worked on system software
 Distributed file system (HDFS)
 Operating system (Linux kernel)
 Trying to rewrite open-source software using new storage
and new library
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 5
NTT Confidential
Outline
1. Introduction
2. Background and Motivation
3. How to use PMEM
4. Challenges for implementing PMEM aware
applications
5. Challenges for performance evaluation to get
valid results
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 6
NTT Confidential
Background
 Persistent memory (PMEM) begins to be supplied
 NVDIMM-N
 Intel® Optane™ DC Persistent Memory
 PMEM features are:
 Memory-like features
 Low-latency
 Byte-addressable
 Storage-like features
 large-capacity
 non-volatile
DRAM
Persistent memory
(PMEM)
Solid state disk (SSD)
Hard Disk (HDD)
Low
High
Small
Large
CapacityLatency
CPU
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 7
NTT Confidential
Motivation
 Trying to rewrite storage applications since PMEM
features are utilized
 RDBMS (e.g. PostgreSQL)
 Message queue systems (e.g. Apache Kafka)
 etc.
 Let me share my challenges about PMEM
 Implementation
 Performance evaluation
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 8
NTT Confidential
Outline
1. Introduction
2. Background and Motivation
3. How to use PMEM
4. Challenges for implementing PMEM aware
applications
5. Challenges for performance evaluation to get
valid results
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 9
NTT Confidential
HW/SW components
 What kind of PMEM is used?
 NVDIMM-N
 What Linux kernel support is necessary to use PMEM?
 Direct-Access for files (DAX)
 Supported by ext4 and xfs
 What user library is used?
 Persistent Memory Development Kit (PMDK)
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 10
NTT Confidential
DAX FS and PMDK
Memory-mapped file
Application
PMEM (NVDIMM-N)
Library (PMDK)
DAX FS
Page Cache
Traditional FS
Application Application
User
Kernel
File I/O
APIs
File I/O
APIs CPU instructions
Context
Switch
Context
Switch
HW
Access like memoryAccess like storageAccess like storage
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 11
NTT Confidential
Benefits of DAX FS and PMDK
 With DAX FS only
 Not necessary to rewrite applications
 With DAX FS and PMDK
 Necessary to rewrite applications
 Performance of I/O-intensive workload is
greatly improved
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 12
NTT Confidential
PMDK features
 Application accesses PMEM like memory
 Memory-mapped file (mmap file)
 Application developers can select fine-grained sync size
 Details are on the next slide
 CPU instructions suitable for copy data size are selected
 8 / 16 / 32 / 64 bytes registers
 MOVNT & SFENCE
 without CPU caches
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 13
NTT Confidential
PMDK sync function
 pmem_msync()
 File metadata and written data are flushed
 pmem_msync() calls msync syscall
 msync syscall is general sync API for mmap file
 pmem_drain()
 Only written data is flushed
 pmem_drain() is faster than pmem_msync()
pmem_msync() Pmem_drain()
Durability 〇
(Flush file metadata and written data)
△
(Flush only written data)
Performance × 〇
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 14
NTT Confidential
Implementing application with PMDK
 Trying to rewrite storage applications with PMDK
 RDBMS – PostgreSQL (PG)
 Message queue systems - Apache Kafka
 Let me share know-how gained by rewriting PG
 Checkpoint file
 Many writes occur during checkpoint
 Write ahead logging (WAL)
 Critical for transaction performance
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 15
NTT Confidential
Outline
1. Introduction
2. Background and Motivation
3. How to use PMEM
4. Challenges for implementing PMEM aware
applications
5. Challenges for performance evaluation to get
valid results
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 16
NTT Confidential
How to rewrite PG with PMDK
 Replace standard file I/O with mmap I/O
 Standard file I/O
read syscalls, write syscalls and so on
 Mmap I/O
PMDK APIs
PMEMHDD/SS
D
PG’s working
memory
Standard file I/O mmap I/O
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 17
NTT Confidential
4. Challenges for implementation
1. How to resize checkpoint file
2. How to select sync function for WAL
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 18
NTT Confidential
4. Challenges for implementation
1. How to resize checkpoint file
2. How to select sync function for WAL
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 19
NTT Confidential
Resizing checkpoint file
 Huge table and so forth consist of multiple checkpoint
files
 Variable length up to 1GB
 Necessary to resizing checkpoint file
 PMDK provides APIs for mmap file
 Difficult to resize mmap file without overhead
 Best practice is to access only fixed-size file
 We changed how to access only 1GB checkpoint file
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 20
NTT Confidential
How to resize mmap file
 Enlarge file
 Remap file
Enlarge
file
Remap
file
file
Enlarge
File
mappin
g
Only part of file can be
accessed with PMDK
Virtual address space
file
File
mappin
g
Whole file can be
accessed with PMDK
Virtual address space
file
File
mappin
g
Virtual address space
file
File
mappin
g
Virtual address space
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 21
NTT Confidential
Implementation to enlarge file
DAX FS DAX FS and PMDK
Open fd = open(path, ...); addr1 = pmem_map_file
(path, len1, ...);
Unmap pmem_unmap(addr1, len1);
Enlarge truncate(path, len2);
Remap addr2 = pmem_map_file
(path, len2, ...);
Close close(fd); pmem_unmap(addr2, len2);
 3 function calls are added to use PMDK
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 22
NTT Confidential
Implementation to shrink file
DAX FS DAX FS and PMDK
Open fd = open(path, ...); addr1 = pmem_map_file
(path, len1, ...);
Unmap pmem_unmap(addr1, len1)
Shrink ftruncate(fd, len2); truncate(path, len2);
Remap addr2 = pmem_map_file
(path, len2, ...);
Close close(fd); pmem_unmap(addr2, len2);
 2 function calls are added to use PMDK
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 23
NTT Confidential
Resizing file with PMDK
 Difficult to use PMDK unless file size is fixed
 Repeating remapping many times degrades
performance
Remapping file has large overhead
 By using PMDK, munmap()/close()/open()/mmap() syscall is
called again
 Mapping large files may make file system full
 Best practice is to use only fixed-size file
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 24
NTT Confidential
4. Challenges for implementation
1. How to resize checkpoint file
2. How to select sync function for WAL
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 25
NTT Confidential
Selecting sync function for WAL
 How to write log to WAL file
 Initialization – create file and fill file with zero
 Necessary to flush file metadata
 Synchronous logging - sequential synchronous write
 Necessary to flush only written data
 PMDK sync APIs
 pmem_msync() – file metadata and written data are
flushed
 pmem_drain() – only written data is flushed
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 26
NTT Confidential
PMDK sync function for WAL
 Initialization of WAL file
 Necessary to flush WAL file metadata
 pmem_msync()
 Synchronous logging
 Necessary to flush only written data
 pmem_drain() or pmem_msync()
 pmem_drain() is faster than pmem_msync()
 We selected pmem_drain()
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 27
NTT Confidential
Comparing PMDK sync functions
 I ran microbenchmark to compare pmem_msync()
to pmem_drain()
 Preprocessing
WAL initialization
Create file and fill file with zero and flush file metadata
 Measurement processing
Synchronous logging
Overwrite data and flush written data
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 28
NTT Confidential
Microbenchmark
1. DAX FS
fdatasync()
2. DAX FS and PMDK
pmem_msync()
3. DAX FS and PMDK
pmem_drain()
Open open() pmem_map_file() pmem_map_file()
while () { while () { while() {
Write write() pmem_memcpy_nodrain() pmem_memcpy_nodrain()
Sync fdatasync() pmem_msync() pmem_drain()
Loop N } } }
Close close() pmem_unmap() pmem_unmap()
 Synchronous logging
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 29
NTT Confidential
Evaluation setup
32 GB
DRAM
32 GB
DRAM
CPU
CPU
Node0
Node1
Running
benchmarks
NUMA node
48 GB
PMEM
Hardware
CPU E5-2667 v4 x 2 (8 cores per node)
DRAM [Node0/1] 32 GB each
PMEM (NVDIMM-N) [Node1] 48 GB (HPE 8GB NVDIMM x
6)
Software
Distro Ubuntu 16.04
Linux kernel 4.17. 9*
PMDK 1.4.1
Filesystem ext4 (DAX available)
*: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 30
NTT Confidential
Performance evaluation - microbenchmark
 pmem_drain() is fastest of 3 patterns as expected
• Total written data is 10 GB
• Block is 8 KB
[GB/s]
15 GB/s
4.3 GB/s
0.0023 GB/s
DAX FS
fdatasync()
DAX FS and PMDK
pmem_msync()
DAX FS and PMDK
pmem_drain()
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 31
NTT Confidential
PMDK sync functions
 pmem_drain() greatly improves performance of I/O-intensive
workload
 You should use pmem_drain() with caution
 pmem_drain() can’t flush file metadata
 pmem_msync() should be called by application that uses file
metadata such as
 Time of last modification
 Time of last access
 pmem_drain() doesn’t work without
pmem_memcpy_nodrain()
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 32
NTT Confidential
Outline
1. Introduction
2. Background and Motivation
3. How to use PMEM
4. Challenges for implementing PMEM aware
applications
5. Challenges for performance evaluation to get
valid results
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 33
NTT Confidential
5. Challenges for performance evaluation
 Difficult to get valid results in PG performance
evaluation
 What is valid result?
 Avoid Non-Uniform Memory Access (NUMA)
effects
 Avoid CPUs becoming hotspots
tuning application for PMEM
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 34
NTT Confidential
5. Challenges for performance evaluation
1. NUMA effect
2. Tuning application for PMEM
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 35
NTT Confidential
5. Challenges for performance evaluation
1. NUMA effect
2. Tuning application for PMEM
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 36
NTT Confidential
NUMA effect
 Synchronous write is about 1.5 times faster on
local NUMA node than on remote NUMA node
NUMA node 32 GiB
DRAM
32 GiB
DRAM
CPU
CPU
Node0
Node1
48 GiB
PMEM
Synchronous write
15 GB/s
11 GB/s
Local node Remote node
X1.5
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 37
NTT Confidential
5. Challenges for performance evaluation
1. NUMA effect
2. Tuning application for PMEM
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 38
NTT Confidential
Tuning application for PMEM
 Important to avoid calculation processing
becoming hotspot
 Better to use Stored Procedure in PG
Stored Procedure improves PG performance since
user-defined functions are pre-compiled and stored in
PG server
pgbench -c 16 -j 16 -T 1800 -r [db_name] -M
prepared
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 39
NTT Confidential
Evaluation setup
32 GB
DRAM
32 GB
DRAM
CPU
CPU
Node0
Node1
PG server
NUMA node
48 GB
PMEM
Hardware
CPU E5-2667 v4 x 2 (8 cores per node)
DRAM [Node0/1] 32 GB each
PMEM (NVDIMM-
N)
[Node1] 48 GB (HPE 8GB NVDIMM x 6)
SSD Intel® Optane™ SSD DC P4800X Series
750GB
Software
Distro Ubuntu 16.04
Linux kernel 4.17. 9
PMDK 1.4.1
Filesystem ext4 (DAX available)
PostgreSQL 10.4 Beta**: https://www.postgresql.org/message-id/C20D38E97BCB33DAD59E3A1%40lab.ntt.co.jp
750 GB
SSD
PG clients
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 40
NTT Confidential
Stored Procedure
 Improvement ratio using PMEM is improved by
12% compared with using SSD
Intel® Optane™ SSD
DC P4800X Series 750GB
PMEM
with DAX FS + PMDK
Without Stored
Procedure
With Stored
Procedure
18,396 tps
29,125 tps
x1.58
36,449 tps
21,406 tps
x1.70
[tps]
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 41
NTT Confidential
Conclusion
 Difficult to implement PMEM aware applications
 Resizing file with overhead
 Best practice is to access only fixed-size file
 Selecting inappropriate sync function seriously
degrades performance or durability
 Difficult to get valid results in performance evaluation
 Avoiding NUMA effect
 Avoiding CPUs becoming hotspots
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 42
NTT Confidential
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 43
NTT Confidential
pmem_msync() and pmem_drain()
 pmem_msync() - file metadata and dirty page are flushed
 Valid sizes of page are 4KB, 2MB, and 1GB
 pmem_drain() - only written data is flushed
File
metadata
Written
data
File
pmem_drain()
Written
data
PMEM
File
metadata
File
pmem_msync()
File
metadata
Written
data
PMEM
Dirty
page
Written
data
2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 44
NTT Confidential
Reducing number of times to copy data
 Performance may be improved by reducing
number of copies
Temporary buffer
PMEM
HDD/SS
D
Application’s working
memory
For Sequential access
and
buffering(writing bulk
data)
Random I/O is
significantly slower than
sequential I/O.
Random I/O is as fast as
sequential I/O.
Syscalls
PDMK
APIs

More Related Content

What's hot

Scaling the Container Dataplane
Scaling the Container Dataplane Scaling the Container Dataplane
Scaling the Container Dataplane Michelle Holley
 
OCP U.S. Summit 2017 Presentation
OCP U.S. Summit 2017 PresentationOCP U.S. Summit 2017 Presentation
OCP U.S. Summit 2017 PresentationNetronome
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Summit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - Cavium
Summit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - CaviumSummit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - Cavium
Summit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - CaviumOPNFV
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
Fueling the datasphere how RISC-V enables the storage ecosystem
Fueling the datasphere   how RISC-V enables the storage ecosystemFueling the datasphere   how RISC-V enables the storage ecosystem
Fueling the datasphere how RISC-V enables the storage ecosystemRISC-V International
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
 
Data Plane and VNF Acceleration Mini Summit
Data Plane and VNF Acceleration Mini Summit Data Plane and VNF Acceleration Mini Summit
Data Plane and VNF Acceleration Mini Summit Open-NFP
 
Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...
Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...
Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...KTN
 
Enabling accelerated networking - seminar by Enea at the Embedded Conference ...
Enabling accelerated networking - seminar by Enea at the Embedded Conference ...Enabling accelerated networking - seminar by Enea at the Embedded Conference ...
Enabling accelerated networking - seminar by Enea at the Embedded Conference ...EneaSoftware
 
Netronome Corporate Brochure
Netronome Corporate BrochureNetronome Corporate Brochure
Netronome Corporate BrochureNetronome
 
Data Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and FlexibilityData Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and FlexibilityAPNIC
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Michelle Holley
 
A Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksA Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksRyousei Takano
 
BXI: Bull eXascale Interconnect
BXI: Bull eXascale InterconnectBXI: Bull eXascale Interconnect
BXI: Bull eXascale Interconnectinside-BigData.com
 
Intel® Ethernet Update
Intel® Ethernet Update Intel® Ethernet Update
Intel® Ethernet Update Michelle Holley
 

What's hot (20)

Scaling the Container Dataplane
Scaling the Container Dataplane Scaling the Container Dataplane
Scaling the Container Dataplane
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 
OCP U.S. Summit 2017 Presentation
OCP U.S. Summit 2017 PresentationOCP U.S. Summit 2017 Presentation
OCP U.S. Summit 2017 Presentation
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Summit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - Cavium
Summit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - CaviumSummit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - Cavium
Summit 16: ARM Mini-Summit - Efficient NFV solutions for Cloud and Edge - Cavium
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
Fueling the datasphere how RISC-V enables the storage ecosystem
Fueling the datasphere   how RISC-V enables the storage ecosystemFueling the datasphere   how RISC-V enables the storage ecosystem
Fueling the datasphere how RISC-V enables the storage ecosystem
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
Data Plane and VNF Acceleration Mini Summit
Data Plane and VNF Acceleration Mini Summit Data Plane and VNF Acceleration Mini Summit
Data Plane and VNF Acceleration Mini Summit
 
DOME 64-bit μDataCenter
DOME 64-bit μDataCenterDOME 64-bit μDataCenter
DOME 64-bit μDataCenter
 
Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...
Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...
Fast, Scalable Quantized Neural Network Inference on FPGAs with FINN and Logi...
 
NFV Open Source projects
NFV Open Source projectsNFV Open Source projects
NFV Open Source projects
 
Enabling accelerated networking - seminar by Enea at the Embedded Conference ...
Enabling accelerated networking - seminar by Enea at the Embedded Conference ...Enabling accelerated networking - seminar by Enea at the Embedded Conference ...
Enabling accelerated networking - seminar by Enea at the Embedded Conference ...
 
Netronome Corporate Brochure
Netronome Corporate BrochureNetronome Corporate Brochure
Netronome Corporate Brochure
 
Data Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and FlexibilityData Plane Evolution: Towards Openness and Flexibility
Data Plane Evolution: Towards Openness and Flexibility
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
 
A Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center NetworksA Look Inside Google’s Data Center Networks
A Look Inside Google’s Data Center Networks
 
BXI: Bull eXascale Interconnect
BXI: Bull eXascale InterconnectBXI: Bull eXascale Interconnect
BXI: Bull eXascale Interconnect
 
Intel® Ethernet Update
Intel® Ethernet Update Intel® Ethernet Update
Intel® Ethernet Update
 

Similar to Challenges for Implementing PMEM Aware Application with PMDK

In Network Computing Prototype Using P4 at KSC/KREONET 2019
In Network Computing Prototype Using P4 at KSC/KREONET 2019In Network Computing Prototype Using P4 at KSC/KREONET 2019
In Network Computing Prototype Using P4 at KSC/KREONET 2019Kentaro Ebisawa
 
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverNGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverScott Shadley, MBA,PMC-III
 
OSMC 2018 | Why we recommend PMM to our clients by Matthias Crauwels
OSMC 2018 | Why we recommend PMM to our clients by Matthias CrauwelsOSMC 2018 | Why we recommend PMM to our clients by Matthias Crauwels
OSMC 2018 | Why we recommend PMM to our clients by Matthias CrauwelsNETWAYS
 
Iperconvergenza come migliora gli economics del tuo IT
Iperconvergenza come migliora gli economics del tuo ITIperconvergenza come migliora gli economics del tuo IT
Iperconvergenza come migliora gli economics del tuo ITNetApp
 
Genomics Deployments - How to Get Right with Software Defined Storage
 Genomics Deployments -  How to Get Right with Software Defined Storage Genomics Deployments -  How to Get Right with Software Defined Storage
Genomics Deployments - How to Get Right with Software Defined StorageSandeep Patil
 
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18Pawel Maczka
 
P4_tutorial.pdf
P4_tutorial.pdfP4_tutorial.pdf
P4_tutorial.pdfPramodhN3
 
S104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bS104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bTony Pearson
 
Make your PySpark Data Fly with Arrow!
Make your PySpark Data Fly with Arrow!Make your PySpark Data Fly with Arrow!
Make your PySpark Data Fly with Arrow!Databricks
 
Container Attached Storage (CAS) with OpenEBS - SDC 2018
Container Attached Storage (CAS) with OpenEBS -  SDC 2018Container Attached Storage (CAS) with OpenEBS -  SDC 2018
Container Attached Storage (CAS) with OpenEBS - SDC 2018OpenEBS
 
Application Modernization with PKS / Kubernetes
Application Modernization with PKS / KubernetesApplication Modernization with PKS / Kubernetes
Application Modernization with PKS / KubernetesPaul Czarkowski
 
Persistent Memory Programming: The Current State of the Ecosystem
Persistent Memory Programming: The Current State of the EcosystemPersistent Memory Programming: The Current State of the Ecosystem
Persistent Memory Programming: The Current State of the Ecosysteminside-BigData.com
 
Backend Cloud Storage Access in Video Streaming
Backend Cloud Storage Access in Video StreamingBackend Cloud Storage Access in Video Streaming
Backend Cloud Storage Access in Video StreamingRufael Mekuria
 
Developer insight into why applications run amazingly Fast in CF 2018
Developer insight into why applications run amazingly Fast in CF 2018Developer insight into why applications run amazingly Fast in CF 2018
Developer insight into why applications run amazingly Fast in CF 2018Pavan Kumar
 
Optimizing your SparkML pipelines using the latest features in Spark 2.3
Optimizing your SparkML pipelines using the latest features in Spark 2.3Optimizing your SparkML pipelines using the latest features in Spark 2.3
Optimizing your SparkML pipelines using the latest features in Spark 2.3DataWorks Summit
 
NetApp Fabric Pool Deck
NetApp Fabric Pool DeckNetApp Fabric Pool Deck
NetApp Fabric Pool DeckAlex Tsui
 
Bridging Your Business Across the Enterprise and Cloud with MongoDB and NetApp
Bridging Your Business Across the Enterprise and Cloud with MongoDB and NetAppBridging Your Business Across the Enterprise and Cloud with MongoDB and NetApp
Bridging Your Business Across the Enterprise and Cloud with MongoDB and NetAppMongoDB
 
Inteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for CodeInteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for CodeLuciano Resende
 

Similar to Challenges for Implementing PMEM Aware Application with PMDK (20)

In Network Computing Prototype Using P4 at KSC/KREONET 2019
In Network Computing Prototype Using P4 at KSC/KREONET 2019In Network Computing Prototype Using P4 at KSC/KREONET 2019
In Network Computing Prototype Using P4 at KSC/KREONET 2019
 
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in VacouverNGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
NGD Systems and Microsoft Keynote Presentation at IPDPS MPP in Vacouver
 
OSMC 2018 | Why we recommend PMM to our clients by Matthias Crauwels
OSMC 2018 | Why we recommend PMM to our clients by Matthias CrauwelsOSMC 2018 | Why we recommend PMM to our clients by Matthias Crauwels
OSMC 2018 | Why we recommend PMM to our clients by Matthias Crauwels
 
Iperconvergenza come migliora gli economics del tuo IT
Iperconvergenza come migliora gli economics del tuo ITIperconvergenza come migliora gli economics del tuo IT
Iperconvergenza come migliora gli economics del tuo IT
 
Genomics Deployments - How to Get Right with Software Defined Storage
 Genomics Deployments -  How to Get Right with Software Defined Storage Genomics Deployments -  How to Get Right with Software Defined Storage
Genomics Deployments - How to Get Right with Software Defined Storage
 
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
 
P4_tutorial.pdf
P4_tutorial.pdfP4_tutorial.pdf
P4_tutorial.pdf
 
S104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bS104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809b
 
Make your PySpark Data Fly with Arrow!
Make your PySpark Data Fly with Arrow!Make your PySpark Data Fly with Arrow!
Make your PySpark Data Fly with Arrow!
 
Container Attached Storage (CAS) with OpenEBS - SDC 2018
Container Attached Storage (CAS) with OpenEBS -  SDC 2018Container Attached Storage (CAS) with OpenEBS -  SDC 2018
Container Attached Storage (CAS) with OpenEBS - SDC 2018
 
Application Modernization with PKS / Kubernetes
Application Modernization with PKS / KubernetesApplication Modernization with PKS / Kubernetes
Application Modernization with PKS / Kubernetes
 
Persistent Memory Programming: The Current State of the Ecosystem
Persistent Memory Programming: The Current State of the EcosystemPersistent Memory Programming: The Current State of the Ecosystem
Persistent Memory Programming: The Current State of the Ecosystem
 
Backend Cloud Storage Access in Video Streaming
Backend Cloud Storage Access in Video StreamingBackend Cloud Storage Access in Video Streaming
Backend Cloud Storage Access in Video Streaming
 
Developer insight into why applications run amazingly Fast in CF 2018
Developer insight into why applications run amazingly Fast in CF 2018Developer insight into why applications run amazingly Fast in CF 2018
Developer insight into why applications run amazingly Fast in CF 2018
 
Optimizing your SparkML pipelines using the latest features in Spark 2.3
Optimizing your SparkML pipelines using the latest features in Spark 2.3Optimizing your SparkML pipelines using the latest features in Spark 2.3
Optimizing your SparkML pipelines using the latest features in Spark 2.3
 
Poc exadata 2018
Poc exadata 2018Poc exadata 2018
Poc exadata 2018
 
NetApp Fabric Pool Deck
NetApp Fabric Pool DeckNetApp Fabric Pool Deck
NetApp Fabric Pool Deck
 
Bridging Your Business Across the Enterprise and Cloud with MongoDB and NetApp
Bridging Your Business Across the Enterprise and Cloud with MongoDB and NetAppBridging Your Business Across the Enterprise and Cloud with MongoDB and NetApp
Bridging Your Business Across the Enterprise and Cloud with MongoDB and NetApp
 
Inteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for CodeInteligencia artificial, open source e IBM Call for Code
Inteligencia artificial, open source e IBM Call for Code
 
Arkena IMF case study
Arkena IMF case studyArkena IMF case study
Arkena IMF case study
 

More from NTT Software Innovation Center

A Global Data Infrastructure for Data Sharing Between Businesses
A Global Data Infrastructure for Data Sharing Between BusinessesA Global Data Infrastructure for Data Sharing Between Businesses
A Global Data Infrastructure for Data Sharing Between BusinessesNTT Software Innovation Center
 
企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤NTT Software Innovation Center
 
企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤NTT Software Innovation Center
 
Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...
Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...
Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...NTT Software Innovation Center
 
2-in-1 Cluster Integration: Batch and Interactive GPU Computing
2-in-1 Cluster Integration: Batch and Interactive GPU Computing2-in-1 Cluster Integration: Batch and Interactive GPU Computing
2-in-1 Cluster Integration: Batch and Interactive GPU ComputingNTT Software Innovation Center
 
Hybrid Sourcing for Overcoming “Digital Cliff 2025”
Hybrid Sourcing for Overcoming “Digital Cliff 2025”Hybrid Sourcing for Overcoming “Digital Cliff 2025”
Hybrid Sourcing for Overcoming “Digital Cliff 2025”NTT Software Innovation Center
 
データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~
データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~
データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~NTT Software Innovation Center
 
Network Implosion: Effective Model Compression for ResNets via Static Layer P...
Network Implosion: Effective Model Compression for ResNets via Static Layer P...Network Implosion: Effective Model Compression for ResNets via Static Layer P...
Network Implosion: Effective Model Compression for ResNets via Static Layer P...NTT Software Innovation Center
 
Why and how Edge Computing matters enterprise IT strategy
Why and how Edge Computing matters enterprise IT strategyWhy and how Edge Computing matters enterprise IT strategy
Why and how Edge Computing matters enterprise IT strategyNTT Software Innovation Center
 
外部キー制約を考慮した特徴量削減手法
外部キー制約を考慮した特徴量削減手法外部キー制約を考慮した特徴量削減手法
外部キー制約を考慮した特徴量削減手法NTT Software Innovation Center
 
デジタルサービスプラットフォーム実現に向けた技術課題
デジタルサービスプラットフォーム実現に向けた技術課題デジタルサービスプラットフォーム実現に向けた技術課題
デジタルサービスプラットフォーム実現に向けた技術課題NTT Software Innovation Center
 
Building images efficiently and securely on Kubernetes with BuildKit
Building images efficiently and securely on Kubernetes with BuildKitBuilding images efficiently and securely on Kubernetes with BuildKit
Building images efficiently and securely on Kubernetes with BuildKitNTT Software Innovation Center
 
Real-time spatiotemporal data utilization for future mobility services
Real-time spatiotemporal data utilization for future mobility servicesReal-time spatiotemporal data utilization for future mobility services
Real-time spatiotemporal data utilization for future mobility servicesNTT Software Innovation Center
 
【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組
【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組
【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組NTT Software Innovation Center
 
統合ログ分析技術Lognosisと運用ログ分析の取組
統合ログ分析技術Lognosisと運用ログ分析の取組統合ログ分析技術Lognosisと運用ログ分析の取組
統合ログ分析技術Lognosisと運用ログ分析の取組NTT Software Innovation Center
 
OpenStack Swiftとそのエコシステムの最新動向
OpenStack Swiftとそのエコシステムの最新動向OpenStack Swiftとそのエコシステムの最新動向
OpenStack Swiftとそのエコシステムの最新動向NTT Software Innovation Center
 

More from NTT Software Innovation Center (20)

A Global Data Infrastructure for Data Sharing Between Businesses
A Global Data Infrastructure for Data Sharing Between BusinessesA Global Data Infrastructure for Data Sharing Between Businesses
A Global Data Infrastructure for Data Sharing Between Businesses
 
企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤
 
企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤企業間データ流通のための国際データ基盤
企業間データ流通のための国際データ基盤
 
不揮発WALバッファ
不揮発WALバッファ不揮発WALバッファ
不揮発WALバッファ
 
企業間データ流通のための国際基盤
企業間データ流通のための国際基盤企業間データ流通のための国際基盤
企業間データ流通のための国際基盤
 
企業間データ流通のための国際基盤
企業間データ流通のための国際基盤企業間データ流通のための国際基盤
企業間データ流通のための国際基盤
 
Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...
Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...
Hybrid Computing Platform for Combinatorial Optimization with the Coherent Is...
 
2-in-1 Cluster Integration: Batch and Interactive GPU Computing
2-in-1 Cluster Integration: Batch and Interactive GPU Computing2-in-1 Cluster Integration: Batch and Interactive GPU Computing
2-in-1 Cluster Integration: Batch and Interactive GPU Computing
 
Hybrid Sourcing for Overcoming “Digital Cliff 2025”
Hybrid Sourcing for Overcoming “Digital Cliff 2025”Hybrid Sourcing for Overcoming “Digital Cliff 2025”
Hybrid Sourcing for Overcoming “Digital Cliff 2025”
 
データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~
データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~
データ分析をビジネスに活かす!データ創出・活用から、分析、課題解決までのDX時代のデータ活用事例のご紹介 ~不揃いのデータとの格闘~
 
Network Implosion: Effective Model Compression for ResNets via Static Layer P...
Network Implosion: Effective Model Compression for ResNets via Static Layer P...Network Implosion: Effective Model Compression for ResNets via Static Layer P...
Network Implosion: Effective Model Compression for ResNets via Static Layer P...
 
Why and how Edge Computing matters enterprise IT strategy
Why and how Edge Computing matters enterprise IT strategyWhy and how Edge Computing matters enterprise IT strategy
Why and how Edge Computing matters enterprise IT strategy
 
外部キー制約を考慮した特徴量削減手法
外部キー制約を考慮した特徴量削減手法外部キー制約を考慮した特徴量削減手法
外部キー制約を考慮した特徴量削減手法
 
デジタルサービスプラットフォーム実現に向けた技術課題
デジタルサービスプラットフォーム実現に向けた技術課題デジタルサービスプラットフォーム実現に向けた技術課題
デジタルサービスプラットフォーム実現に向けた技術課題
 
Building images efficiently and securely on Kubernetes with BuildKit
Building images efficiently and securely on Kubernetes with BuildKitBuilding images efficiently and securely on Kubernetes with BuildKit
Building images efficiently and securely on Kubernetes with BuildKit
 
Real-time spatiotemporal data utilization for future mobility services
Real-time spatiotemporal data utilization for future mobility servicesReal-time spatiotemporal data utilization for future mobility services
Real-time spatiotemporal data utilization for future mobility services
 
【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組
【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組
【招待講演】ICM研究会 - 統合ログ分析技術Lognosisと運用ログ分析の取組
 
統合ログ分析技術Lognosisと運用ログ分析の取組
統合ログ分析技術Lognosisと運用ログ分析の取組統合ログ分析技術Lognosisと運用ログ分析の取組
統合ログ分析技術Lognosisと運用ログ分析の取組
 
MVSR Schedulerを作るための指針
MVSR Schedulerを作るための指針MVSR Schedulerを作るための指針
MVSR Schedulerを作るための指針
 
OpenStack Swiftとそのエコシステムの最新動向
OpenStack Swiftとそのエコシステムの最新動向OpenStack Swiftとそのエコシステムの最新動向
OpenStack Swiftとそのエコシステムの最新動向
 

Recently uploaded

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 

Recently uploaded (20)

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 

Challenges for Implementing PMEM Aware Application with PMDK

  • 1. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 1 NTT Confidential Challenges for Implementing PMEM Aware Application with PMDK Yoshimi Ichiyanagi NTT Software Innovation Center
  • 2. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 2 NTT Confidential Outline 1. Introduction 2. Background and Motivation 3. How to use PMEM 4. Challenges for implementing PMEM aware applications 5. Challenges for performance evaluation to get valid results
  • 3. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 3 NTT Confidential Outline 1. Introduction 2. Background and Motivation 3. How to use PMEM 4. Challenges for implementing PMEM aware applications 5. Challenges for performance evaluation to get valid results
  • 4. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 4 NTT Confidential Introduction  Software Innovation Center is part of NTT Laboratories  Japanese telecommunication company  Worked on system software  Distributed file system (HDFS)  Operating system (Linux kernel)  Trying to rewrite open-source software using new storage and new library
  • 5. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 5 NTT Confidential Outline 1. Introduction 2. Background and Motivation 3. How to use PMEM 4. Challenges for implementing PMEM aware applications 5. Challenges for performance evaluation to get valid results
  • 6. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 6 NTT Confidential Background  Persistent memory (PMEM) begins to be supplied  NVDIMM-N  Intel® Optane™ DC Persistent Memory  PMEM features are:  Memory-like features  Low-latency  Byte-addressable  Storage-like features  large-capacity  non-volatile DRAM Persistent memory (PMEM) Solid state disk (SSD) Hard Disk (HDD) Low High Small Large CapacityLatency CPU
  • 7. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 7 NTT Confidential Motivation  Trying to rewrite storage applications since PMEM features are utilized  RDBMS (e.g. PostgreSQL)  Message queue systems (e.g. Apache Kafka)  etc.  Let me share my challenges about PMEM  Implementation  Performance evaluation
  • 8. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 8 NTT Confidential Outline 1. Introduction 2. Background and Motivation 3. How to use PMEM 4. Challenges for implementing PMEM aware applications 5. Challenges for performance evaluation to get valid results
  • 9. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 9 NTT Confidential HW/SW components  What kind of PMEM is used?  NVDIMM-N  What Linux kernel support is necessary to use PMEM?  Direct-Access for files (DAX)  Supported by ext4 and xfs  What user library is used?  Persistent Memory Development Kit (PMDK)
  • 10. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 10 NTT Confidential DAX FS and PMDK Memory-mapped file Application PMEM (NVDIMM-N) Library (PMDK) DAX FS Page Cache Traditional FS Application Application User Kernel File I/O APIs File I/O APIs CPU instructions Context Switch Context Switch HW Access like memoryAccess like storageAccess like storage
  • 11. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 11 NTT Confidential Benefits of DAX FS and PMDK  With DAX FS only  Not necessary to rewrite applications  With DAX FS and PMDK  Necessary to rewrite applications  Performance of I/O-intensive workload is greatly improved
  • 12. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 12 NTT Confidential PMDK features  Application accesses PMEM like memory  Memory-mapped file (mmap file)  Application developers can select fine-grained sync size  Details are on the next slide  CPU instructions suitable for copy data size are selected  8 / 16 / 32 / 64 bytes registers  MOVNT & SFENCE  without CPU caches
  • 13. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 13 NTT Confidential PMDK sync function  pmem_msync()  File metadata and written data are flushed  pmem_msync() calls msync syscall  msync syscall is general sync API for mmap file  pmem_drain()  Only written data is flushed  pmem_drain() is faster than pmem_msync() pmem_msync() Pmem_drain() Durability 〇 (Flush file metadata and written data) △ (Flush only written data) Performance × 〇
  • 14. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 14 NTT Confidential Implementing application with PMDK  Trying to rewrite storage applications with PMDK  RDBMS – PostgreSQL (PG)  Message queue systems - Apache Kafka  Let me share know-how gained by rewriting PG  Checkpoint file  Many writes occur during checkpoint  Write ahead logging (WAL)  Critical for transaction performance
  • 15. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 15 NTT Confidential Outline 1. Introduction 2. Background and Motivation 3. How to use PMEM 4. Challenges for implementing PMEM aware applications 5. Challenges for performance evaluation to get valid results
  • 16. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 16 NTT Confidential How to rewrite PG with PMDK  Replace standard file I/O with mmap I/O  Standard file I/O read syscalls, write syscalls and so on  Mmap I/O PMDK APIs PMEMHDD/SS D PG’s working memory Standard file I/O mmap I/O
  • 17. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 17 NTT Confidential 4. Challenges for implementation 1. How to resize checkpoint file 2. How to select sync function for WAL
  • 18. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 18 NTT Confidential 4. Challenges for implementation 1. How to resize checkpoint file 2. How to select sync function for WAL
  • 19. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 19 NTT Confidential Resizing checkpoint file  Huge table and so forth consist of multiple checkpoint files  Variable length up to 1GB  Necessary to resizing checkpoint file  PMDK provides APIs for mmap file  Difficult to resize mmap file without overhead  Best practice is to access only fixed-size file  We changed how to access only 1GB checkpoint file
  • 20. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 20 NTT Confidential How to resize mmap file  Enlarge file  Remap file Enlarge file Remap file file Enlarge File mappin g Only part of file can be accessed with PMDK Virtual address space file File mappin g Whole file can be accessed with PMDK Virtual address space file File mappin g Virtual address space file File mappin g Virtual address space
  • 21. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 21 NTT Confidential Implementation to enlarge file DAX FS DAX FS and PMDK Open fd = open(path, ...); addr1 = pmem_map_file (path, len1, ...); Unmap pmem_unmap(addr1, len1); Enlarge truncate(path, len2); Remap addr2 = pmem_map_file (path, len2, ...); Close close(fd); pmem_unmap(addr2, len2);  3 function calls are added to use PMDK
  • 22. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 22 NTT Confidential Implementation to shrink file DAX FS DAX FS and PMDK Open fd = open(path, ...); addr1 = pmem_map_file (path, len1, ...); Unmap pmem_unmap(addr1, len1) Shrink ftruncate(fd, len2); truncate(path, len2); Remap addr2 = pmem_map_file (path, len2, ...); Close close(fd); pmem_unmap(addr2, len2);  2 function calls are added to use PMDK
  • 23. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 23 NTT Confidential Resizing file with PMDK  Difficult to use PMDK unless file size is fixed  Repeating remapping many times degrades performance Remapping file has large overhead  By using PMDK, munmap()/close()/open()/mmap() syscall is called again  Mapping large files may make file system full  Best practice is to use only fixed-size file
  • 24. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 24 NTT Confidential 4. Challenges for implementation 1. How to resize checkpoint file 2. How to select sync function for WAL
  • 25. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 25 NTT Confidential Selecting sync function for WAL  How to write log to WAL file  Initialization – create file and fill file with zero  Necessary to flush file metadata  Synchronous logging - sequential synchronous write  Necessary to flush only written data  PMDK sync APIs  pmem_msync() – file metadata and written data are flushed  pmem_drain() – only written data is flushed
  • 26. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 26 NTT Confidential PMDK sync function for WAL  Initialization of WAL file  Necessary to flush WAL file metadata  pmem_msync()  Synchronous logging  Necessary to flush only written data  pmem_drain() or pmem_msync()  pmem_drain() is faster than pmem_msync()  We selected pmem_drain()
  • 27. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 27 NTT Confidential Comparing PMDK sync functions  I ran microbenchmark to compare pmem_msync() to pmem_drain()  Preprocessing WAL initialization Create file and fill file with zero and flush file metadata  Measurement processing Synchronous logging Overwrite data and flush written data
  • 28. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 28 NTT Confidential Microbenchmark 1. DAX FS fdatasync() 2. DAX FS and PMDK pmem_msync() 3. DAX FS and PMDK pmem_drain() Open open() pmem_map_file() pmem_map_file() while () { while () { while() { Write write() pmem_memcpy_nodrain() pmem_memcpy_nodrain() Sync fdatasync() pmem_msync() pmem_drain() Loop N } } } Close close() pmem_unmap() pmem_unmap()  Synchronous logging
  • 29. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 29 NTT Confidential Evaluation setup 32 GB DRAM 32 GB DRAM CPU CPU Node0 Node1 Running benchmarks NUMA node 48 GB PMEM Hardware CPU E5-2667 v4 x 2 (8 cores per node) DRAM [Node0/1] 32 GB each PMEM (NVDIMM-N) [Node1] 48 GB (HPE 8GB NVDIMM x 6) Software Distro Ubuntu 16.04 Linux kernel 4.17. 9* PMDK 1.4.1 Filesystem ext4 (DAX available) *: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
  • 30. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 30 NTT Confidential Performance evaluation - microbenchmark  pmem_drain() is fastest of 3 patterns as expected • Total written data is 10 GB • Block is 8 KB [GB/s] 15 GB/s 4.3 GB/s 0.0023 GB/s DAX FS fdatasync() DAX FS and PMDK pmem_msync() DAX FS and PMDK pmem_drain()
  • 31. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 31 NTT Confidential PMDK sync functions  pmem_drain() greatly improves performance of I/O-intensive workload  You should use pmem_drain() with caution  pmem_drain() can’t flush file metadata  pmem_msync() should be called by application that uses file metadata such as  Time of last modification  Time of last access  pmem_drain() doesn’t work without pmem_memcpy_nodrain()
  • 32. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 32 NTT Confidential Outline 1. Introduction 2. Background and Motivation 3. How to use PMEM 4. Challenges for implementing PMEM aware applications 5. Challenges for performance evaluation to get valid results
  • 33. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 33 NTT Confidential 5. Challenges for performance evaluation  Difficult to get valid results in PG performance evaluation  What is valid result?  Avoid Non-Uniform Memory Access (NUMA) effects  Avoid CPUs becoming hotspots tuning application for PMEM
  • 34. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 34 NTT Confidential 5. Challenges for performance evaluation 1. NUMA effect 2. Tuning application for PMEM
  • 35. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 35 NTT Confidential 5. Challenges for performance evaluation 1. NUMA effect 2. Tuning application for PMEM
  • 36. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 36 NTT Confidential NUMA effect  Synchronous write is about 1.5 times faster on local NUMA node than on remote NUMA node NUMA node 32 GiB DRAM 32 GiB DRAM CPU CPU Node0 Node1 48 GiB PMEM Synchronous write 15 GB/s 11 GB/s Local node Remote node X1.5
  • 37. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 37 NTT Confidential 5. Challenges for performance evaluation 1. NUMA effect 2. Tuning application for PMEM
  • 38. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 38 NTT Confidential Tuning application for PMEM  Important to avoid calculation processing becoming hotspot  Better to use Stored Procedure in PG Stored Procedure improves PG performance since user-defined functions are pre-compiled and stored in PG server pgbench -c 16 -j 16 -T 1800 -r [db_name] -M prepared
  • 39. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 39 NTT Confidential Evaluation setup 32 GB DRAM 32 GB DRAM CPU CPU Node0 Node1 PG server NUMA node 48 GB PMEM Hardware CPU E5-2667 v4 x 2 (8 cores per node) DRAM [Node0/1] 32 GB each PMEM (NVDIMM- N) [Node1] 48 GB (HPE 8GB NVDIMM x 6) SSD Intel® Optane™ SSD DC P4800X Series 750GB Software Distro Ubuntu 16.04 Linux kernel 4.17. 9 PMDK 1.4.1 Filesystem ext4 (DAX available) PostgreSQL 10.4 Beta**: https://www.postgresql.org/message-id/C20D38E97BCB33DAD59E3A1%40lab.ntt.co.jp 750 GB SSD PG clients
  • 40. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 40 NTT Confidential Stored Procedure  Improvement ratio using PMEM is improved by 12% compared with using SSD Intel® Optane™ SSD DC P4800X Series 750GB PMEM with DAX FS + PMDK Without Stored Procedure With Stored Procedure 18,396 tps 29,125 tps x1.58 36,449 tps 21,406 tps x1.70 [tps]
  • 41. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 41 NTT Confidential Conclusion  Difficult to implement PMEM aware applications  Resizing file with overhead  Best practice is to access only fixed-size file  Selecting inappropriate sync function seriously degrades performance or durability  Difficult to get valid results in performance evaluation  Avoiding NUMA effect  Avoiding CPUs becoming hotspots
  • 42. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 42 NTT Confidential
  • 43. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 43 NTT Confidential pmem_msync() and pmem_drain()  pmem_msync() - file metadata and dirty page are flushed  Valid sizes of page are 4KB, 2MB, and 1GB  pmem_drain() - only written data is flushed File metadata Written data File pmem_drain() Written data PMEM File metadata File pmem_msync() File metadata Written data PMEM Dirty page Written data
  • 44. 2018 Storage Developer Conference. © 2018 NTT Corp. All Rights Reserved. 44 NTT Confidential Reducing number of times to copy data  Performance may be improved by reducing number of copies Temporary buffer PMEM HDD/SS D Application’s working memory For Sequential access and buffering(writing bulk data) Random I/O is significantly slower than sequential I/O. Random I/O is as fast as sequential I/O. Syscalls PDMK APIs

Editor's Notes

  1. Good afternoon, everyone. Thank you for your participating to this presentation. My name is Yoshimi Ichiyanagi. I’m a open-source software engineer at NTT labs in Japan. I came from Japan a few days ago. So I have jet lag.   I'm very happy to speak to you today. I’d like to talk about the challenges for implementing PMEM aware application with PMDK. The purpose of my presentation is to share know-how gained by rewriting PMEM-aware application.
  2. This is an overview of my talk. Firstly, my introduction; secondly, our background and our motivation; thirdly, Challenges for implementing PMEM-aware applications; and finally, Challenges for performance evaluation to get valid results.
  3. First of all, let me introduce myself.
  4. I have worked at Software Innovation Center of NTT labs for twelve years. NTT is a Japanese telecommunication company. NTT Software Innovation Center is undertaking promotion of open innovation centering on the development of open source platforms. There are many committers and developers of open source software, such as Openstack, Swift, and apache projects in NTT labs. I worked on system software, which are distributed filesystem such as HDFS, and operating system such as Linux kernel. Now, I work on storage applications. Today I’d like to talk about rewriting open source software using a new storage and a new library.
  5. Next, I’d like to talk about our research background and motivation.
  6. Persistent Memory such as NVDIMM-N and Intel Optane DC Persistent Memory, begins to be supplied. PMEM has several features. Memory-like features of PMEM are Low-latency and Byte-addressable, and Storage-like features are large-capacity and non-volatile.
  7. We try to rewrite storage applications since PMEM features are utilized. I think storage applications are RDBMS such as PostgreSQL, Message queue system such as Apache Kafka and so on. Let me share my challenges to rewrite the PMEM-aware applications and evaluate their performance.
  8. Next, I’d like to talk about how to use PMEM.
  9. We used written hardware and software. There are 3 components, which are NVDIMM-N, Direct-Access for files, and Persistent Memory Developer Kid, PMDK. DAX enabled FS and PMDK are already supported on Linux and Windows. My previous speaker have talked about them. I think you know them. But I’m going to explain DAX FS and PMDK just in case someone doesn’t know them.
  10. I think 3 pattern of I/O stacks.   First, the left figure shows the current I/O stack. Many applications uses File I/O APIs such as read and write system calls.   Second, the center figure shows DAX FS. DAX filesystem is the filesystem which allows applications direct access to persistent memory without page cache. So, it can run faster than before, since unnecessary copies are removed. But context switch is needed between user and kernel spaces. So context switch becomes overhead. [ポインタを動かして、スライドさせる]   Finally, the right figure shows how to access PMEM with DAX FS and PMDK. DAX filesystem also provides memory-mapped file feature. DAX FS maps the file on PMEM directly into the virtual address space of the application. The application can use CPU instructions in order to access the file data without context switches so it can run very very faster than before. In this way, PMDK provides primitive memory functions such as [CPU cache flush or memory barrier] to make sure that the data reaches PMEM. So, by using PMDK, we don’t need to write such functions by ourselves. [end]
  11. The biggest benefit of DAX-enabled FS is that it is not necessary to rewrite applications. The memory-mapped file and PMDK can improve the performance of I/O-intensive workload, because it reduces context switches and the overhead of API calls. But PMDK cannot use without change of the application. So I rewrote PostgreSQL and apache Kafka in order to improve the application performance with PMDK.
  12. I think there are 3 features of PMDK. First, as I said, an application accesses PMEM as memory-mapped I/O. Second, an application developers can select fine-grained sync size. For details, I’ll show you it in next slides. Finally, CPU instructions suitable for copy data size are selected. 1. For example, when CPUs of the server supports SSE2, data is copied with 16(sixteen) bytes registers as possible. 2. When CPUs of the server supports AVX, data is copied with 32(thirty-two) bytes registers as possible. 3. And if recent CPUs is working on the server, data is copied with 64 bytes register as possible. Furthermore, CPU instructions, “MOVNT(move non-temporal instructions) and SFENCE” store data to memory, without CPU caches.   Next, I explain on the second feature of PMDK.
  13. The second feature is that PMDK provides some sync APIs. In particular, I’ll talk about pmem_msync() and pmem_drain(). Both are PMDK APIs.   The key difference between pmem_msync() and pmem_drain() is what kind of data is flushed. pmem_msync() calls msync syscall. The most general sync function for memory-mapped file is msync syscall. By calling msync syscall, the file metadata and written file data is flushed. On the other hand, By calling pmem_drain(), Only written file data is flushed. So, pmem_drain() is faster than pmem_msync().
  14. We try to rewrite PostgreSQL and Apache Kafka as storage applications in order to be improved I/O performance with PMDK.   Let me share know-how gained by rewriting PostgreSQL. We looked into PostgreSQL to find out what kind of files was used in it. We chose 2 points, which are checkpoint files and WAL files. Checkpoint file was chosen because many writes occurred during checkpoint. WAL files was chosen because it was critical for transaction performance.
  15. I’ll talk about challenges for implementing PostgreSQL. But before that, I’d like to talk about how to rewrite PostgreSQL.
  16. We replaced standard file I/O as some syscalls with mmap I/O as PMDK APIs simply.
  17. As I said, we chose two points. That’s checkpoint files and WAL files.
  18. First, I’ll show you how to hack checkpoint files.
  19. On PostgreSQL, the huge table and so forth consist of multiple checkpoint files. The size of checkpoint files is up to 1GB. So it’s necessary to resizing the checkpoint file.   PMDK provides APIs for memory-mapped file. And it’s difficult to resize memory-mapped file without overhead. I think that best practice is to access only fixed-size file. So, we changed how to access only 1GB checkpoint file   Then, what’s this overhead? It’s to remap a file. I’d like to talk about how to resize the memory-mapped file.
  20. When the memory-mapped file is enlarged, only part of the file can be accessed with PMDK. This(指す, other) range can’t be accessed with PMDK. So remapping the file is needed. The file is remapped, so that whole file can be accessed with PMDK.   Next, I’ll show you how to implement resizing the memory-mapped file.
  21. Here, it’s how to enlarge a file. On DAX-enabled FS, open syscall and close syscall are called. Then, the largest data offset become file size. So, application developers don’t need to write ftruncate syscall in source code. On the other hand, on DAX-enabled FS with PMDK, pmem_map_file(), pmem_unmap(), truncate syscall, pmam_map_file(), and pmem_unmap() are called. 3 function calls are added to use PMDK. 3 function become overhead.
  22. Here, it’s how to shrink a file. On DAX-enabled FS, open syscall, ftruncate syscall, and close syscall are called. On the other hand, on DAX-enabled FS with PMDK, pmem_map_file(), pmem_unmap(), truncate syscall, pmam_map_file(), and pmem_unmap() are called. 2 function calls are added to use PMDK. 2 functions become overhead. Then, if data is written to out of range, Segment fault happen. I think when the file is remapped, it’s necessary to manage this mapped address. Of course, while the file is unmapped, the file can’t accessed with PMDK. I think the lock functions are needed, and the application performance reduces. In these cases, it would be better to use only DAX FS. [end]
  23. So, it’s difficult to use PMDK unless file size is fixed. By using PMDK, munmap(), close(), open(), and mmap() syscall is called again. Repeating remapping many times degrades performance, because remapping file has large overhead. Or mapping large files may make file system full.   I think that best practice is to use only fixed-size file.
  24. Next, I’ll show you how to hack WAL files.
  25. The size of WAL files is fixed. This fixed-size file is highly suitable for memory-mapped I/O, because it is not necessary to either enlarge or shrink them.   For initialization of the WAL file. PostgreSQL create the WAL file and fill the file with zero. PostgreSQL flush file metadata at the end of initialization. This file metadata includes the size of file, the indirect node of inode, and so on. On the other hand, synchronous logging is sequential synchronous writes. It’s necessary to flush only written data.   PMDK provides some sync APIs, for example pmem_msync() and pmem_drain(). Which sync function is better for initialization of the WAL file? And which sync function is better for "synchronous logging"?
  26. It would be better to use pmem_msync() for initialization of the WAL file, because the file metadata should be flushed. On the other hand, either’s fine for synchronous logging. But pmem_drain() is faster than pmem_msync(). So We selected pmem_drain() for synchronous logging.   Next, I’d like to talk about comparing the performance between pmem_msync() and pmem_drain().
  27. I ran micro-benchmark to compare the performance between pmem_msync() and pmem_drain(). This emulates the WAL file I/O inside PostgreSQL. I measured only synchronous logging.
  28. This is the detail of micro-benchmark. I replaced write syscall with the PMDK API as pmem_memcpy_nodrain(). And I replaced fdatasync syscall with the PMDK API, which is pmem_msync or pmem_nodrain.
  29. This is the evaluation setup. I used one HPE computer server with two NUMA nodes. I ran the micro-benchmark on Node 1, because there was PMEM on Node 1.
  30. The total size of the written data is 10GB and the block size is 8KB. The micro-benchmark I/O throughput on DAX-enabled FS is 4.3 GB/sec, the throughput with pmem_msync() is 0.0023 GB/sec, and the throughput with pmem_drain() is 15GB/sec. pmem_drain() is fastest of 3 patterns as expected. [end]
  31. pmem_drain() greatly improves performance of I/O-intensive workload.   But you should use pmem_drain() with caution. pmem_drain() can’t flush file metadata. So pmem_msync() should be called by application that uses file metadata such as time of last modification, time of last access, and so on. In addition, pmem_drain() doesn’t work without pmem_memcpy_nodrain().
  32. Now, I'd like to move on to the next topic, which is challenges for performance evaluation to get valid results.
  33. It’s difficult to get valid results in PG performance evaluation.   The results such as application performance depends on NUMA and CPUs. In order to get valid results, it is better to avoid NUMA effects and CPUs becoming hotspots. It’s necessary to tuning an application for PMEM.
  34. I’d like to talk about how to evaluate the performance to get the valid result.
  35. Now, I’d like to talk about NUMA effect.
  36. Would you please look at this(指す) figure, which is our evaluation setup. For the CPU on node 1, accessing to local memory on node 1 is fast, while remote node 0 is slow. This also applies to PCIe SSD, but persistent memory is more sensitive.   I ran the micro-benchmark on local NUMA Node 1 and I also did on the remote Node 0. This benchmark is the same as before. The I/O throughput on the local node is 15GB/sec and one on the remote node is 11GB/sec. Synchronous write is about 1.5 times faster on local NUMA node than on remote NUMA node.
  37. Finally, I’d like to talk about tuning an application for persistent memory.
  38. It’s Important to avoid calculation processing becoming hotspot. For example, it’s SQL processing in PostgreSQL. Stored Procedure improves the PostgreSQL performance, since user-defined functions are pre-compiled and stored in the PostgreSQL server. For that purpose, we used pgbench command with prepared option, which is this(指す) command line. I ran pgbench client with “the prepared option”, and I also did without “the prepared option”. I compared them. This pgbench is a popular PostgreSQL benchmarking tools used by many developers to do quick performance test.
  39. I ran this PostgreSQL server on Node 1 and I ran the client on Node 0. I wrote patches for PMEM-aware PostgreSQL, it’s available on pgsql-hackers mailing list.
  40. The improvement ratio using PMEM is improved by 12% compared with using SSD. On intel optane SSD, in original PostgreSQL, pgbench result with Stored Procedure is 18,396 tps, and one without Stored Procedure is 29,125 tps. The result with Stored Procedure is 1.58 times faster than without Stored Procedure. On Persistent memory, in PMEM-aware PostgreSQL, the pgbench result with Stored Procedure is 21,406 tps and one without Stored Procedure is 36,449 tps. The result with Stored Procedure is 1.70 times faster than without Stored Procedure. Improvement ratio using PMEM is higher than using intel Optane SSD as expected.
  41. I’d like to give you a quick summary of what we’ve seen today. The topics that were covered today were how to implement a PMEM-aware application and how to evaluate it to get valid results. By applying PMDK into PostgreSQL, I understood it was difficult to implement PMEM-aware applications and to get valid results in performance evaluation.   I hope that my presentation answered your questions. Thank you so much for your kind attention. Now, are there any questions?[end] Q&A That’s a good question. Or Thank you for your question. Could you speak a little slower, please? Or Could you repeat the question? わかりましたか? Does it make a sense? Or Do you have any questions so far? ほとんどの人はOptane SSDで満足できる。低遅延が必要なシステムは、PMEMが必要である。 Most people are satisfied with the use of Optane SSD. The systems that require low latency require Optane memory. AとBの違いはなんですか? What is the difference between A and B? If not, thank you very much.
  42. DAX FS tries to map 2M linear space of PMEM to each file as possible. Linux does not know the virtual address where the application has written data. And Linux manages memory on a page basis. Linux makes a page dirty when a write page fault occurs. Then DAX FS use huge page as possible. So, when msync syscalls is called, 2MB dirty page and file metadata is flushed. It has huge overhead.
  43. I rewrote PG using this way. It has huge overhead.   So, my coworker, Menjo is developing this. I guess this will be published next year, maybe.