Persistent memory

Persistent memory holds a lot of promise: what's not to like about vast amounts of directly-attached memory that remembers its contents over a power cycle? For some years we have been told that large persistent-memory arrays are coming; now it seems that they are about to arrive. In this lecture we will be covering the following:
What is persistent memory: the upcoming storage class memory (SCM) devices
Difference between NVMe and SCM
How to use it and emulate it
Challenge: durability / consistency
Remote access
Implications for next-generation architecture

Persistent memory

  1. Persistent Memory. Dr. Benoit Hudzia (@blopeur, benoit@stratoscale.com)
  2. Agenda: NVM Evolution; Persistent Memory Linux Software Stack; Using, Emulating PMEM on Linux; Remote PMEM; Micro Storage Architecture
  3. NVM Evolution
  4. Persistent Memory. Yesterday: battery-backed RAM. Today: NVDIMM with RAM + flash (on power down, copy to flash; on power up, copy back to RAM). Emerging NVDIMM: PCM, 3D XPoint, memristor, etc., offering 1000x speed vs NAND, closer to RAM. Characteristics as seen by software: synchronous model, load/store memory instructions.
  5. New Generation HW: NVM is no longer the bottleneck, but it is still limited by block stack latency and the asynchronous model
  6. Asynchronous Model: NVMe. "When Poll is Better than Interrupt", Yang et al., USENIX FAST 2012, https://www.usenix.org/legacy/events/fast12/tech/full_papers/Yang.pdf. Active polling (sync) gives lower latency, at the expense of CPU, than MSI-X interrupts (async). Used in Intel SPDK.
  7. Enter Persistent Memory [figure; labels: 4KB Read, 64B Read; source: Intel]
  8. Moving away from Block I/O [figure; labels: Latency, Access]
  9. Leads to a new tiered software stack
  10. Challenge: Durability
  11. PMEM Linux Software Stack
  12. Linux kernel (>4.2) subsystem
  13. NVDIMM Software Architecture: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
  14. BTT vs DAX. BTT (Block Translation Table) provides atomic sector update semantics for persistent memory devices, so applications that rely on sector writes not being torn can continue to do so; it is meant for legacy applications. DAX (Direct Access) allows mapping a pmem range directly into userspace via mmap. If the application is aware of persistent, byte-addressable memory and can use it to advantage, DAX is the best path for it.
  15. Using, Emulating PMEM on Linux
  16. Kernel Config (> 4.2): enable NVDIMM dynamic debug before you start playing with NVDIMMs. Add to the kernel cmd line: libnvdimm.dyndbg nfit.dyndbg nd_pmem.dyndbg nd_blk.dyndbg ignore_loglevel
  17. Pick your PMEM: use ACPI 6.0 compatible NVDIMM hardware or legacy NVDIMMs; use virtual NVDIMMs provided by the hypervisor; use RAM as persistent memory; PCMSIM: NVM-disk emulation
  18. Emulation: RAM as PMEM. Bare metal: adding 'memmap=16G!16G' to the kernel boot parameters reserves 16G of memory, starting at 16G. cat /proc/cmdline: BOOT_IMAGE=/boot/vmlinuz-4.3.0-1-default root=UUID=39635fd6-64ee-4538-9964-7de6bb181181 resume=/dev/sda1 splash=silent quiet showopts memmap=1G!5G memmap=1G!7G. BTT works.
  19. QEMU NVDIMM. Qemu: qemu-system-x86_64 -object memory-backend-file,share,id=mem1,mem-path=/dax/D1 -device nvdimm,memdev=mem1,reserve-label-data,id=nv1 -m 2048,maxmem=100G,slots=10 …. Not yet in upstream Qemu: https://github.com/xiaogr/qemu/tree/nvdimm-v9. Seabios integration: http://www.seabios.org/pipermail/seabios/2015-September/009770.html
  20. Playing with DAX. Only ext2, ext4 and xfs currently support DAX. Note that the block size should match the page size: mkfs.ext4 -b 4096 /dev/pmem1, then mount -t ext4 -o dax /dev/pmem1 /tmp/dax/
  21. Playing with DAX (cont.). Then you just have to mmap it! But remember: CLFLUSH, etc., for durability
  22. NVML: let somebody else do the heavy lifting, http://pmem.io/. libpmem: basic persistence handling. libvmmalloc: transparently converts all dynamic memory allocations into persistent memory allocations. libpmemblk: block access to pmem. libpmemlog: log file on pmem (append-mostly). libpmemobj: transactional object store on pmem. Many more: pynvm, C++ bindings, etc.
  23. Remote PMEM
  24. Remote NVMe: using RDMA to transfer NVMe commands and data. http://blog.pmcs.com/flash-memory-summit-2015-special-nvm-express-rdma-awesome/
  25. Transitioning from Indirect to Direct Flow: Project Donard (PMC - Microsemi); struct page backed pmem patch (I/O memory is normally accessed via PFN only)
  26. Comes with a challenge: durability vs visibility. http://www.snia.org/sites/default/files/SDC15_presentations/persistant_mem/ChetDouglas_RDMA_with_PM.pdf
  27. RDMA + DDIO
  28. RDMA + non-allocating write
  29. Peer 2 Peer: bypassing the CPU + SW bottleneck. NVM HW: expose BAR address. March 16: RFC patchset for DAX allowing DMA to I/O mem. CCIX fabric. Use cases: pre-process in the data path; avoid RAM buffer (HMM style); SW only fetches what is necessary.
  30. Future Hyperscale Architecture. NVMe gravy train for 3-5 years. Transition to pmem-optimised apps and natural evolution of Ethernet-connected drives => fabric-connected pmem. Durable Array of Wimpy Nodes: direct PMEM, low power, high-perf K/V storage, pluggable front end.
  31. Links
      Driver specs: http://pmem.io/documents/
      NVDIMM Namespace Specification: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
      NVDIMM Driver Writers Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
      NVDIMM DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
      ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
      Linux docs: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/nvdimm/nvdimm.txt
      Qemu: https://github.com/xiaogr/qemu/tree/nvdimm-v9
      Seabios: http://www.seabios.org/pipermail/seabios/2015-September/009770.html
      Libraries: https://github.com/pmem/nvml/ https://github.com/perone/pynvm http://opennvm.github.io/index.html https://github.com/spdk/spdk
      Projects: PMFS: https://github.com/linux-pmfs/pmfs
      NOVA (NOn-Volatile memory Accelerated log-structured file system): https://github.com/NVSL/NOVA
      PCMSIM: https://code.google.com/p/pcmsim/
      Patches: Donard, a PCIe Peer-2-Peer kernel patch: https://github.com/sbates130272/donard
      Patch that adds struct page backing for IO memory and as such allows IO memory to be used as a DMA target: http://www.spinics.net/lists/linux-mm/msg103990.html
  32. Thank You! Questions?
  33. NVDIMM block I/O path
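
Slides 20 and 21 mount a DAX filesystem, then say to mmap a file and to remember the flush for durability. The sequence can be sketched in Python as below. This is an illustration and not code from the talk: the function names are hypothetical, the demo writes to a temporary directory instead of a real DAX mount such as /tmp/dax/, and mmap.flush() (msync underneath on Linux) stands in for the CLFLUSH-style cache flushing that a library such as libpmem would issue on real persistent memory.

```python
import mmap
import os
import tempfile


def store_durably(path, payload, size=mmap.PAGESIZE):
    """Map `path`, copy `payload` to offset 0, then flush the mapping.

    Hypothetical helper: the flush is the step slide 21 warns about.
    Without it, stores may still sit in CPU caches or the page cache.
    """
    if len(payload) > size:
        raise ValueError("payload larger than mapping")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)              # the file must cover the mapped range
        with mmap.mmap(fd, size) as mm:     # shared, read/write mapping by default
            mm[:len(payload)] = payload     # plain stores into the mapping
            mm.flush()                      # push the whole mapping to the backing store
    finally:
        os.close(fd)


def load(path, length):
    """Read back the first `length` bytes through the normal file API."""
    with open(path, "rb") as f:
        return f.read(length)


if __name__ == "__main__":
    # Assumption: a temporary directory replaces the DAX mount for the demo.
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "record")
        store_durably(target, b"hello pmem")
        print(load(target, 10))
```

On a file under a DAX mount the same stores go to the pmem range directly, with no page cache copy in between; the flush is still required before the data can be considered durable.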

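The memmap=size!start syntax from slide 18 is easy to misread, because the size comes before the start address. The sketch below, which is not from the talk, parses a kernel command line and lists the regions marked as persistent memory. The function name and the limited suffix handling (K, M, G, T only) are assumptions made for illustration.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
_NUM = r"(0[xX][0-9a-fA-F]+|\d+)([KMGTkmgt]?)"
# Only the '!' form is matched: it is the one that marks a range as pmem.
_PMEM = re.compile(r"(?:^|\s)memmap=" + _NUM + "!" + _NUM + r"(?=\s|$)")


def _to_bytes(digits, suffix):
    """Convert a decimal or 0x-prefixed number plus optional unit to bytes."""
    base = 16 if digits.lower().startswith("0x") else 10
    return int(digits, base) * _UNITS[suffix.upper()]


def parse_pmem_regions(cmdline):
    """Return a list of (start, size) byte pairs, one per memmap=size!start."""
    regions = []
    for size_n, size_u, start_n, start_u in _PMEM.findall(cmdline):
        regions.append((_to_bytes(start_n, start_u), _to_bytes(size_n, size_u)))
    return regions


if __name__ == "__main__":
    # Tail of the command line shown on slide 18.
    example = "splash=silent quiet showopts memmap=1G!5G memmap=1G!7G"
    for start, size in parse_pmem_regions(example):
        print(f"pmem region: start={start >> 30} GiB, size={size >> 30} GiB")
```

On a running system the input would come from /proc/cmdline, as on the slide.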