Persistent memory

Persistent memory holds a lot of promise: what's not to like about vast amounts of directly-attached memory that remembers its contents over a power cycle? For some years we have been told that large persistent-memory arrays are coming; now it seems that they are about to arrive. In this lecture we will be covering the following:
What persistent memory is, and the upcoming storage class memory (SCM) devices
The difference between NVMe and SCM
How to use it and emulate it
Challenge: durability and consistency
Remote access
Implications for next-generation architecture

Published in: Technology
  • Nice summary. The most advanced PM-based file system is missing though - Plexistor's SDM. It also supports DAX and can be downloaded for free from www.plexistor.com/download/

Persistent memory

  1. Persistent Memory. Dr. Benoit Hudzia, @blopeur, benoit@stratoscale.com
  2. Agenda: NVM Evolution; Persistent Memory Linux Software Stack; Using, Emulating PMEM on Linux; Remote PMEM; Micro Storage Architecture
  3. NVM Evolution
  4. Persistent Memory. Yesterday: battery-backed RAM. Today: NVDIMM with RAM + flash (on power down, copy to flash; on power up, copy back to RAM). Emerging NVDIMM: PCM, 3D XPoint, memristor, etc., offering 1000x speed vs NAND, closer to RAM. Characteristics as seen by software: synchronous model, load/store memory instructions.
  5. New Generation HW: NVM is no longer the bottleneck, but performance is still limited by block stack latency and the asynchronous model.
  6. Asynchronous Model: NVMe. “When Poll is Better than Interrupt”, Yang et al., USENIX FAST 2012, https://www.usenix.org/legacy/events/fast12/tech/full_papers/Yang.pdf ● Active polling (sync) gives lower latency, at the expense of CPU, than MSI-X interrupts (async) ● Used in Intel SPDK
  7. Enter Persistent Memory (figure labels: 4KB Read, 64B Read; source: Intel)
  8. Moving away from Block I/O (figure labels: LATENCY, ACCESS)
  9. Leads to a new tiered software stack
  10. Challenge: Durability
  11. PMEM Linux Software Stack
  12. Linux kernel (>4.2) subsystem
  13. NVDIMM Software Architecture: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
  14. BTT vs DAX. BTT (Block Translation Table) provides atomic sector update semantics for persistent memory devices, so applications that rely on sector writes not being torn can continue to do so; it is the path for legacy applications. DAX (Direct Access) allows mapping a pmem range directly into userspace via mmap. If the application is aware of persistent, byte-addressable memory and can use it to its advantage, DAX is the best path for it.
  15. Using, Emulating PMEM on Linux
  16. Kernel Config (> 4.2). Enable NVDIMM dynamic debug before you start playing with NVDIMMs. Add to the kernel command line: libnvdimm.dyndbg nfit.dyndbg nd_pmem.dyndbg nd_blk.dyndbg ignore_loglevel
  17. Pick your PMEM: ACPI 6.0 compatible NVDIMM hardware or legacy NVDIMMs; virtual NVDIMMs provided by a hypervisor; RAM as persistent memory; PCMSIM (NVM-disk emulation)
  18. Emulation: RAM as PMEM. Bare metal: adding 'memmap=16G!16G' to the kernel boot parameters reserves 16G of memory, starting at 16G. Example output of cat /proc/cmdline: BOOT_IMAGE=/boot/vmlinuz-4.3.0-1-default root=UUID=39635fd6-64ee-4538-9964-7de6bb181181 resume=/dev/sda1 splash=silent quiet showopts memmap=1G!5G memmap=1G!7G. BTT works.
  19. QEMU NVDIMM. Qemu: qemu-system-x86_64 -object memory-backend-file,share,id=mem1,mem-path=/dax/D1 -device nvdimm,memdev=mem1,reserve-label-data,id=nv1 -m 2048,maxmem=100G,slots=10 …. Not yet in upstream Qemu: https://github.com/xiaogr/qemu/tree/nvdimm-v9 Seabios integration: http://www.seabios.org/pipermail/seabios/2015-September/009770.html
  20. Playing with DAX. Only ext2, ext4 and xfs currently support DAX. Note that the block size should match the page size. Commands: mkfs.ext4 -b 4096 /dev/pmem1, then mount -t ext4 -o dax /dev/pmem1 /tmp/dax/
  21. Playing with DAX (cont.). Then you just have to mmap it! But remember: CLFLUSH, etc. for durability.
  22. NVML: let somebody else do the heavy lifting. http://pmem.io/ ● libpmem: basic persistency handling ● libvmmalloc: transparently converts all dynamic memory allocations into persistent memory allocations ● libpmemblk: block access to pmem ● libpmemlog: log file on pmem (append-mostly) ● libpmemobj: transactional object store on pmem ● Many more: pynvm, C++ bindings, etc.
  23. Remote PMEM
  24. Remote NVMe: using RDMA to transfer NVMe commands and data. http://blog.pmcs.com/flash-memory-summit-2015-special-nvm-express-rdma-awesome/
  25. Transitioning from Indirect to Direct Flow ● Project Donard (PMC - Microsemi) ● struct page backed pmem patch (I/O memory is normally accessed via PFN only)
  26. Comes with a challenge: Durability vs Visibility. http://www.snia.org/sites/default/files/SDC15_presentations/persistant_mem/ChetDouglas_RDMA_with_PM.pdf
  27. RDMA + DDIO
  28. RDMA + Non-Allocating Write
  29. Peer 2 Peer: bypassing the CPU + SW bottleneck ● NVM HW: expose BAR address ● March 16: RFC patchset for DAX allowing DMA to I/O memory ● CCIX fabric ● Use case: ○ Pre-process in the data path ○ Avoid RAM buffer (HMM style) ○ SW only fetches what is necessary
  30. Future Hyperscale Architecture ● NVMe gravy train for 3-5 years ● Transition to pmem-optimised apps ● Natural evolution of Ethernet-connected drives => fabric-connected pmem ● Durable Array of Wimpy Nodes: direct PMEM, low power, high-performance K/V storage, pluggable front end
  31. Links
      ● Driver specs: http://pmem.io/documents/
        ○ NVDIMM Namespace Specification: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
        ○ NVDIMM Driver Writers Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
        ○ NVDIMM DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
        ○ ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
        ○ Linux docs: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/nvdimm/nvdimm.txt
      ● Qemu: https://github.com/xiaogr/qemu/tree/nvdimm-v9
      ● Seabios: http://www.seabios.org/pipermail/seabios/2015-September/009770.html
      ● Libraries:
        ○ https://github.com/pmem/nvml/
        ○ https://github.com/perone/pynvm
        ○ http://opennvm.github.io/index.html
        ○ https://github.com/spdk/spdk
      ● Projects:
        ○ PMFS: https://github.com/linux-pmfs/pmfs
        ○ NOVA (NOn-Volatile memory Accelerated log-structured file system): https://github.com/NVSL/NOVA
        ○ PCMSIM: https://code.google.com/p/pcmsim/
      ● Patches:
        ○ Donard, a PCIe peer-2-peer kernel patch: https://github.com/sbates130272/donard
        ○ Patch that adds struct page backing for IO memory and as such allows IO memory to be used as a DMA target: http://www.spinics.net/lists/linux-mm/msg103990.html
  32. Thank You! Questions?
  33. NVDIMM block I/O path
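
Slide 14 describes BTT as giving "atomic sector update semantics". A minimal sketch of that idea, in C and entirely in ordinary memory, is below. It is a toy model and not the kernel's on-media layout: the names (toy_btt, toy_btt_write, and so on) and the sizes are assumptions made for illustration, and the log that the real BTT uses to recover its free block after a crash is left out. What it shows is the core trick: new data is written to a spare block first, and only then is a single map entry switched over, so a reader sees either the old sector or the new one, never a mix.

```c
#include <assert.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NUM_LBAS    4                 /* logical sectors seen by the block layer */
#define NUM_BLOCKS  (NUM_LBAS + 1)    /* physical blocks: one spare for out-of-place writes */

/* Toy model only; all names and the layout are hypothetical. */
struct toy_btt {
    unsigned char blocks[NUM_BLOCKS][SECTOR_SIZE];
    unsigned map[NUM_LBAS];           /* logical sector -> physical block */
    unsigned free_block;              /* the physical block not referenced by the map */
};

void toy_btt_init(struct toy_btt *b)
{
    memset(b, 0, sizeof *b);
    for (unsigned i = 0; i < NUM_LBAS; i++)
        b->map[i] = i;                /* identity mapping to start with */
    b->free_block = NUM_LBAS;         /* the last block is the spare */
}

void toy_btt_write(struct toy_btt *b, unsigned lba, const unsigned char *data)
{
    assert(lba < NUM_LBAS);
    unsigned target = b->free_block;
    unsigned old = b->map[lba];

    /* Step 1: write the whole sector out of place; the live copy is untouched. */
    memcpy(b->blocks[target], data, SECTOR_SIZE);

    /* Step 2: switch one map entry. This single update is the commit point. */
    b->map[lba] = target;

    /* Step 3: the previously live block becomes the new spare. */
    b->free_block = old;
}

const unsigned char *toy_btt_read(const struct toy_btt *b, unsigned lba)
{
    assert(lba < NUM_LBAS);
    return b->blocks[b->map[lba]];
}
```

To use it, call toy_btt_init once, then toy_btt_write(&b, lba, sector) and toy_btt_read(&b, lba). On real persistent memory, steps 1 and 2 would each need to be flushed to the media in order, which is the durability concern raised on slides 10 and 21.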

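Slides 20 and 21 say that once a DAX file system is mounted "you just have to mmap it", with a reminder about flushing for durability. The following is a minimal sketch of that pattern in C, using only POSIX calls. The function names (demo_store, demo_load) and the one-page region size are assumptions made for illustration. It uses msync as the portable flush; on real persistent memory the slides point to cache-flush instructions, or to libpmem from NVML, as the user-space alternative. The same code runs against any regular file, so it can be tried without pmem hardware.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096   /* one page; an assumed size for this sketch */

/* Store a string through a shared mapping and flush it.
 * Returns 0 on success, -1 on error. Hypothetical helper. */
int demo_store(const char *path, const char *msg)
{
    size_t len = strlen(msg) + 1;
    assert(len <= REGION_SIZE);

    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, REGION_SIZE) != 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                         /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, msg, len);               /* plain stores into the mapping */

    /* The stores above are not guaranteed durable until they are flushed. */
    int rc = msync(p, REGION_SIZE, MS_SYNC);
    munmap(p, REGION_SIZE);
    return rc == 0 ? 0 : -1;
}

/* Read the string back through a fresh mapping. Assumes the file is at
 * least REGION_SIZE bytes, as created by demo_store. */
int demo_load(const char *path, char *out, size_t outlen)
{
    assert(outlen > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    const char *p = mmap(NULL, REGION_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    size_t n = outlen - 1 < REGION_SIZE ? outlen - 1 : REGION_SIZE;
    memcpy(out, p, n);
    out[n] = '\0';
    munmap((void *)p, REGION_SIZE);
    return 0;
}
```

With the mount from slide 20 in place, a call such as demo_store("/tmp/dax/demo", "hello") would go through the DAX mapping; the file name under /tmp/dax/ is an arbitrary choice.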