
Improving hyperconverged performance


DevConf-2018
Gluster storage is integrated with oVirt as file-based storage using FUSE, which enables all oVirt features with very little special code. However, FUSE is not the most efficient or scalable way to access Gluster storage, and it results in poor virtual machine performance. With the newly added native Gluster support, a VM can access Gluster storage directly, in the most efficient way. Lower storage access latency yields better IOPS, making storage more responsive and improving VM performance. Participants will learn how file system access works for VMs, review the reasons for potential performance issues in hyperconverged setups, and see how to address them.


  1. 1. This presentation is licensed under a Creative Commons Attribution 4.0 International License Improving Hyperconverged Performance Denis Chaplygin Senior Software Engineer Jan 2018
  2. 2. oVirt overview oVirt engine 2
  3. 3. ● Separate storage ○ Stores VM images ○ Needs to be shared between compute hosts ○ Data availability is provided by the storage ○ Storage access may not be redundant ● Separate compute hosts ○ Host agent (VDSM) manages VMs, storage and networks ● Engine host is just a VM... ○ ...and Hosted Engine makes it highly available 3 oVirt overview cont’d
  4. 4. 4 Gluster overview BRICK BRICK BRICK BRICK BRICK BRICK BRICK BRICK BRICK GLUSTER SERVER GLUSTER SERVER GLUSTER SERVER VOLUME VOLUME VOLUME
  5. 5. ● GlusterFS is a general purpose, scale-out, distributed file-system supporting thousands of clients ● Aggregates storage exports over network connection to provide a single unified namespace ● File-system completely in userspace, runs on commodity hardware ● Gluster cluster is a collection of storage servers 5 Gluster overview cont’d
  6. 6. Two Plus Two Equals Five ● oVirt + Self Hosted Engine + GlusterFS ● Gluster volumes are oVirt storage domains ● Same nodes used to ○ Host the engine ○ Run payload VMs ○ Provide shared storage ● And now, storage (thanks to Gluster), is highly available and redundant 6 Hyperconverged - Integration of oVirt and Gluster.
  7. 7. The problem: VM data on the shared fs 7
  8. 8. 8 oVirt VM disk image store Host Shared storage VM VM Disk image Storage domain
  9. 9. 9 oVirt VM disk image store cont’d
  10. 10. ● VM disk images are stored on shared storage, either block-based or file-based ● VDSM mounts storage domains on each host ● A storage domain is a special on-disk data structure, containing some metadata alongside VM data ● In the case of file-based storage, VMs are configured to use files in the mounted storage domain directory as their drive images 10 oVirt VM disk image store cont’d
  11. 11. VM typical FOP Flow 11 VM QEMU VFS FUSE GLUSTER CLIENT GLUSTER VOLUME Host Gluster server User space Kernel space
  12. 12. Direct gluster access with libgfapi 12
  13. 13. ● libgfapi is a userspace library for accessing data in GlusterFS ● No FUSE mount required ● Throughput and latency improve because there is less overhead ● In the post-Meltdown world, context switches are very expensive 13 LibGfApi overview
  14. 14. VM disk access path with libgfapi 14 VM QEMU VFS FUSE GLUSTER CLIENT GLUSTER VOLUME Host Gluster Server User Space Kernel Space
  15. 15. ● QEMU has a GlusterFS block driver that uses libgfapi ● FUSE overhead no longer exists when QEMU works with VM images on Gluster volumes ● URI format: gluster[+transport]://[server[:port]]/volname/image[?socket=...] ● Unfortunately, libgfapi support is somewhat limited: ○ Multiple servers cannot be specified ○ Migrations between network and non-network drives are not yet possible 15 LibGfApi QEMU integration
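The URI template on the slide above can be assembled mechanically. The following is a minimal sketch in Python, assuming only the template as written; the function name `gluster_uri`, the host name, the volume name `vmstore`, and the port value are hypothetical and chosen for illustration.

```python
from typing import Optional


def gluster_uri(
    volname: str,
    image: str,
    server: Optional[str] = None,
    port: Optional[int] = None,
    transport: Optional[str] = None,
    socket: Optional[str] = None,
) -> str:
    """Build gluster[+transport]://[server[:port]]/volname/image[?socket=...]."""
    if port is not None and server is None:
        raise ValueError("a port only makes sense together with a server")

    # Optional "+transport" suffix on the scheme.
    scheme = "gluster" if transport is None else f"gluster+{transport}"

    # Optional "server[:port]" authority part.
    authority = ""
    if server is not None:
        authority = server if port is None else f"{server}:{port}"

    uri = f"{scheme}://{authority}/{volname}/{image}"

    # Optional "?socket=..." query part.
    if socket is not None:
        uri += f"?socket={socket}"
    return uri


# Hypothetical server and volume names, for illustration only.
print(gluster_uri("vmstore", "disk1.img", server="gluster1.example.com", port=24007, transport="tcp"))
# → gluster+tcp://gluster1.example.com:24007/vmstore/disk1.img
```

Note that the template accepts a single server only, which matches the limitation listed on the slide.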
  16. 16. ● Concept of a disk type - we can’t use a binary (file/block) logic anymore ● Special handling of ‘network’ disk types during VM creation ● Supports changing disk type on the fly during storage migrations ● Support for other operations, which earlier required actual presence of a file 16 LibGfApi VDSM support
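The move from binary file/block logic to a disk type can be sketched as follows. This is an illustration, not VDSM's actual code: the `DiskType` enum and `disk_xml` helper are hypothetical names, and the generated elements follow the general shape of libvirt's domain XML for file, block, and Gluster network disks, which the slides do not show.

```python
import enum
import xml.etree.ElementTree as ET
from typing import Optional


class DiskType(enum.Enum):
    """Hypothetical three-way disk type replacing a file/block boolean."""
    FILE = "file"
    BLOCK = "block"
    NETWORK = "network"


def disk_xml(disk_type: DiskType, path: str, host: Optional[str] = None) -> str:
    """Return a minimal libvirt-style <disk> element for the given type."""
    disk = ET.Element("disk", type=disk_type.value, device="disk")
    if disk_type is DiskType.FILE:
        # Image file on a mounted storage domain (for example via FUSE).
        ET.SubElement(disk, "source", file=path)
    elif disk_type is DiskType.BLOCK:
        # Block device path.
        ET.SubElement(disk, "source", dev=path)
    else:
        # Network disk: path is "volname/image", accessed through libgfapi.
        if host is None:
            raise ValueError("network disks need a Gluster server host")
        source = ET.SubElement(disk, "source", protocol="gluster", name=path)
        ET.SubElement(source, "host", name=host)
    return ET.tostring(disk, encoding="unicode")


# Hypothetical volume, image, and host names.
print(disk_xml(DiskType.NETWORK, "vmstore/disk1.img", host="gluster1.example.com"))
```

The network branch has no local file path at all, which is why operations that previously assumed a file on the host need separate handling.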
  17. 17. ● libgfapi support should be switchable at the engine or cluster level ● libgfapi support is only available in newer VDSM ○ With older versions of VDSM, the engine has to detect this and disable the libgfapi feature ● Gluster support during initial VM creation 17 LibGfApi engine support
  18. 18. ● Supported on oVirt 4.2, and on oVirt 4.1 starting from v4.1.6 ● VM restart is required ● Command: root# engine-config -s LibgfApiSupported=true --cver=4.2 18 Enabling libgfapi
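The version rule and the command on the slide above can be combined into a small pre-flight check. This is a sketch under stated assumptions: the helper names `libgfapi_supported` and `engine_config_args` are hypothetical, version strings are assumed to be dot-separated integers, and the command arguments are copied from the slide rather than taken from any other documentation.

```python
from typing import List


def libgfapi_supported(ovirt_version: str) -> bool:
    """True for oVirt 4.1.6 and later, per the slide (assumes 'X.Y[.Z]' strings)."""
    parts = tuple(int(p) for p in ovirt_version.split("."))
    return parts >= (4, 1, 6)


def engine_config_args(cver: str) -> List[str]:
    """Argument list for the engine-config call shown on the slide."""
    return ["engine-config", "-s", "LibgfApiSupported=true", f"--cver={cver}"]


print(libgfapi_supported("4.1.5"))  # → False
print(libgfapi_supported("4.2"))    # → True
print(" ".join(engine_config_args("4.2")))
# → engine-config -s LibgfApiSupported=true --cver=4.2
```

The check only builds the argument list; running it on the engine host as root, and restarting the affected VMs afterwards, remains a manual step as the slide describes.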
  19. 19. Performance benchmarking 19
  20. 20. Worst case scenario: ● 50/50 reads/writes ● 4k blocks ● Latency kept below 10 ms ● 5% increase in IOPS with a single-brick volume ● 10% increase in IOPS with a replica 3 volume 20 IOPS
  21. 21. 21 IOPS cont’d
  22. 22. Same scenario as for IOPS ● Only a 2% increase in bandwidth on the single-brick volume ● A much larger 22% increase in bandwidth on the replica 3 volume 22 Bandwidth
  23. 23. ● MySQL database running the DVD Store simulator test suite inside a VM ● Compared the average number of transactions per minute with and without libgfapi enabled 23 Realistic workload
  24. 24. ● Under low-to-moderate load (10-20 simultaneous clients), increase of transactions per minute with libgfapi enabled is about 11% ● Under higher load (80 simultaneous clients), increase of transactions per minute with libgfapi enabled is about 24% 24 Realistic workload - Results
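The percentages above are relative increases in transactions per minute over the FUSE baseline. A minimal sketch of that arithmetic follows; the slides do not give raw transaction counts, so the figures below are hypothetical and chosen only so that the result matches the reported 24%.

```python
def percent_increase(baseline: float, improved: float) -> float:
    """Relative increase of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) * 100 / baseline


# Hypothetical transactions-per-minute figures, NOT measured data.
fuse_tpm = 1000      # assumed baseline with the FUSE mount
libgfapi_tpm = 1240  # assumed result with libgfapi enabled
print(percent_increase(fuse_tpm, libgfapi_tpm))  # → 24.0
```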
  25. 25. 25 Realistic workload - Results
  26. 26. 26 Summary ● Combining two projects can give you more than just their sum ● Treating Gluster as a typical network filesystem, such as NFS, introduces some overhead and disadvantages ○ Fortunately, Gluster has a special, userspace-only library for direct file access ● Removing the FUSE overhead gives you up to a 24% performance boost under database load for free
  27. 27. This presentation is licensed under a Creative Commons Attribution 4.0 International License THANK YOU http://www.ovirt.org dchaplyg@redhat.com