How can OpenNebula fit your needs - OpenNebulaConf 2013

In the scope of a European project (BonFIRE - www.bonfire-project.eu), I had to tune OpenNebula to fit our requirements, which are unusual in a private cloud environment (small hardware, a small number of base images, but a lot of VMs created).

These slides explain how, thanks to the way OpenNebula lets administrators tune it, I updated the transfer manager scripts to improve our deployment speed by a factor of almost 8.

  1. How can OpenNebula fit your needs? Or: “I want to write my own (transfer) managers.” Maxence Dunnewind, OpenNebulaConf 2013 - Berlin
  2. Who am I?
     ● French system engineer
     ● Working at Inria on the BonFIRE European project
     ● Working with OpenNebula inside BonFIRE
     ● Free software addict
       ● Puppet, Nagios, Git, Redmine, Jenkins, etc.
     ● Sysadmin of the French Ubuntu community ( http://www.ubuntu-fr.org )
     ● More about me at:
       ● http://www.dunnewind.net (fr)
       ● http://www.linkedin.com/in/maxencedunnewind
  3. What's BonFIRE? A European project which aims at delivering: « … a robust, reliable and sustainable facility for large scale experimentally-driven cloud research. »
     ● Provides an extra set of tools to help experimenters:
       ● Improved monitoring
       ● Centralized services with a common API for all testbeds
     ● The OpenNebula project is involved in BonFIRE
     ● 4 testbeds provide OpenNebula infrastructure
  4. What's BonFIRE … technically?
     ● OCCI used through the whole stack
     ● Monitoring data:
       ● Collected through Zabbix
       ● On-request export of metrics to experimenters
     ● Each testbed has a local administrative domain:
       ● Choice of technologies
     ● Open Access available!
     ● http://www.bonfire-project.eu
     ● http://doc.bonfire-project.eu
  5. OpenNebula & BonFIRE
     ● Only the OCCI API is used
     ● Patched for BonFIRE
       ● Publishes on a message queue through hooks (see the hook sketch below)
     ● Handles the “experiment” workflow:
       ● Short experiment lifetime
       ● Lots of VMs to deploy in a short time
     ● Only a few different images:
       ● ~ 50
       ● 3 base images used most of the time
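
     As a hedged illustration of the hook mechanism mentioned above (not the actual BonFIRE patch), a notification hook can be registered in oned.conf; the hook name, script and state below are assumptions:

       # /etc/one/oned.conf -- illustrative only: a hook that hands the VM id and
       # template to a (hypothetical) mq_publish.rb script whenever a VM reaches
       # the RUNNING state, which can then publish the event on the message queue.
       VM_HOOK = [
           name      = "publish_running",
           on        = "RUNNING",
           command   = "mq_publish.rb",
           arguments = "$ID $TEMPLATE" ]
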
  6. Testbed infrastructure
     ● One disk server:
       ● 4 TB RAID-5 on 8 × 600 GB SAS 15k hard drives
       ● 48 GB of RAM
       ● 1 × 6-core E5-2630
       ● 4 × 1 Gb Ethernet links aggregated using Linux bonding (802.3ad)
     ● 4 workers:
       ● Dell C6220, 1 blade server with 4 blades
       ● Each blade has:
         ● 64 GB of RAM
         ● 2 × 300 GB SAS 10k drives (grouped in one LVM VG)
         ● 2 × E5-2620
         ● 2 × 1 Gb Ethernet links aggregated
  7. Testbed infrastructure
     ● Drawbacks & constraints:
       ● Not a lot of disks
       ● Not a lot of time to deploy things like a Ceph backend
       ● Network is fine, but still Ethernet (no low-latency network)
       ● Only a few servers for VMs
       ● The disk server is shared with other things (backups, for example)
     ● Advantages:
       ● Network not heavily used
         ● Can use it for deployment
       ● The disk server is fine for virtualization
       ● Workers run Xen with an LVM backend
       ● Both the server and the workers have enough RAM to benefit from caches
  8. First iteration
     ● Before the blade, we had 8 small servers:
       ● 4 GB of RAM
       ● 500 GB of disk space
       ● 4 cores
     ● Our old setup, based on a customized SSH TM, was to:
       ● Make a local copy of each image on the host (only once per image)
       ● Snapshot the local copy to boot the VM on it (see the sketch below)
     ● Pros:
       ● Fast boot process when the image is already copied
       ● Network savings
     ● Cons:
       ● LVM snapshot performance
       ● Cache coherency
       ● Custom housekeeping scripts need to be maintained
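
     A minimal sketch of that copy-once-then-snapshot idea; the volume group name, image path and sizes below are hypothetical:

       # Copy the base image onto the worker once (paths and VG name assumed).
       lvcreate -L 10G -n base-debian vg_worker
       dd if=/var/lib/one/images/base-debian.img of=/dev/vg_worker/base-debian bs=1M

       # Each new VM then boots on a cheap copy-on-write snapshot of that base LV,
       # so only the first VM per image pays for the network copy.
       lvcreate -s -L 2G -n one-42-disk0 /dev/vg_worker/base-debian
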
  9. Second iteration
     ● Requirements:
       ● Efficient copy over the network
       ● ONE frontend hosted on the disk server as a VM
       ● Use of an LVM backend (easy for backup / snapshots, etc.)
       ● Try to benefit from the cache when copying one image many times in a row
       ● Efficient use of network bonding when deploying on the blades
       ● No copy if possible when the image is persistent
     ● But:
       ● OpenNebula doesn't support copy + LVM backend (only ssh OR clvm)
       ● The OpenNebula main daemon is written in a compiled language (C/C++)
       ● But all mads are written in shell (or Ruby)!
       ● Creating a mad is just a new directory with a few shell files
  10. What's wrong?
     ● What's wrong with the SSH TM:
       ● It uses ssh … which hurts performance
       ● Images need to be present inside the frontend VM to be copied, so a deployment has to go through: hypervisor's disk → VM memory → network
       ● One ssh connection needs to be opened for each transfer
       ● Reduces the benefit of caching
       ● No cache on the client/blade side
     ● What's wrong with the NFS TM:
       ● Almost fine if you have a very strong network / hard drives
       ● Disastrous when VMs try to do (write) anything, if you don't have a strong network / hard drives :)
  11. Let's customize!
     ● Let's create our own Transfer Manager mad:
       ● Used for image transfer
       ● Only needs a few files in /var/lib/one/remotes/tm/mynewtm (for a system-wide install):
         ● clone => main script called to copy an OS image to the node
         ● context => manages context ISO creation and copy
         ● delete => deletes the OS image
         ● ln => called when a persistent (not cloned) image is used in a VM
     ● Only clone, delete and context will be updated; ln is the same as the NFS one (a clone sketch follows below)
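
     A rough sketch of what the clone script of such a mad could look like, following the workflow described on the next slides; the argument handling, volume group name and paths are assumptions, not the real OpenNebula TM interface or the BonFIRE scripts:

       #!/bin/bash
       # /var/lib/one/remotes/tm/mynewtm/clone (sketch only)
       SRC="$1"        # base image, reachable locally through the NFS mount
       DST_HOST="$2"   # worker that will run the VM
       DST_DIR="$3"    # VM directory on the shared datastore
       LV_NAME="$4"    # e.g. one-42-disk0 (assumed naming)

       # Done locally on the NFS mount: no ssh session needed for housekeeping.
       mkdir -p "$DST_DIR"

       # One ssh session: create the LV on the worker and fill it from the
       # worker's own NFS mount, so the data never goes through the frontend VM.
       ssh "$DST_HOST" "lvcreate -L 10G -n $LV_NAME vg_one && \
                        dd if=$SRC of=/dev/vg_one/$LV_NAME bs=1M"

       # Symlink on the NFS mount pointing at the worker-local LV.
       ln -sf "/dev/vg_one/$LV_NAME" "$DST_DIR/disk.0"
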
  12. Let's customize! How can we improve?
     ● Avoid SSH to improve the copy
       ● Netcat? Requires complex scripting to create netcat servers dynamically
       ● NFS?
     ● Avoid running ssh commands if possible
     ● Try to improve cache use
       ● On the server
       ● On the clients / blades
     ● Optimize the network for parallel copies
       ● Blade IPs need to be carefully chosen so that each blade uses a different 1 Gb link of the disk server (4 links, 4 blades); see the bonding sketch below
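
     A sketch of the 802.3ad bond on the disk server side; interface names, addresses and the hash policy are assumptions used to illustrate why the blade IPs matter: with a MAC/IP-based xmit hash, each blade's address determines which 1 Gb slave link its traffic uses, so four well-chosen addresses spread the four blades over the four links.

       # /etc/network/interfaces on the disk server (illustrative only)
       auto bond0
       iface bond0 inet static
           address 10.0.0.1
           netmask 255.255.255.0
           bond-slaves eth0 eth1 eth2 eth3
           bond-mode 802.3ad
           bond-miimon 100
           bond-xmit-hash-policy layer2+3   # slave picked from a MAC/IP hash
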
  13. Infrastructure setup
     ● The disk server acts as the NFS server
     ● The datastore is exported from the disk server as an NFS share (see the export/mount sketch below):
       ● To the ONE frontend (a VM on the same host)
       ● To the blades (over the network)
     ● Each blade mounts the datastore directory locally
     ● Base images are copied from the NFS mount to local LVM
       ● Or linked in the case of a persistent image => only persistent images write directly on NFS
     ● Almost all commands for VM deployment are run directly on the NFS share
       ● No extra ssh sessions
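
     A sketch of the export and mounts described above; paths, the network range and the mount options are assumptions:

       # /etc/exports on the disk server: datastore shared with the blades
       # (and with the ONE frontend VM running on the same host).
       /var/lib/one/datastores  10.0.0.0/24(rw,async,no_subtree_check,no_root_squash)

       # On each blade: mount the datastore at the same path as on the frontend.
       mount -t nfs 10.0.0.1:/var/lib/one/datastores /var/lib/one/datastores
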
  14. Deployment workflow
     ● Using the default SSH TM:
       ● ssh mkdir
       ● scp the image
       ● ssh mkdir for the context
       ● Create the context ISO locally
       ● scp the context ISO
       ● ssh to create the symlink
       ● Remove the local context ISO / directory
     ● Using the custom TM (see the context sketch below):
       ● Local mkdir on the NFS mount
       ● Create the LV on the worker
       ● ssh to copy the image from NFS to the local LV
       ● Create a symlink on the NFS mount which points to the LV
       ● Create the context ISO on the NFS mount
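
     A sketch of the custom TM's context step: the ISO is built directly on the NFS mount, so no scp and no extra ssh session are needed. The paths and the ISO tool invocation are assumptions:

       # Build the context ISO in place on the shared datastore (sketch only).
       CONTEXT_DIR="$DST_DIR/context"          # files prepared by OpenNebula
       mkisofs -o "$DST_DIR/disk.1" -V CONTEXT -J -R "$CONTEXT_DIR"
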
  15. Deployment workflow
     ● Using the default SSH TM:
       ● 3 SSH connections
       ● 2 encrypted copies
       ● ~ 15 MB/s raw bandwidth
       ● No improvement on the next copy
       ● ~ 15 MB/s for the real image copy => ssh makes encryption / CPU the bottleneck
     ● Using the custom TM:
       ● 1 SSH connection
       ● 0 encrypted copies
       ● 2 copies from NFS:
         ● ~ 110 MB/s raw bandwidth for the first copy (> /dev/null)
         ● Up to ~ 120 MB/s raw for the second
       ● ~ 80 MB/s for the real image copy
         ● The bottleneck is the hard drive
         ● Up to 115 MB/s with cache
     (The raw-bandwidth figures can be measured as sketched below.)
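
     The raw-bandwidth numbers above can be reproduced with plain dd from a blade; the image path and LV name are hypothetical:

       # Raw read from the NFS-mounted datastore; a second run benefits from the
       # disk server's page cache.
       dd if=/var/lib/one/datastores/1/base-image.img of=/dev/null bs=1M

       # "Real" image copy: same source, written to the worker-local LV.
       dd if=/var/lib/one/datastores/1/base-image.img of=/dev/vg_one/one-42-disk0 bs=1M
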
  16. Results
     Deploying a VM using our most commonly used image (700 MB):
     ● Scheduler interval is 10 s, and it can deploy 30 VMs per run, 3 per host (a sketch of the scheduler settings follows below)
     ● Takes ~ 13 s from ACTIVE to RUNNING
     ● Image copy ~ 7 s: Tue Sep 24 22:51:11 2013 [TM][I]: 734003200 bytes (734 MB) copied, 6.49748 s, 113 MB/s
     ● 4 VMs on 4 nodes (one per node) from submission to RUNNING in 17 s; 12 VMs in 2 minutes 6 s (+/- 10 s)
     ● Transfers between 106 and 113 MB/s on the 4 nodes at the same time
       ● Thanks to efficient 802.3ad bonding
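
     The scheduler figures quoted above map to a few settings in sched.conf; the option names below follow OpenNebula's scheduler configuration of that era and are given as an assumption, not a copy of the site's file:

       # /etc/one/sched.conf (illustrative)
       SCHED_INTERVAL = 10   # seconds between scheduling runs
       MAX_DISPATCH   = 30   # VMs dispatched per run
       MAX_HOST       = 3    # VMs dispatched to a single host per run
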
  17. Results
  18. Conclusion
     With no extra hardware, just by updating 3 scripts in ONE and our network configuration, we:
     ● Reduced contention on SSH and sped up commands by running them locally (on NFS, then synced with the nodes)
     ● Reduced the CPU used by deployment for SSH encryption
     ● Removed the SSH encryption bottleneck
     ● Improved our deployment time by a factor of almost 8
     ● Optimized parallel deployment, so that we reach the (network) hardware limit:
       ● Deploying images in parallel has almost no impact on each deployment's performance
     All this without the need for a huge (and expensive) NFS server (and network) which would have to host the images of running VMs!
     Details at http://blog.opennebula.org/?p=4002
  19. The END … Thanks for your attention! Maxence Dunnewind, OpenNebulaConf 2013 - Berlin
