How Can OpenNebula Fit Your Needs: A European Project Feedback

BonFIRE is a European project which aims at providing a “multi-site cloud facility for applications, services and systems research and experimentation”. By grouping different research cloud providers behind a common set of tools, APIs and services, it enables users to run their experiments against a heterogeneous set of infrastructures, hypervisors, networks, etc.

BonFIRE, and thus the (OpenNebula) testbeds, provide a relatively small set of images used to boot VMs. However, the experimental nature of BonFIRE projects results in a high “turnover” of running VMs. Most VMs live between a few hours and a few days, and an experiment startup can trigger the deployment of many VMs at the same time on a small set of OpenNebula workers, which does not correspond to the usual cloud workflow.

A default OpenNebula installation is not optimized for such a use case (a small number of worker nodes, high VM turnover). However, thanks to its ability to be easily modified at each level of the cloud deployment workflow, OpenNebula has been tuned to fit the BonFIRE deployment process better. This presentation explains how to change the OpenNebula TM and VMM to improve the parallel deployment of many VMs in a short amount of time, reducing the time needed to deploy an experiment to a minimum without a lot of expensive hardware.


  1. How can OpenNebula fit your needs? Or “I want to write my own (transfer) managers.”
     Maxence Dunnewind, OpenNebulaConf 2013 - Berlin
  2. Who am I?
     ● French system engineer
     ● Working at Inria on the BonFIRE European project
     ● Working with OpenNebula inside BonFIRE
     ● Free software addict
       ● Puppet, Nagios, Git, Redmine, Jenkins, etc.
     ● Sysadmin of the French Ubuntu community (http://www.ubuntu-fr.org)
     ● More about me at:
       ● http://www.dunnewind.net (fr)
       ● http://www.linkedin.com/in/maxencedunnewind
  3. What's BonFIRE?
     A European project which aims at delivering “… a robust, reliable and sustainable facility for large scale experimentally-driven cloud research.”
     ● Provides an extra set of tools to help experimenters:
       ● Improved monitoring
       ● Centralized services with a common API for all testbeds
     ● The OpenNebula project is involved in BonFIRE
     ● 4 testbeds provide OpenNebula infrastructure
  4. What's BonFIRE … technically?
     ● OCCI used through the whole stack
     ● Monitoring data:
       ● Collected through Zabbix
       ● On-request export of metrics to experimenters
     ● Each testbed is a local administrative domain:
       ● Choice of technologies
     ● Open Access available!
       ● http://www.bonfire-project.eu
       ● http://doc.bonfire-project.eu
  5. OpenNebula & BonFIRE
     ● Only the OCCI API is used
     ● Patched for BonFIRE:
       ● Publishes on a message queue through hooks
     ● Handles the “experiment” workflow:
       ● Short experiment lifetime
       ● Lots of VMs to deploy in a short time
     ● Only a few different images:
       ● ~50
       ● 3 base images used most of the time
  6. Testbed infrastructure
     ● One disk server:
       ● 4 TB RAID-5 over 8 × 600 GB SAS 15k hard drives
       ● 48 GB of RAM
       ● 1 × 6-core E5-2630
       ● 4 × 1 Gb Ethernet links aggregated using Linux bonding (802.3ad)
     ● 4 workers:
       ● Dell C6220, 1 blade server with 4 blades
       ● Each blade has:
         ● 64 GB of RAM
         ● 2 × 300 GB SAS 10k drives (grouped in one LVM VG)
         ● 2 × E5-2620
         ● 2 × 1 Gb Ethernet links aggregated
  7. Testbed infrastructure
     ● Drawbacks:
       ● Not a lot of disk
       ● Not a lot of time to deploy things like a Ceph backend
       ● Network is fine, but still Ethernet (no low-latency network)
       ● Only a few servers for VMs
       ● The disk server is shared with other services (backup, for example)
     ● Advantages:
       ● Network not heavily used
       ● The disk server is fine for virtualization
       ● Workers run Xen with an LVM backend
       ● Both the server and the workers have enough RAM to benefit from big caches
  8. First iteration
     ● Before the blades, we had 8 small servers:
       ● 4 GB of RAM
       ● 500 GB of disk space
       ● 4 cores
     ● Our old setup customized the SSH TM to:
       ● Make a local copy of each image on the host
       ● Snapshot the local copy and boot the VM on it (a sketch follows below)
     ● Pros:
       ● Fast boot process when the image is already copied
       ● Network savings
     ● Cons:
       ● LVM snapshot performance
       ● Cache coherency
       ● Custom housekeeping scripts need to be maintained
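     A minimal sketch of that first-iteration idea; the VG/LV names, sizes and image path are illustrative, not the project's actual scripts:

         # One-time local cache of the base image on the worker's VG
         lvcreate -L 5G -n base-img vg-one
         dd if=/var/tmp/base-img.img of=/dev/vg-one/base-img bs=1M

         # Per-VM copy-on-write snapshot of the cached image; the VM boots
         # on the snapshot, so only its writes consume the 2G CoW space
         lvcreate -s -L 2G -n one-42-disk0 /dev/vg-one/base-img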
  9. Second iteration
     ● Requirements:
       ● Efficient copy through the network
       ● ONE frontend hosted on the disk server as a VM
       ● Use of the LVM backend (easy for backup, snapshots, etc.)
       ● Try to benefit from the cache when copying one image many times in a row
       ● Efficient use of network bonding when deploying on the blades
       ● No copy, if possible, when the image is persistent
     ● But:
       ● OpenNebula doesn't support copy + LVM backend (only ssh OR clvm)
       ● The OpenNebula main daemon is written in a compiled language (C/C++)
       ● But all mads are written in shell (or Ruby)!
       ● Creating a mad is just a new directory with a few shell files
  10. What's wrong?
      ● What's wrong with the SSH TM:
        ● It uses ssh … which hurts performance
        ● Images need to be present inside the frontend VM to be copied, so a deployment goes disk → VM → memory → network
        ● One ssh connection needs to be opened for each transfer
        ● Reduces the benefits of the cache
        ● No cache on the client/blade side
      ● What's wrong with the NFS TM:
        ● Almost fine if you have a very strong network / hard drives
        ● Disastrous when you try to do anything with the VMs if you don't have a strong network / hard drives :)
  11. Let's customize!
      ● Let's create our own Transfer Manager mad:
        ● Used for image transfer
        ● Only needs a few files in /var/lib/one/remotes/tm/mynewtm (for a system-wide install):
          ● clone => main script, called to copy an OS image to the node
          ● context => manages context ISO creation and copy
          ● delete => deletes the OS image
          ● ln => called when a persistent (not cloned) image is used in a VM
      ● Only clone, delete and context will be updated; ln is the same as the NFS one (a sketch of clone follows below)
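      A minimal sketch of what such a clone script can look like, assuming the host:path argument convention of ONE 4.x TM drivers; the VG name and LV naming scheme are illustrative, not the actual BonFIRE script:

          #!/bin/bash
          # clone: fill a local LV on the worker from the NFS-mounted datastore
          SRC=$1                # e.g. frontend:/var/lib/one/datastores/1/<image>
          DST=$2                # e.g. worker1:/var/lib/one/datastores/0/42/disk.0

          SRC_PATH=${SRC#*:}    # the datastore is an NFS mount, so this path is
          DST_HOST=${DST%%:*}   # visible on both the frontend and the workers
          DST_PATH=${DST#*:}

          VG=vg-one                                                    # assumed worker VG
          LV="one-$(basename "$(dirname "$DST_PATH")")-$(basename "$DST_PATH")"
          SIZE=$(( $(stat -c %s "$SRC_PATH") / 1048576 + 1 ))          # size in MB

          # The single ssh session of the workflow: create the LV, then fill it
          # straight from the NFS mount (plain read, no scp encryption)
          ssh "$DST_HOST" "sudo lvcreate -L ${SIZE}M -n $LV $VG && \
                           sudo dd if=$SRC_PATH of=/dev/$VG/$LV bs=1M"

          # The symlink is created on the shared NFS mount, so no extra ssh
          mkdir -p "$(dirname "$DST_PATH")"
          ln -sf "/dev/$VG/$LV" "$DST_PATH"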
  12. Let's customize! How can we improve?
      ● Avoid SSH to improve the copy:
        ● Netcat? Requires a complex script to create a netcat server dynamically
        ● NFS?
      ● Avoid running ssh commands where possible
      ● Try to improve cache use:
        ● On the server
        ● On the clients / blades
      ● Optimize the network for parallel copies:
        ● Blade IPs need to be carefully chosen so that each blade uses one 1 Gb link of the disk server (4 links, 4 blades); a bonding sketch follows below
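      With Linux bonding, the outgoing link for a flow is picked by a transmit hash. A sketch of a Debian-style configuration on the disk server; interface names, addresses and the layer3+4 policy are assumptions, the slides only state 802.3ad:

          # /etc/network/interfaces on the disk server (sketch)
          auto bond0
          iface bond0 inet static
              address 192.168.0.10             # illustrative addressing
              netmask 255.255.255.0
              bond-slaves eth0 eth1 eth2 eth3  # the 4 aggregated 1 Gb links
              bond-mode 802.3ad                # LACP, as on the testbed
              bond-xmit-hash-policy layer3+4
              # The hash maps each flow to one slave from its IPs/ports, so
              # the 4 blade IPs can be chosen so that each blade's traffic
              # lands on its own 1 Gb link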
  13. Infrastructure setup
      ● The disk server acts as the NFS server
      ● The datastore is exported from the disk server as an NFS share (a sketch follows below):
        ● To the ONE frontend (a VM on the same host)
        ● To the blades (through the network)
      ● Each blade mounts the datastore directory locally
      ● Base images are copied from the NFS mount to local LVM
        ● Or linked in the case of a persistent image => only persistent images write directly on NFS
      ● Almost all commands for VM deployment are run directly on the NFS share
        ● No extra ssh sessions
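      A sketch of what this setup can look like; the paths, subnet and mount options are assumptions, not the project's actual configuration:

          # /etc/exports on the disk server: one export, seen by the
          # frontend VM and by the 4 blades
          /var/lib/one/datastores  192.168.0.0/24(rw,no_subtree_check,no_root_squash)

          # On each blade, mount the datastore at the same path as on the
          # frontend, so symlinks and context ISOs created on the share are
          # valid everywhere
          mount -t nfs diskserver:/var/lib/one/datastores /var/lib/one/datastores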
  14. Deployment Workflow
      Using the default SSH TM:
      ● ssh mkdir
      ● scp image
      ● ssh mkdir for context
      ● Create context ISO locally
      ● scp context ISO
      ● ssh to create the symlink
      ● Remove the local context ISO / directory
      Using the custom TM:
      ● Local mkdir on the NFS mount
      ● Create the LV on the worker
      ● ssh to cp the image from NFS to the local LV
      ● Create a symlink on the NFS mount which points to the LV
      ● Create the context ISO on the NFS mount
  15. Deployment Workflow
      Using the default SSH TM:
      ● 3 SSH connections
      ● 2 encrypted copies
      ● ~15 MB/s raw bandwidth
      ● No improvement on the next copy
      ● ~15 MB/s for a real image copy => ssh makes encryption / CPU the bottleneck
      Using the custom TM:
      ● 1 SSH connection
      ● 0 encrypted copies
      ● 2 copies from NFS:
        ● ~110 MB/s raw bandwidth for the first copy (> /dev/null)
        ● Up to ~120 MB/s raw for the second
      ● ~80 MB/s for a real image copy:
        ● The bottleneck is the hard drive
        ● Up to 115 MB/s with cache
  16. Results
      Deploying a VM using our most commonly used image (700 MB):
      ● The scheduler interval is 10 s, and it can deploy 30 VMs per run, 3 per host (see the sched.conf sketch below)
      ● Takes ~13 s from ACTIVE to RUNNING
      ● Image copy ~7 s:
        Tue Sep 24 22:51:11 2013 [TM][I]: 734003200 bytes (734 MB) copied, 6.49748 s, 113 MB/s
      ● 4 VMs on 4 nodes (one per node) from submission to RUNNING in 17 s; 12 VMs in 2 minutes 6 s (+/- 10 s)
      ● Transfers between 106 and 113 MB/s on the 4 nodes at the same time
        ● Thanks to efficient 802.3ad bonding
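      The quoted scheduler behaviour maps onto a few parameters in /etc/one/sched.conf; a sketch with the slide's values, using the OpenNebula 4.x scheduler parameter names:

          SCHED_INTERVAL = 10   # scheduler wakes up every 10 seconds
          MAX_DISPATCH   = 30   # up to 30 VMs dispatched per run
          MAX_HOST       = 3    # at most 3 of them per host in one run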
  17. Results
  18. Conclusion
      With no extra hardware, just by updating 3 scripts in ONE and our network configuration, we:
      ● Reduced contention on SSH, speeding up commands by running them locally (on NFS, then syncing with the nodes)
      ● Reduced the CPU used by deployment for SSH encryption
      ● Removed the SSH bottleneck on encryption
      ● Improved our deployment time by a factor of almost 8
      ● Optimized parallel deployment so that we reach the (network) hardware limits:
        ● Deploying images in parallel has almost no impact on each deployment's performance
      All this without the need for a huge (and expensive) NFS server (and network) which would have to host the images of running VMs!
      Details at http://blog.opennebula.org/?p=4002
  19. The END … Thanks for your attention! Maxence Dunnewind, OpenNebulaConf 2013 - Berlin