Ninja Migration: An Interconnect-transparent Migration for Heterogeneous Data Centers
High-Performance Grid and Cloud Computing Workshop

Ninja Migration: An Interconnect-transparent Migration for Heterogeneous Data Centers Presentation Transcript

  • 1. High-Performance Grid and Cloud Computing Workshop, May 20 2013, Boston
    Ryousei Takano, Hidemoto Nakada, Takahiro Hirofuchi, Yoshio Tanaka, and Tomohiro Kudoh
    Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Japan
    Ninja Migration: An Interconnect-transparent Migration for Heterogeneous Data Centers
  • 2. Background
    • HPC cloud is a promising HPC platform.
    • VM migration is useful for improving flexibility and maintainability in cloud computing.
    [Figure: VM1, VM2 and VM3 moving between hosts. Labels: maintenance, fault tolerance, energy-efficient VM placement; disaster recovery.]
  • 3. Constraints on VM migration
    • Migration with a VMM-bypass I/O device
      – It can greatly reduce the overhead of virtualization, but it is not under the control of a VMM.
    [Figure: VM1, VM2 and VM3 attached to Infiniband.]
  • 4. Impact of VMM-bypass I/O
    [Figure: bar chart of execution time [seconds] for BT, CG, EP, FT and LU. Series: BMM (IB), BMM (10GbE), KVM (IB), KVM (virtio). BMM: Bare Metal Machine. Caption: The overhead of I/O virtualization on the NAS Parallel Benchmarks 3.3.1, class C, 64 processes.]
    [Figure: KVM (virtio): VM1, guest OS, guest driver, VMM, physical driver, 10GbE NIC. KVM (IB): VM1, guest OS, physical driver, VMM, IB QDR HCA.]
    • Performance evaluation of HPC cloud
      – (Para-)virtualized I/O incurs a large overhead.
      – PCI passthrough significantly mitigates the overhead.
  • 5. Constraints on VM migration
    • Migration with a VMM-bypass I/O device
      – It can greatly reduce the overhead of virtualization, but it is not under the control of a VMM.
    • Heterogeneity of interconnect devices
      – A VM assigned to an Infiniband device cannot migrate to an Ethernet machine.
    [Figure: VM1, VM2 and VM3 between Infiniband and Ethernet.]
  • 6. Challenge
    • Goal: Migrate a cluster of VMs between heterogeneous data centers.
      -> Interconnect-transparent migration
    • Challenge: How do we realize it with the minimal overhead of virtualization during normal operation?
      – (Para-)virtualized devices suffer from the overhead.
  • 7. Outline
    • Introduction
    • Ninja migration: interconnect-transparent migration
    • Experiment
    • Conclusion
  • 8. Interconnect-transparent migration
    [Figure: VM1, VM2 and VM3 run in normal operation on the Infiniband cluster, move to the Ethernet cluster by fallback migration, run there in fallback operation, and return by recovery migration.]
    Use cases: transparent fail-over to another cluster for maintenance, evacuation from a disaster-stricken data center, etc.
  • 9. Requirements
    1. Detach VMM-bypass I/O devices only when VM migration is required.
    2. Global coordination among distributed VMs before migration.
    3. Change an application's transport protocol for the available device after migration.
    Our approach: leverage the knowledge of an application to ensure cooperation between migration and a communication layer inside the guest OS.
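Requirement 3 can be pictured as a small lookup from the interconnect a VM sees after migration to the transport the MPI layer should then use. The sketch below is illustrative only: the function name `select_transport` and the device labels are hypothetical, and the returned strings follow Open MPI's BTL component names (`openib`, `tcp`, `sm`, `self`) as one plausible target. The slides do not state how Ninja migration performs the switch.

```python
# Hypothetical sketch: pick an MPI transport list for the device a VM sees
# after migration. Nothing here is taken from the Ninja migration code.

# Assumed mapping from interconnect type to Open MPI BTL component names.
TRANSPORTS = {
    "infiniband": ["openib", "sm", "self"],
    "ethernet": ["tcp", "sm", "self"],
}


def select_transport(device: str) -> str:
    """Return a comma-separated transport list for the given device type."""
    key = device.strip().lower()
    if key not in TRANSPORTS:
        raise ValueError(f"unknown interconnect: {device!r}")
    return ",".join(TRANSPORTS[key])


if __name__ == "__main__":
    # Fallback migration: the VM leaves Infiniband and lands on Ethernet.
    print(select_transport("Infiniband"))
    print(select_transport("Ethernet"))
```

Keeping the mapping in one table means a new interconnect type would only need one extra entry, which is the kind of device independence the requirement asks for.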
  • 10. SymVirt: Symbiotic Virtualization
    • Existing VM migration (black-box approach)
      – Pro: portability
      – [Figure: application, VM, VMM, NIC. Migration, global coordination and device setup.]
    • VM migration w/ SymVirt (gray-box approach)
      – Pro: performance
      – [Figure: application, VM, VMM, NIC with VMM-bypass I/O. Migration, global coordination and device setup, with cooperation between the layers.]
  • 11. Ninja migration
    [Figure: three VMs, each with an application and MPI system on top of a VMM and NIC. Ninja migration covers migration, global coordination and device setup.]
    In conjunction with VM migration, the MPI system is in charge of:
    • global coordination among MPI processes
    • changing a transport protocol
  • 12. Implementation
    [Figure: sequence diagram across guest OS mode and VMM mode. Participants: application, MPI runtime, SymVirt coordinator (SELF component), SymVirt controller/agent. Steps: confirm, detach, migration, re-attach, confirm, linkup. Phases: global coordination, device setup, migration.]
    • No modification to either the MPI system or applications
    • Open MPI user-level checkpoint/restart framework (SELF)
      – Global coordination protocol
      – Re-establishes connections among MPI processes after migration
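The step labels in the figure can be read as a lock-step sequence in which no VM starts a phase until every VM has finished the previous one. The following is a minimal sketch of that reading, assuming the order confirm, detach, migration, re-attach, confirm, linkup; the `VM` class, the `ninja_migrate` function and the rule for when a bypass device is re-attached are hypothetical and are not SymVirt's actual interface.

```python
# Hypothetical sketch of the phase ordering suggested by the slide's figure
# labels. Class and function names are illustrative, not SymVirt's API.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VM:
    name: str
    device: str                      # "infiniband" or "ethernet"
    bypass_attached: bool = True     # is a VMM-bypass device attached?
    log: List[str] = field(default_factory=list)


def ninja_migrate(vms: List[VM], dest_device: str) -> List[str]:
    """Run every VM through each phase in lock step; return the phase order."""
    phases: List[str] = []

    def run(phase: str, action: Callable[[VM], None]) -> None:
        # Global coordination: the loop completes for all VMs before the
        # next phase begins.
        for vm in vms:
            action(vm)
            vm.log.append(phase)
        phases.append(phase)

    def detach(vm: VM) -> None:
        vm.bypass_attached = False

    def migrate(vm: VM) -> None:
        vm.device = dest_device

    def reattach(vm: VM) -> None:
        # Assumption: only an Infiniband destination offers a bypass device.
        vm.bypass_attached = vm.device == "infiniband"

    def nothing(vm: VM) -> None:
        pass

    run("confirm", nothing)      # all MPI processes reach a safe point
    run("detach", detach)        # release the VMM-bypass device
    run("migration", migrate)    # the VMM moves the VM
    run("re-attach", reattach)   # attach whatever the destination offers
    run("confirm", nothing)      # all VMs report that the device is ready
    run("linkup", nothing)       # wait for the link, reconnect MPI peers
    return phases
```

For example, calling `ninja_migrate([VM("vm1", "infiniband"), VM("vm2", "infiniband")], "ethernet")` models a fallback migration: both VMs end on Ethernet with no bypass device attached, and each VM's `log` matches the returned phase list.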
  • 13. Outline
    • Introduction
    • Ninja migration: interconnect-transparent migration
    • Experiment
    • Conclusion
  • 14. Experiment
    • The overhead of Ninja migration
      – We used 8 VMs on a cluster.
      – We migrated VMs once during a benchmark execution.
    • Fallback and recovery migration between an Infiniband cluster and an Ethernet cluster
      – Infiniband: VMM-bypass I/O (PCI passthrough)
      – Ethernet: Para-virtualized I/O (virtio_net)
    • Two benchmark programs written in MPI
      – memtest: a simple memory intensive benchmark
      – NAS Parallel Benchmarks (NPB) version 3.3.1
  • 15. Experimental Setting
    We used a 16 node Infiniband cluster.

    Blade server: Dell PowerEdge M610
      CPU           Intel quad-core Xeon E5540 2.53GHz x2
      Chipset       Intel 5520
      Memory        48 GB DDR3
      InfiniBand    Mellanox ConnectX (MT26428)
      10 GbE        Broadcom NetXtreme II (BMC57711)

    Blade switch
      InfiniBand    Mellanox M3601Q (QDR 16 ports)
      10GbE         Dell M8024

    Host machine environment
      OS            Debian 7.0
      Linux kernel  3.2.18
      QEMU/KVM      1.1-rc3
      MPI           Open MPI 1.6
      OFED          1.5.4.1
      Compiler      gcc/gfortran 4.4.6

    VM environment
      VCPU          8
      Memory        20 GB
  • 16. Result: memtest
    • The overhead of Ninja migration
      – The migration time depends on the memory footprint.
      – Both hotplug and link-up times are almost constant.
    [Figure: stacked bar chart of execution time [seconds] against memory footprint (2GB, 4GB, 8GB, 16GB). Legend: migration, hotplug, linkup. Data labels, one series per row, in the order extracted:
      28.5  28.5  28.5  28.6
      14.6  13.5  12.5  11.3
      35.9  38.7  44.2  53.7]
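The total overhead per memory footprint is the sum of the three stacked components. The sketch below adds up the chart's data labels and measures how much each series varies. The pairing of rows with legend entries is an assumption inferred from the two bullets (the row that grows with footprint is taken to be migration); the slide text itself does not label the rows. The totals do not depend on that pairing.

```python
# Data labels from the memtest chart, in seconds, for 2, 4, 8 and 16 GB.
# The series names are an assumption: the slide text lists the values
# without saying which legend entry each row belongs to.
FOOTPRINTS_GB = [2, 4, 8, 16]
SERIES = {
    "linkup (assumed)": [28.5, 28.5, 28.5, 28.6],
    "hotplug (assumed)": [14.6, 13.5, 12.5, 11.3],
    "migration (assumed)": [35.9, 38.7, 44.2, 53.7],
}


def total_overhead():
    """Sum the three components for each memory footprint."""
    return [round(sum(column), 1) for column in zip(*SERIES.values())]


def spread(values):
    """Difference between the largest and smallest value of a series."""
    return round(max(values) - min(values), 1)


if __name__ == "__main__":
    for size, total in zip(FOOTPRINTS_GB, total_overhead()):
        print(f"{size:>2} GB: {total} s")
    for name, values in SERIES.items():
        print(f"{name}: spread {spread(values)} s")
```

The totals come to about 79.0, 80.7, 85.2 and 93.6 seconds, so going from a 2 GB to a 16 GB footprint adds roughly 15 seconds, all of it from the one series that grows.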
  • 17. Result: link-up time

      src. device -> dest. device    hotplug [s]    link-up [s]
      Infiniband  -> Infiniband      3.88           29.9
      Ethernet    -> Infiniband      1.15           29.8
      Ethernet    -> Ethernet        0.13           0.00
      Infiniband  -> Ethernet        2.80           0.00

    • Focus on the link-up time.
      – Note: the source and the destination are the same node.
    • If the destination has an Infiniband device, the link-up time is not a negligible overhead.
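Grouping the table by destination device makes the last bullet concrete: the link-up cost follows the destination, not the source. This is a small sketch over the table's values; the dictionary layout and the function name `linkup_by_destination` are illustrative choices.

```python
# Values transcribed from the slide's table, in seconds.
# Keys are (source device, destination device).
MEASUREMENTS = {
    ("infiniband", "infiniband"): {"hotplug": 3.88, "linkup": 29.9},
    ("ethernet", "infiniband"): {"hotplug": 1.15, "linkup": 29.8},
    ("ethernet", "ethernet"): {"hotplug": 0.13, "linkup": 0.00},
    ("infiniband", "ethernet"): {"hotplug": 2.80, "linkup": 0.00},
}


def linkup_by_destination():
    """Group link-up times by destination device, ignoring the source."""
    grouped = {}
    for (_src, dest), times in MEASUREMENTS.items():
        grouped.setdefault(dest, []).append(times["linkup"])
    return grouped


if __name__ == "__main__":
    for dest, times in linkup_by_destination().items():
        print(dest, times)
```

Both rows with an Infiniband destination show a link-up time of about 30 seconds, while both rows with an Ethernet destination show 0.00.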
  • 18. Result: NPB (64 proc., Class D)
    [Figure: stacked bar chart of execution time [seconds], baseline versus migration, for BT, CG, FT and LU. Series: migration, hotplug, linkup, application. Overhead labels: +8%, +14%, +37%, +11%.]

    Transferred Memory Size during VM Migration [MB]
      BT      CG      FT       LU
      4417    3394    15678    2348

    • There is no overhead during normal operations.
    • The overhead is proportional to the memory footprint.
  • 19. Fallback/Recovery migration: memtest
    [Figure: two panels, "Total 4 proc. (1 proc. / VM)" and "Total 32 proc. (8 proc. / VM)". Each plots execution time [seconds] against steps (1 to 31). Series: overhead, application. Phase labels: 4 hosts (IB), 2 hosts (TCP), 4 hosts (IB), 4 hosts (TCP).]
  • 20. Outline
    • Introduction
    • Ninja migration: interconnect-transparent migration
    • Experiment
    • Conclusion
  • 21. Related Work
    • Heterogeneous VM migration
      – Vagrant supports live migration across heterogeneous VMMs. [P. Liu '08]
      – Ninja migration provides interconnect-transparent migration.
    • VM migration with VMM-bypass I/O devices
      – Driver-level: shadow driver [A. Kadav '09], Nomad [W. Huang '07]
      – Runtime-level: Ninja migration
  • 22. Conclusion
    • We propose an interconnect-transparent migration mechanism to migrate a bunch of VMs between heterogeneous data centers.
    • We demonstrate the implementation called Ninja migration.
      – VMs can migrate between an IB cluster and an Ethernet cluster without restarting an application.
      – It has no performance overhead during normal operations.
  • 23. Future work
    • Demonstrate the scalability.
    • Investigate the very long link-up time of Infiniband.
    • Design a runtime-agnostic (MPI free) implementation.
    This work was partly supported by JSPS KAKENHI Grant Number 24700040.