NUMA and Virtualization, the case of Xen

Transcript

  • 1. NUMA and Virtualization, the case of Xen
    Dario Faggioli, dario.faggioli@citrix.com
    August 27-28, 2012, San Diego, CA, USA
  • 2. NUMA and Virt. and Xen
    Talk outline:
    ● What is NUMA
      – NUMA and Virtualization
      – NUMA and Xen
    ● What I have done in Xen about NUMA
    ● What remains to be done in Xen about NUMA
  • 3. NUMA and Virt. and Xen: talk outline (repeated; next section: What is NUMA)
  • 4. What is NUMA
    ● Non-Uniform Memory Access: it takes longer to access some regions of memory than others
    ● Designed to improve scalability on large SMPs
    ● Bottleneck: contention on the shared memory bus
  • 5. What is NUMA
    ● Groups of processors (NUMA nodes) have their own local memory
    ● Any processor can access any memory, including memory not "owned" by its group (remote memory)
    ● Non-uniform: accessing local memory is faster than accessing remote memory
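    As an aside, the host NUMA topology can be inspected directly; a minimal example, assuming the numactl package is installed and a reasonably recent xl toolstack is in use (exact output varies with the host):

        # On any Linux host (needs the numactl package):
        numactl --hardware    # lists the nodes, their CPUs, memory sizes and distances

        # On a Xen host, from Dom0:
        xl info -n            # host information plus the host NUMA topology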
  • 6. What is NUMA
    Since:
    ● (most of the) memory is allocated at task startup,
    ● tasks are (usually) free to run on any processor,
    both local and remote memory accesses can happen during a task's life.
    [Diagram: two NUMA nodes; task A runs on CPUs of both nodes while its memory lives on only one of them]
  • 7. NUMA and Virtualization
    Short-lived tasks are just fine... but what about VMs?
    [Diagram: two NUMA nodes; each VM's memory stays on the node where it was allocated, for the whole (long) life of the VM: "FOREVER!!"]
  • 8. NUMA and Xen
    Xen allocates memory from all the nodes where the VM is allowed to run (when it is created).
    [Diagram: two NUMA nodes; VM1's and VM2's memory is split between both]
  • 9. NUMA and Xen
    Xen allocates memory from all the nodes where the VM is allowed to run (when it is created).
    [Diagram: four NUMA nodes; each VM's memory is scattered across several of them]
  • 10. NUMA and Xen
    Xen allocates memory from all the nodes where the VM is allowed to run (when it is created).
    How to specify that? vCPU pinning during VM creation (e.g., in the VM config file).
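    For illustration, the kind of xl domain configuration fragment meant here; the CPU range is hypothetical and depends on which pCPUs belong to the chosen node on your host:

        # example.cfg (fragment)
        name   = "vm1"
        vcpus  = 2
        memory = 2048
        # Pin every vCPU to pCPUs 0-7 (on this hypothetical host, NUMA node 0),
        # so that the VM's memory is allocated from that node at creation time.
        cpus   = "0-7"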
  • 11. NUMA and Virt. and Xen: talk outline (repeated; next section: what I have done in Xen about NUMA)
  • 12. Automatic Placement
    1. What happens on the left is better than what happens on the right.
    [Diagram: left, VM1 confined to a single node with all its memory local; right, VM1 spread over two nodes with its memory split]
  • 13. Automatic Placement
    2. What happens on the left can be avoided: look at what happens on the right.
    [Diagram: left, VM1 and VM2 sharing nodes and mixing their memory; right, each VM placed on its own node with local memory]
  • 14. Automatic Placement
    1. At VM1 creation time: pin VM1 to the first node.
    2. At VM2 creation time: pin VM2 to the second node, as the first one already has another VM pinned to it.
    This now happens automatically within libxl.
    [Diagram: VM1 pinned to the first node and VM2 to the second, each with its memory local]
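    A minimal sketch of the kind of heuristic involved, in Python (illustrative only, not libxl's actual code): among the nodes with enough free memory for the new VM, pick the one that already hosts the fewest vCPUs.

        # Illustrative sketch of automatic NUMA placement (not libxl's real algorithm).
        def place_domain(nodes, vm_mem, vm_vcpus):
            """Return the node the new VM should be pinned to, or None if it
            does not fit in any single node."""
            candidates = [i for i, n in enumerate(nodes) if n["free_mem"] >= vm_mem]
            if not candidates:
                return None
            # Prefer the least loaded candidate (fewest vCPUs already placed there).
            best = min(candidates, key=lambda i: nodes[i]["placed_vcpus"])
            nodes[best]["free_mem"] -= vm_mem
            nodes[best]["placed_vcpus"] += vm_vcpus
            return best

        nodes = [{"free_mem": 4096, "placed_vcpus": 2},   # node 0: already hosts a VM
                 {"free_mem": 8192, "placed_vcpus": 0}]   # node 1: idle
        print(place_domain(nodes, vm_mem=2048, vm_vcpus=2))  # -> 1 (the second node)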
  • 15. NUMA Aware Scheduling
    Nice, but using pinning has a major drawback: when VM1 comes back, it has to wait, even if there are idle CPUs!
    [Diagram: VM1 is pinned to a node whose CPUs are busy with other VMs, while CPUs on the other node sit idle]
  • 16. NUMA Aware Scheduling
    Don't use pinning; tell the scheduler where a VM prefers to run (node affinity).
    Now VM1 can run immediately: remote accesses are better than not running at all!
    Patches are almost ready (targeting Xen 4.3).
    [Diagram: VM1 runs on the other node's idle CPUs while its memory stays on its preferred node]
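    A rough sketch of the scheduling idea, again in Python (illustrative pseudocode, not the actual credit scheduler patches): when picking a pCPU for a vCPU, prefer an idle pCPU within the VM's node affinity, but fall back to any idle pCPU rather than leaving the vCPU waiting.

        # Illustrative sketch of NUMA-aware pCPU selection (not Xen's real scheduler).
        def pick_pcpu(idle_pcpus, node_affinity_pcpus):
            """idle_pcpus: currently idle pCPUs; node_affinity_pcpus: pCPUs on the
            VM's preferred node(s). Both are sets of pCPU numbers."""
            preferred = idle_pcpus & node_affinity_pcpus
            if preferred:
                return min(preferred)   # run locally when possible
            if idle_pcpus:
                return min(idle_pcpus)  # remote accesses beat not running at all
            return None                 # nothing idle: the vCPU has to wait

        # VM1 prefers node 0 (pCPUs 0-7), but only pCPU 9 (on node 1) is idle:
        print(pick_pcpu(idle_pcpus={9}, node_affinity_pcpus=set(range(8))))  # -> 9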
  • 17. Experimental Results
    Trying to verify the performance of:
    ● Automatic placement (pinning)
    ● NUMA aware scheduling (automatic placement + node affinity)
  • 18. Testing Placement (thanks to Andre Przywara from AMD)
    ● Host: AMD Opteron(TM) 6276, 64 cores, 128 GB RAM, 8 NUMA nodes
    ● VMs: 1 to 49, 2 vCPUs, 2 GB RAM each
    Resulting distribution of vCPUs across the nodes:
      Node | nr. vCPUs
      0    | 14
      1    | 12
      2    | 12
      3    | 12
      4    | 12
      5    | 12
      6    | 12
      7    | 12
  • 19. Some Traces about NUMA Aware Scheduling [trace plots]
  • 20. Benchmarking Pinning and NUMA Aware Scheduling
    ● Host: Intel Xeon(R) E5620, 16 cores, 12 GB RAM, 2 NUMA nodes
    ● VMs: 2, 4, 6, 8 and 10 of them, 960 MB RAM each; different experiments with 1, 2 and 4 vCPUs per VM
    ➔ SPECjbb2005 executed concurrently in all VMs
    ➔ 3 configurations: all-cpus, auto-pinning, auto-affinity
    ➔ Experiment repeated 3 times for each configuration
  • 21. Benchmarking Pinning and NUMA Aware Scheduling
    1 vCPU per VM: 20% to 16% improvement!
  • 22. Benchmarking Pinning and NUMA Aware Scheduling
    2 vCPUs per VM: 17% to 13% improvement!
  • 23. Benchmarking Pinning and NUMA Aware Scheduling
    4 vCPUs per VM: 0.2% to 8% gap from auto-affinity to auto-pinning;
    16% to 4% improvement from all-cpus to auto-affinity
  • 24. Some Traces about NUMA Aware Scheduling [trace plots]
  • 25. Some Traces about NUMA Aware Scheduling [trace plots]
  • 26. NUMA and Virt. and Xen: talk outline (repeated; next section: what remains to be done in Xen about NUMA)
  • 27. Lots of Room for Further Improvement!
    Tracking ideas, people and progress on http://wiki.xen.org/wiki/Xen_NUMA_Roadmap
    ● Dynamic memory migration
    ● IO NUMA
    ● Guest (or virtual) NUMA
    ● Ballooning and memory sharing
    ● Inter-VM dependencies
    ● Benchmarking and performance evaluation
  • 28. Dynamic Memory Migration
    If VM2 goes away, it would be nice if we could change VM1's (or VM3's) node affinity on-line, and move its memory accordingly!
    Also targeting Xen 4.3.
    [Diagram: VM2 leaving its node, making room to migrate VM1's or VM3's memory there]
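    A purely hypothetical sketch of the idea (nothing like this existed in this form at the time of the talk): once the target node has been chosen, update the VM's node affinity first, so the scheduler starts preferring the new node, then move the memory over in batches while the VM keeps running.

        # Hypothetical sketch of on-line memory migration (illustrative only).
        def migrate_memory(vm, target_node, batch_pages=1024):
            """Drain vm's pages from its current home node to target_node in
            batches, so the VM keeps running while its memory moves."""
            src = vm["home_node"]
            if src == target_node:
                return
            vm["node_affinity"] = {target_node}   # scheduler now prefers the new node
            while vm["pages_on"][src] > 0:
                moved = min(batch_pages, vm["pages_on"][src])
                # A real implementation would copy the pages and update the P2M here.
                vm["pages_on"][src] -= moved
                vm["pages_on"][target_node] += moved
            vm["home_node"] = target_node

        vm1 = {"home_node": 0, "node_affinity": {0}, "pages_on": {0: 4096, 1: 0}}
        migrate_memory(vm1, target_node=1)
        print(vm1["pages_on"])   # -> {0: 0, 1: 4096}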
  • 29. IO NUMA
    Different devices can be attached to different nodes: this needs to be considered during placement / scheduling.
  • 30. Guest NUMA
    If a VM is bigger than 1 node, should it know?
    Pros: performance (especially HPC)
    Cons: what if things change?
    ● live migration
    [Diagram: a VM spanning more than one node, with its memory split across them]
  • 31. Ballooning and Sharing
    Ballooning should be NUMA aware.
    Sharing: should we allow it cross-node? (shared memory between VMs on different nodes means remote accesses)
    [Diagrams: ballooning within a single node, and two VMs on different nodes sharing a page, which forces remote accesses]
  • 32. Inter-VM Dependencies
    Are we sure the situation on the right is always better? It might be workload dependent (VM cooperation vs. competition).
    [Diagram: two ways of placing three VMs across two pairs of NUMA nodes]
  • 33. Benchmarking and Performance Evaluation
    How to verify we are actually improving things:
    ● What kind of workload(s)?
    ● What VM configuration(s)?
  • 34. NUMA and Virt. and Xen: talk outline (repeated)
  • 35. Thanks! Any Questions?
    Dario Faggioli, dario.faggioli@citrix.com