Performance Profiling of Virtual Machines


  1. Performance Profiling of Virtual Machines
     Jiaqing Du+, Nipun Sehrawat*, Willy Zwaenepoel+
     + EPFL, Switzerland
     * University of Illinois at Urbana-Champaign
  2. Performance Profiling
     • Use CPU performance counters
     • Monitor software runtime behavior
     • Incur very low overhead
     • Used extensively: OProfile, VTune, …

     %CYCLE    Function            Module
     98.5529   vmx_vcpu_run        kvm-intel.ko
      0.2226   (no symbols)        libc.so
      0.1034   hpet_cpuhp_notify   vmlinux
      0.1034   native_patch        vmlinux

     Jiaqing Du, VEE, March 9, 2011
  3. Terminology
     Layers are listed top to bottom:
     (1) native profiling: OS (profiler) / CPU with PMU
     (2) guest-wide profiling: Guest (profiler) / VMM / CPU with PMU
     (3) system-wide profiling: Guest (profiler) / VMM (profiler) / CPU with PMU
  4. Profiling with Virtual Machines

                             Para-            Hardware     Binary
                             virtualization   assistance   translation
     Guest-wide profiling    ?                ?            ?
     System-wide profiling   XenOprof         ?            ?

     Profilers do not work well with virtual machines.
  5. Contributions
     (1) Give solutions

                             Para-            Hardware     Binary
                             virtualization   assistance   translation
     Guest-wide profiling    ?                ?            ?
     System-wide profiling   XenOprof         ?            ?

     (2) Implement prototypes
  6. Outline
     • Native profiling
     • Guest-wide profiling
     • System-wide profiling
     • Evaluation
  7. Native Profiling
     • Performance monitoring unit (PMU)
       – consists of a set of event counters
       – generates an interrupt when a counter overflows
     • PMU-based profiler
       – user space: Control, Interpret
       – kernel: Configure, Collect
       – collected per sample: previous PC value, process identifier
       – hardware: CPU with PMU
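The overflow-driven sampling described on the Native Profiling slide can be sketched as a small C simulation. This is a minimal sketch under stated assumptions, not the authors' code: `struct pmu_counter`, `pmu_configure`, `pmu_tick`, `on_overflow`, and `samples_in_range` are hypothetical names, and the counter is an ordinary variable standing in for a hardware register. The counter is armed with a sampling period, the overflow handler records the interrupted PC and process identifier, and interpretation reduces to counting samples per address range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SAMPLES 64  /* illustrative buffer size (assumption) */

/* One profiling sample: what the overflow handler collects. */
struct sample {
    uint64_t pc;   /* program counter at the time of the overflow */
    int      pid;  /* process that was running */
};

/* Software stand-in for one PMU event counter (hypothetical). */
struct pmu_counter {
    uint64_t remaining;  /* events left until the next overflow */
    uint64_t period;     /* events between two samples */
    struct sample buf[MAX_SAMPLES];
    size_t nsamples;
};

/* "Configure": arm the counter so it overflows every `period` events. */
static void pmu_configure(struct pmu_counter *c, uint64_t period)
{
    assert(period > 0);
    c->period = period;
    c->remaining = period;
    c->nsamples = 0;
}

/* "Collect": runs where the overflow interrupt handler would run. */
static void on_overflow(struct pmu_counter *c, uint64_t pc, int pid)
{
    if (c->nsamples < MAX_SAMPLES) {
        c->buf[c->nsamples].pc = pc;
        c->buf[c->nsamples].pid = pid;
        c->nsamples++;
    }
    c->remaining = c->period;  /* re-arm the counter */
}

/* Simulate one monitored event occurring at `pc` in process `pid`. */
static void pmu_tick(struct pmu_counter *c, uint64_t pc, int pid)
{
    if (--c->remaining == 0)
        on_overflow(c, pc, pid);
}

/* "Interpret": count the samples whose PC falls in [lo, hi). */
static size_t samples_in_range(const struct pmu_counter *c,
                               uint64_t lo, uint64_t hi)
{
    size_t n = 0;
    for (size_t i = 0; i < c->nsamples; i++)
        if (c->buf[i].pc >= lo && c->buf[i].pc < hi)
            n++;
    return n;
}
```

Feeding the sketch 900 events at one address and 100 at another with a period of 100 yields nine samples for the first address and one for the second, which is the proportional attribution a profile report such as the %CYCLE table relies on.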
  8. Guest-wide Profiling
     • Profiler runs in the guest and only profiles the guest
     • Diagram: Guest (Control, Interpret, Configure, Collect) / VMM / CPU with PMU
     • Injected interrupts should be handled right after the guest resumes execution.
     • Challenge: synchronous interrupt delivery to the guest
  9. System-wide Profiling (1/3)
     • Reveal runtime behavior of both VMM and guest(s)
     • Diagram: Guest1, Guest2 / VMM (Control, Interpret, Configure, Collect) / CPU with PMU
     • The VMM does not know the internals of a guest.
     • Challenge: interpret samples belonging to the guest
  10. System-wide Profiling (2/3)
      • Interpret guest samples: full delegation
      • Diagram: Guest (Control, Interpret, Configure, Collect) / VMM (Control, Interpret, Configure, Collect) / CPU with PMU
  11. System-wide Profiling (3/3)
      • Interpret guest samples: interpretation delegation
      • Diagram: Guest (Control, Interpret, Configure, Collect) / Shared Buffer / VMM (Control, Interpret, Configure, Collect) / CPU with PMU
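The shared buffer on the interpretation-delegation slide could look like the ring buffer sketched below. This is only an illustration under assumptions the slides do not confirm: it assumes the VMM writes guest samples into the buffer and the guest reads them for interpretation, and `struct sample_ring`, `ring_push`, and `ring_pop` are hypothetical names. The sketch is single-threaded; a real buffer shared between a VMM and a guest would also need memory barriers or atomic accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8  /* illustrative size; must be a power of two */

/* A sample the VMM collects on behalf of the guest (hypothetical layout). */
struct guest_sample {
    uint64_t pc;     /* guest program counter at the overflow */
    uint64_t event;  /* which hardware event overflowed */
};

struct sample_ring {
    struct guest_sample slot[RING_SLOTS];
    uint32_t head;   /* next slot the VMM writes */
    uint32_t tail;   /* next slot the guest reads */
};

static void ring_init(struct sample_ring *r)
{
    r->head = 0;
    r->tail = 0;
}

/* VMM side: returns false (the sample is dropped) when the ring is full. */
static bool ring_push(struct sample_ring *r, struct guest_sample s)
{
    if (r->head - r->tail == RING_SLOTS)
        return false;
    r->slot[r->head % RING_SLOTS] = s;
    r->head++;
    return true;
}

/* Guest side: returns false when there is nothing to interpret. */
static bool ring_pop(struct sample_ring *r, struct guest_sample *out)
{
    if (r->head == r->tail)
        return false;
    *out = r->slot[r->tail % RING_SLOTS];
    r->tail++;
    return true;
}
```

Keeping `head` and `tail` as free-running unsigned counters makes the full and empty tests a single subtraction and lets them wrap around safely, as long as the slot count is a power of two.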
  12. PMU Multiplexing
      • When to save & restore performance counters?
      • CPU switch
        – only in-guest execution is accounted to the guest
      • Domain switch
        – in-VMM execution is also accounted to the guest
      • Timeline in both diagrams: guest1, VMM (I/O for guest1), guest2, VMM (I/O for guest2), guest2
        – CPU switch: the three in-guest segments are accounted to guest 1, guest 2, and guest 2
        – Domain switch: guest1 plus its VMM I/O is accounted to guest 1; guest2 plus its VMM I/O is accounted to guest 2
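The difference between the two accounting policies on the PMU Multiplexing slide can be made concrete with a small C sketch. The names `struct segment`, `enum policy`, and `account` are hypothetical, and any event counts fed to it are made up for illustration; the sketch only models which execution segments are charged to which guest, not the actual saving and restoring of counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum policy { CPU_SWITCH, DOMAIN_SWITCH };

/* One stretch of execution on the physical CPU. */
struct segment {
    int      guest;   /* guest on whose behalf the CPU runs (0-based) */
    bool     in_vmm;  /* true while executing VMM code for that guest */
    uint64_t events;  /* hardware events counted during the segment */
};

/*
 * Sum the events charged to each guest.
 * CPU switch: only in-guest segments are charged.
 * Domain switch: VMM segments run for a guest are charged to it as well.
 */
static void account(const struct segment *seg, size_t n, enum policy p,
                    uint64_t *per_guest, size_t nguests)
{
    for (size_t g = 0; g < nguests; g++)
        per_guest[g] = 0;

    for (size_t i = 0; i < n; i++) {
        assert(seg[i].guest >= 0 && (size_t)seg[i].guest < nguests);
        if (p == CPU_SWITCH && seg[i].in_vmm)
            continue;  /* in-VMM execution is not charged to the guest */
        per_guest[seg[i].guest] += seg[i].events;
    }
}
```

For the slide's timeline with invented counts of 100, 40, 200, 60, and 50 events, CPU switch charges 100 events to guest 1 and 250 to guest 2, while domain switch charges 140 and 310, so the I/O work done in the VMM becomes visible in each guest's profile.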
  13. Implementation

                              Para-
                              virtualization   KVM   QEMU
      Guest-wide profiling    ?                √     ?
      System-wide profiling   XenOprof         √     √
  14. Evaluation question #1
      How much does profiling slow down programs?
  15. Profiling Overhead
      • Measure execution time
        – a computation-intensive program
        – with and without profiling
        – about 400 counter overflows per second

      Profiling environment    Increased execution time
      Native Linux             0.04% ± 0.004%
      KVM guest-wide           0.39% ± 0.045%
      KVM system-wide          0.44% ± 0.043%
      QEMU system-wide         0.94% ± 0.044%
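The overhead figures on the Profiling Overhead slide can be turned into a rough cost per counter overflow. The C sketch below is a back-of-the-envelope calculation, not a measurement from the slides: it assumes that all of the added execution time is spent handling overflows and that the rate of about 400 overflows per second applies to every environment. The function names are hypothetical.

```c
#include <assert.h>

/* Relative slowdown in percent, from run times without and with profiling. */
static double overhead_percent(double t_base, double t_profiled)
{
    assert(t_base > 0.0);
    return (t_profiled - t_base) / t_base * 100.0;
}

/*
 * Estimated cost of one counter overflow in microseconds.
 * Assumption: every bit of the overhead is caused by overflow handling.
 */
static double cost_per_overflow_us(double overhead_pct,
                                   double overflows_per_sec)
{
    assert(overflows_per_sec > 0.0);
    double seconds_lost_per_second = overhead_pct / 100.0;
    return seconds_lost_per_second / overflows_per_sec * 1e6;
}
```

Under these assumptions, 0.04% at 400 overflows per second corresponds to about 1 µs per overflow on native Linux, and 0.94% corresponds to about 23.5 µs for QEMU system-wide profiling.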
  16. Evaluation question #2
      Are profiling results accurate?
  17. Profiling Accuracy (1/4)
      • A computation-intensive benchmark
      • compute_{a|b}() does floating point arithmetic
      • Monitor CPU cycles

          /* compute_a() and compute_b() are not shown on the slide */
          void compute_a(void);
          void compute_b(void);

          int main(int argc, char *argv[])
          {
              /* alternate between the two routines forever */
              while (1) {
                  compute_a();
                  compute_b();
              }
          }
  18. Profiling Accuracy (2/4)
      • Comparison with native profiling
      • Bar chart: cycle % (y-axis) per routine name (compute_a, compute_b) for Native, KVM guest-wide, KVM system-wide, and QEMU system-wide
  19. Profiling Accuracy (3/4)
      • A memory-intensive benchmark
      • Randomly access a fixed-size region of memory
      • Monitor last level cache misses

          struct item {
              struct item *next;
              long pad[NUM_PAD];   /* padding that sets the item size */
          };

          void chase_pointer()
          {
              struct item *p = NULL;
              p = &randomly_connected_items;
              /* follow the chain until it ends */
              while (p != NULL)
                  p = p->next;
          }
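The slide does not show how the items become "randomly connected". One plausible way, sketched below in C, is to shuffle the item indices and link the items in shuffled order so that the chain visits every item once and ends in NULL. Everything beyond `struct item` is an assumption: `link_randomly` is a hypothetical helper, `NUM_PAD` is set to 7 only so that one item is 64 bytes on a typical LP64 system, and `chase_pointer` is given a parameter and a return value here so the walk can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_PAD 7  /* assumed value: 8-byte pointer + 7 longs = 64 bytes on LP64 */

struct item {
    struct item *next;
    long pad[NUM_PAD];
};

/*
 * Link items[0..n-1] into a single chain in random order.
 * Returns the head of the chain, or NULL if memory runs out.
 */
static struct item *link_randomly(struct item *items, size_t n, unsigned seed)
{
    assert(n > 0);
    size_t *order = malloc(n * sizeof *order);
    if (order == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        order[i] = i;

    /* Fisher-Yates shuffle; rand() % k is slightly biased, fine for a sketch */
    srand(seed);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }

    for (size_t i = 0; i + 1 < n; i++)
        items[order[i]].next = &items[order[i + 1]];
    items[order[n - 1]].next = NULL;  /* the chain ends here */

    struct item *head = &items[order[0]];
    free(order);
    return head;
}

/* Walk the chain as on the slide; returns the number of items visited. */
static size_t chase_pointer(const struct item *p)
{
    size_t visited = 0;
    while (p != NULL) {
        p = p->next;
        visited++;
    }
    return visited;
}
```

The number of items times the item size gives the working set size, so varying `n` is one way to sweep the working set across cache capacities.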
  20. Profiling Accuracy (4/4)
      • Comparison with native profiling
      • Line chart: cache misses per memory access (y-axis) against working set size in KB (x-axis, 256 to 3072) for Native, KVM guest-wide, KVM system-wide, and QEMU system-wide
  21. Evaluation question #3
      What is the difference between CPU switch and domain switch?
  22. Recap
      • Timeline in both diagrams: guest1, VMM (I/O for guest1), guest2, VMM (I/O for guest2), guest2
      • CPU switch
        – the three in-guest segments are accounted to guest 1, guest 2, and guest 2
      • Domain switch
        – guest1 plus its VMM I/O is accounted to guest 1; guest2 plus its VMM I/O is accounted to guest 2
  23. Profiling Packet Receive (1/2)
      • Experiment
        – push packets to a Linux guest in KVM
        – run OProfile in the guest
        – monitor instruction retirements
      • Setup diagram: Linux on hardware with a NIC, linked to a Linux guest with a virtual NIC on KVM on hardware with a NIC
  24. Profiling Packet Receive (2/2)

      CPU Switch
      INSTR   Function
      167     csum_partial
      106     csum_partial_copy_generic
      74      copy_to_user
      47      ipt_do_table
      38      tcp_v4_rcv
      …       …

      Domain Switch
      INSTR   Function                    Group
      2261    cp_interrupt                I/O related
      1336    cp_rx_poll                  I/O related
      1034    cp_start_xmit               I/O related
      421     native_apic_mem_write       I/O related
      374     native_apic_mem_read        I/O related
      …       …
      191     csum_partial                Packet processing
      105     csum_partial_copy_generic   Packet processing
      94      copy_to_user                Packet processing
      79      ipt_do_table                Packet processing
      51      tcp_v4_rcv                  Packet processing
      …       …

      Domain switch gives more insight for I/O operations.
  25. Related Work
      • XenOprof
        – first profiler targeting virtual machines
        – system-wide profiling for Xen
      • Linux perf
        – a profiling infrastructure for Linux
        – limited support for profiling KVM Linux guests
      • VMware vmkperf
        – only reads and writes CPU performance counters
  26. Conclusions

                              Para-            Hardware     Binary
                              virtualization   assistance   translation
      Guest-wide profiling    √                √            √
      System-wide profiling   XenOprof         √            √