LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszutek Wilk, Oracle


For many years, the Xen community has been delivering a solid virtualization platform for the enterprise. In support of the Xen community innovation effort, Oracle has been translating its enterprise experience with mission-critical workloads and large-scale infrastructure deployments into upstream contributions to the Linux and Xen efforts. In this session, you'll hear from a key Oracle expert and community member about Oracle contributions that focus on large-scale Xen deployments, networking, PV drivers, the new PVH architecture, performance enhancements, dynamic memory usage with 'tmem', and much more. This is your chance to get an under-the-hood view and see why the Xen architecture is the ideal choice for the enterprise.

Transcript

  • 1. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Why Use Xen for Large Scale Enterprise Deployments? Konrad Rzeszutek Wilk, Software Development Manager
  • 2. Safe Harbor Statement: The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 3. Agenda: A bit of history – Where does the code come from? – Distributions and kernels – Features – The end result.
  • 4. Unbreakable Enterprise Kernel and Oracle Linux purpose: • Red Hat and Oracle split: – Oracle supports a distribution based on RHEL but with its own kernel, the Unbreakable Enterprise Kernel (UEK), because we want better performance for customers. The kernel is updated more often, with features and benefits that take advantage of Oracle products. – The Oracle Linux distribution is therefore offered along with UEK kernels, and the UEK kernel is also used in other products, such as OVM.
  • 5. Oracle’s virtualization product (OVM): We use Xen as the hypervisor. For the kernel we use UEK; in the past (OVM 2) we had a SLES-based kernel. • OVM 2 (Xen 3.4) – Linux 2.6.32 based on the SLES Xen patches (classic). The newer releases are based on paravirt ops (pvops): • OVM 3 (Xen 4.1) – UEK2 kernel (2.6.39) • OVM 3.3 (Xen 4.3) – UEK3 kernel (3.8.13)
  • 6. Kernels (UEK: 2.6.39, 3.8). • Oracle’s approach: – Available to anybody (https://oss.oracle.com/git/). – Make features available to everybody. • The best way is to have them upstream, so every distribution can have them. • The end goal is for applications to run as well as they can. • A large set of patches (a big divergence from upstream) inhibits this, as there is a lot of complexity in them. The classic Xen patches are an example of this.
  • 7. Developers’ approach to patches: • We forget what we did after six months (more or less). • Want the code in one place (one repository). • Want to develop new features against that code to make it better and faster; don’t want to retouch the old code over and over. • Want to fix new bugs in new, shiny code. • Big patches are scary.
  • 8. Quality assurance approach to patches: • Want to find the bug and have it fixed. – Don’t want bugs to reappear later in a new version of the kernel (regressions). • Want to catch new bugs and expose new scenarios, not find old bugs. • Ideal situation: – new hardware = new bugs; – no new hardware = old bugs.
  • 9. Unbreakable Enterprise Kernel origin: Linux kernel: 2.6.32 (…) 3.0 (…) 3.8 (…) 3.11 (…) 3.15. Linux stable tree: 2.6.32-LT, 3.0-LT, 3.8-LT. Unbreakable Linux: UEK1, UEK2, UEK3. Patches are backported from upstream (Linus’s tree) for new functionality. The long-term kernels are where the community puts in the fixes and features deemed necessary by maintainers. The version number only gives an idea of the origin; for example, UEK2 reports 2.6.39 but was based on 3.0, and some of the code is from 3.11.
  • 10. The process to make this work: • Patches MUST go upstream (Linus’s tree). • New functionality is developed against the upstream kernel. • Bug fixes are also developed against the upstream kernel (where applicable, as some code has been reworked). • In some instances, where patches do not make sense upstream, we keep them in our tree. • The problem we had with OVM 2 was that it had a huge patchset of Xen code that was not in any way easy to review.
  • 11. Upstreaming Xen in Linus’s tree: We started by slowly integrating it piece by piece, one on top of the other. • Linux 3.0 had the initial domain support (but no backend drivers). • Later versions gained the different backend drivers (block, network, etc.). • For the Xen hypervisor we did not have a huge patchset, so that was much easier. • What we ended up with is a flow: Linus’s tree → UEK tree → OVM and Oracle Linux; Xen upstream → OVM.
  • 12. The “problem” with Linus’s tree and the Xen tree: • High quality of code. – Code has to go through numerous reviews before being accepted, and that takes time. • The end result: – High-quality, beautiful code. – Performance driven (no maintainer wants code that slows things down). – The existing code improves. • A fantastic side effect is that other distributions and users gain these features right out of the box (Fedora Core, Debian, Red Hat, etc.).
  • 13. Linux features that we are developing: • Data safety: – DIF and DIX (Data Integrity Field / Data Integrity Extensions); hardening ext4 and XFS against fuzzing attacks and corrupt filesystems. – DIRECT_IO – bypass caches so that data goes directly to the disk. – Expose this via the AIO system calls for applications. • Better use of CPU and memory for: – Making fsck work faster. – Deduplication on various filesystems (btrfs). – Faster snapshotting. – Quota calculations on XFS. • DTrace.
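The DIF/DIX idea on this slide can be sketched in miniature: attach an integrity tag to each sector when data is written and verify it again on read, so corruption anywhere along the I/O path is caught. This is only a toy illustration; real T10 DIF uses a 16-bit guard CRC plus reference and application tags computed by the HBA and disk, not application code, and CRC32 here merely stands in for that guard tag.

```python
import binascii

SECTOR = 512  # classic sector size; DIF appends 8 protection bytes per sector

def protect(data: bytes):
    """Split data into sectors and attach a CRC32 guard tag to each."""
    sectors = [data[i:i + SECTOR] for i in range(0, len(data), SECTOR)]
    return [(s, binascii.crc32(s)) for s in sectors]

def verify(protected) -> bool:
    """Recompute each guard tag; any mismatch means corruption."""
    return all(binascii.crc32(s) == tag for s, tag in protected)

payload = bytes(range(256)) * 8            # 2048 bytes = 4 sectors
disk = protect(payload)
assert verify(disk)

# Simulate a single flipped byte in sector 2, as a failing path might produce.
bad = list(disk)
s, tag = bad[2]
bad[2] = (s[:10] + bytes([s[10] ^ 0xFF]) + s[11:], tag)
assert not verify(bad)                     # the guard tag catches it
```

The point of pushing this into ext4/XFS and the AIO path is that the tag travels with the data, so the mismatch is detected before bad data is returned to the application.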
  • 14. Linux features we have been developing: • NFS/RDMA (InfiniBand), NFS v4.0, support for the NFS client using ZFS storage and Solaris NFS. • Security fixes before Linux gets released (and after, too). • Xen: – The initial domain support and hardware features to match classic Xen support. – Features in the block frontend and backend to improve I/O. – Lower latency for PCI passthrough devices. – Near bare-metal performance of guests. – Continuous upstream presence to catch and fix regressions during Linus’s merge window. – ‘perf’ support for Xen, and more.
  • 15. In the Xen ecosystem (hypervisor and toolstack): • The Xen Advisory Board, where we collaborate with other companies using Xen: – To do more testing across all vendors’ workloads. – To get more developers. – Companies work together on features (the Xen block subsystem). • The OASIS VirtIO working group, to define the VirtIO specification. • Faster boot, faster memory deallocation/allocation for huge guests. • Faster performance on NUMA machines. • Faster guests – replacing PV with PVH.
  • 16. In the Xen ecosystem (hypervisor and toolstack): • ‘perf’ support – for a full-stack (hypervisor, guests, etc.) view of what they are running and where the performance bottlenecks are. • A Xen hypervisor debugger, to troubleshoot in the field. • Lower interrupt latencies for PCI passthrough. • Transcendent memory (cooperative memory ballooning with benefits): – An answer to memory overcommit. Linux balloons out pages it does not think it will use often but which can take a lot of memory space; the hypervisor can deduplicate and compress those across different guests. The end result is that we can fit more guests on a machine and still have good performance (sometimes even a 4% benefit!).
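The transcendent-memory savings described above come from two steps the hypervisor can apply to ballooned-out pages: deduplicate identical pages across guests, then compress the unique ones. A rough sketch of that accounting, with toy 4 KiB pages and zlib standing in for whatever compressor the hypervisor uses (the guest contents here are invented for illustration):

```python
import hashlib
import zlib

PAGE = 4096

def tmem_store(pages):
    """Store pages deduplicated by content hash, compressed with zlib."""
    pool = {}
    for p in pages:
        key = hashlib.sha256(p).digest()
        if key not in pool:          # identical page already stored once
            pool[key] = zlib.compress(p)
    return pool

# Two "guests" ballooning out pages; many are identical (e.g. zero pages).
guest_a = [b"\x00" * PAGE] * 50 + [bytes([i]) * PAGE for i in range(1, 11)]
guest_b = [b"\x00" * PAGE] * 30 + [bytes([i]) * PAGE for i in range(1, 6)]
pool = tmem_store(guest_a + guest_b)

raw = (len(guest_a) + len(guest_b)) * PAGE
stored = sum(len(v) for v in pool.values())
print(f"{raw} bytes ballooned out, {stored} bytes kept ({len(pool)} unique pages)")
```

With 95 pages ballooned out but only 11 distinct page contents, almost all of the guest memory collapses into a small pool, which is exactly how more guests fit on the same machine.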
  • 17. Exadata Database Machine (X4-2, X4-4 and X4-8 models).
  • 18. X4-8 (from the Sun Server X4-8 Service Manual).
  • 19. Under the hood we have: • NUMA: – 2, 4 or 8 sockets (CPUs). – Each socket has its own local memory. – PCIe slots hang off sockets (I/O NUMA), with InfiniBand or flash cards in them. – All sockets are connected via QuickPath Interconnect (QPI). • For best performance we don’t want to use QPI excessively; a solution is: – Partitioning per socket. – Guests of various sizes that each reside within their NUMA node. • Combined with intelligent software (GRID, Oracle RAC), this gives top-notch performance.
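The per-socket partitioning described above amounts to a placement constraint: every guest gets a single NUMA node with enough free local memory, so its pages never have to cross QPI. A minimal first-fit sketch of that placement; the node sizes and guest names are hypothetical, and real toolstacks also weigh vCPU counts and I/O locality:

```python
def place(guests, node_mem):
    """First-fit decreasing: pin each guest to one NUMA node with room."""
    free = list(node_mem)
    placement = {}
    for name, mem in sorted(guests, key=lambda g: -g[1]):  # biggest first
        for node, avail in enumerate(free):
            if mem <= avail:
                free[node] -= mem
                placement[name] = node
                break
        else:
            raise RuntimeError(f"guest {name} does not fit in any single node")
    return placement

# Hypothetical 2-socket box, 256 GiB of local memory per socket.
nodes = [256, 256]
guests = [("db1", 180), ("db2", 120), ("app1", 60), ("app2", 40)]
print(place(guests, nodes))   # every guest stays inside one node
```

A guest larger than any single node would force remote-memory access over QPI, which is exactly the case the partitioning tries to avoid.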
  • 20. Networking – 40G and more: • Multiple ways of getting better performance: – PCIe passthrough (InfiniBand or network interface cards). – SR-IOV, which we concentrate on for best performance on Engineered Systems. But no migration! – The Intel Data Plane Development Kit (DPDK): low latency, but no migration! – Improving Xen netback and netfront (Citrix driven; they are the maintainers of the Linux Xen netback driver). • We want the guest to run without invoking the hypervisor for privileged operations (fewer VMEXITs): – Interrupts go directly to the guest (posted interrupts); an improvement in Linux uses the vAPIC instead of event channels for PCIe interrupts. – Lower the latency of interrupt delivery when we do have to go through the hypervisor.
  • 21. Storage: more IOPS! • The classic OVM deployment is OCFS2 shared across different hosts. • We have SSDs, now PCIe flash, and in the future NVMe. • For better performance we: – Improve the Xen block frontend and backend; joint projects with Citrix on increasing throughput and lowering latency. – Use SR-IOV for even higher throughput and lower latency (but no migration) on Engineered Systems.
  • 22. Guest improvements: • The ParaVirtualized guest problem: – Page-table updates and syscalls require a context switch to the hypervisor. – ParaVirtualized Hardware (PVH) uses the hardware to do page-table updates and syscalls instead of requiring the guest to issue hypercalls. The end result is the removal of bottlenecks in PV.
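The PV bottleneck on this slide can be made concrete with a toy trap-count model: a PV guest must enter the hypervisor for page-table writes (Xen lets the kernel batch them into multicalls, so it is roughly one entry per batch), while a PVH guest using hardware-assisted paging needs none. The batch size below is an illustrative assumption, not a measured Xen value:

```python
def pv_traps(updates: int, batch: int = 32) -> int:
    """PV: page-table writes trap to the hypervisor; batching via
    multicalls reduces this to ceil(updates / batch) entries."""
    return -(-updates // batch)        # ceiling division

def pvh_traps(updates: int) -> int:
    """PVH: hardware-assisted paging handles the updates; no traps."""
    return 0

n = 10_000
print(f"PV:  {pv_traps(n)} hypervisor entries for {n} page-table updates")
print(f"PVH: {pvh_traps(n)} hypervisor entries")
```

Even with aggressive batching the PV guest pays hundreds of hypervisor entries for a burst of page-table churn; PVH removes that cost entirely, which is the bottleneck removal the slide refers to.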
  • 23. Xen hypervisor bottlenecks: • Identify them using ‘perf’ to visualize and get the full system stack (hypervisor and guests).
  • 24. Xen transcendent memory: • Memory is becoming a bottleneck in virtualized systems – we want more! However, we have memory-inefficient workloads.
  • 25. End goal: • Performance, high quality, stability and security for all different workloads. • Push patches upstream to benefit everybody.
  • 26. Oracle is hiring! konrad.wilk@oracle.com