Xen summit 2010 extending xen into embedded


Published in: Technology


  1. Xen Summit 2010: Extending Xen into Embedded and Communications Workloads
  2. Agenda
     • Embedded Usage Models
     • Virtual Machine Monitor Requirements
     • Benchmarking
     • Cisco Product Range
     • Embedded Development Requirements
     • High Availability
  3. Embedded Usage Models
     • Robotics: using Core microarchitecture for a GUI interface with real-time industrial control.
     • IP Media Phones: Atom-based platforms delivering Internet connectivity and media content to continuously connected devices.
     • Routing: Xeon microarchitecture-based platforms implement control- and data-plane services on high-end routers.
     • Unique VMM requirements across all segments.
  4. Virtual Machine Monitor Implementation
     • Industrial: industrial control requires determinism. Performance is measured in interrupt latency (10 usec or lower).
     • Comm’s Appliance: scalability, flexibility, RAS and failover are a few of the vmm requirements in a Comm’s appliance environment.
     • Media Phone: a critical partition is required to host the cell phone application; the hypervisor requires Quality of Service.
     • [Diagram: three stacks labeled Industrial, Comm’s Appliance and Media Phone. Partitions shown: RTOS (Service), Linux, Microsoft GUI partition, Linux, Critical App partition, RTOS and Shared Memory, running on a vmm (Industrial, Comm’s Appliance) or a thin vmm (Media Phone).]
  5. Embedded Virtualization - Advantages
     • Consolidation and preservation of legacy, proprietary, single-threaded operating systems.
     • Rapid deployment of new services.
     • Integrate a development environment separate from critical services.
     • [Diagram: Dataplane and Control guests (Legacy RTOS, Legacy RTOS, Linux) on a vmm with VT-d / SRIOV over a multi-core architecture (Core 0, Core 1), each core with rx/tx queues, attached to a 10 Gb/s PF.]
  6. Embedded Deployment Requirements
     • Single Core scheduling
       - Scheduling control for guest Quality of Service
       - Traffic prioritization to avoid packet loss requires (soft) real-time scheduling
       - Credit-based scheduler research in progress
       - [Diagram: Phone App, Application Development and Dom0 on Xen, on Atom with I/O.]
     • Grant Tables
       - Consolidate the fast path with a security (intrusion detection) application
       - The fast path requires an efficient mechanism to share IP packet data with the Linux application
       - Grant tables (io rings) may be an efficient mechanism to meet performance requirements (needs to be lock-free)
       - [Diagram: consolidated fast path. Dom0, a Linux intrusion detection guest and a fast path packet forwarding guest connected by io rings, on Xen, on Xeon with I/O.]
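The lock-free requirement on slide 6 can be illustrated with a single-producer, single-consumer ring. What follows is a minimal C11 sketch under stated assumptions, not Xen's io ring or grant table interface: `struct pkt_desc`, `struct spsc_ring`, `RING_SLOTS` and the `ring_*` helpers are hypothetical names, and the sketch assumes exactly one producer (the fast path) and one consumer (the Linux application) sharing the same memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u   /* hypothetical size; must be a power of two */

struct pkt_desc {       /* hypothetical descriptor, not a Xen structure */
    uint32_t offset;    /* offset of the packet data in the shared buffer */
    uint32_t len;       /* packet length in bytes */
};

struct spsc_ring {
    _Atomic uint32_t prod;              /* written only by the producer */
    _Atomic uint32_t cons;              /* written only by the consumer */
    struct pkt_desc slot[RING_SLOTS];
};

static void ring_init(struct spsc_ring *r)
{
    assert(r);
    atomic_init(&r->prod, 0);
    atomic_init(&r->cons, 0);
}

/* Producer side: returns false when the ring is full. No lock is taken. */
static bool ring_put(struct spsc_ring *r, struct pkt_desc d)
{
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_acquire);

    if (prod - cons == RING_SLOTS)
        return false;
    r->slot[prod & (RING_SLOTS - 1)] = d;
    /* Release: the slot contents become visible before the new index. */
    atomic_store_explicit(&r->prod, prod + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_get(struct spsc_ring *r, struct pkt_desc *out)
{
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_acquire);

    if (prod == cons)
        return false;
    *out = r->slot[cons & (RING_SLOTS - 1)];
    /* Release: the slot is handed back only after it has been copied out. */
    atomic_store_explicit(&r->cons, cons + 1, memory_order_release);
    return true;
}
```

Each index has a single writer, so neither side ever waits on the other; a full or empty ring is reported to the caller, which can then drop or retry according to its own policy.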
  7. Embedded Xen Deployment
     • The power profile of some edge-based appliances is cyclical, so potential power savings can be substantial (example: Base Station Controller).
     • ACPI is generally not supported in real-time / proprietary operating systems.
     • Hypervisor power management could be very useful to control the overall power budget.
     • “Shelf Manager” power management research in progress.
     • Intelligent power management balances I/O latency & throughput.
     • [Charts: Data and Voice load profiles from 6am to 6pm, vertical axis 0 to 120.]
     • [Diagram: three multi-core Xen nodes, each running Dom0 and several Fast Path guests, coordinated by a shelf manager (Shf mgr).]
  8. Embedded Xen – Direct Cache Access
     • DCA (Direct Cache Access) delivers data into the cache to reduce average memory latency, and attempts to reduce memory bandwidth.
     • The DCA driver uses get_cpu() to gather the APIC_ID, and uses this to configure the DCA-enabled NIC device:

         static void igb_update_dca(struct igb_q_vector *q_vector)
         {
                 struct igb_adapter *adapter = q_vector->adapter;
                 struct e1000_hw *hw = &adapter->hw;
                 int cpu = get_cpu();    /* Get the current CPU Id */

                 if (q_vector->cpu == cpu)
                         goto out_no_update;
                 /* ... */

     • get_cpu() must return the valid APIC ID of the CPU core where the guest is executing.
     • [Diagram labels: memory, ctrl, CPU, Cache, IOH, DCA, I/O; Dom0 and two Guests on Xen; CPU with two Caches.]
  9. Benchmarking, 10 GbE perspective
     • A 64B packet can arrive every 67.2 ns.
     • In terms of processor cycles: @ 2.53 GHz, a 64B packet arrives every ~170 cycles.
     • Can generate up to 14.88 million Rx and 14.88 million Tx transactions (packets) every second.
     • Each packet has a 16B descriptor associated with it, which must be written for every packet that needs to be processed.
     • The Linux forwarding code takes ~3000 cycles to process a packet.
     • With enhancement we can reduce the number of cycles per (64 Byte) packet to ~1350 cycles.
     • [Chart: packets per second (0 to 16,000,000) versus packet size (64 to 1468 bytes).]
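The 10 GbE figures on slide 9 follow from simple arithmetic, sketched below in C. The sketch assumes 20 bytes of per-frame wire overhead (7 B preamble, 1 B start-of-frame delimiter, 12 B inter-frame gap), which is standard Ethernet framing but is not stated on the slide; all function names are hypothetical, and the core estimate ignores every cost other than the per-packet cycle count.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 7 B preamble + 1 B start-of-frame delimiter + 12 B gap. */
#define WIRE_OVERHEAD_BYTES 20u

/* Bits occupied on the wire by one frame, including framing overhead. */
static uint64_t wire_bits(uint32_t frame_bytes)
{
    return (uint64_t)(frame_bytes + WIRE_OVERHEAD_BYTES) * 8u;
}

/* Time between back-to-back frames, in picoseconds. */
static uint64_t frame_interval_ps(uint32_t frame_bytes, uint64_t link_bps)
{
    assert(link_bps > 0);
    return wire_bits(frame_bytes) * 1000000000000ull / link_bps;
}

/* Maximum frames per second in one direction, rounded down. */
static uint64_t frames_per_second(uint32_t frame_bytes, uint64_t link_bps)
{
    return link_bps / wire_bits(frame_bytes);
}

/* CPU cycles that elapse between frame arrivals, rounded down. */
static uint64_t cycles_per_frame(uint32_t frame_bytes, uint64_t link_bps,
                                 uint64_t cpu_hz)
{
    assert(link_bps > 0);
    return wire_bits(frame_bytes) * cpu_hz / link_bps;
}

/* Cores needed to keep up with a packet rate, rounded up. */
static uint64_t cores_for_line_rate(uint64_t pps, uint64_t cycles_per_pkt,
                                    uint64_t cpu_hz)
{
    assert(cpu_hz > 0);
    return (pps * cycles_per_pkt + cpu_hz - 1) / cpu_hz;
}
```

For a 64-byte frame on a 10 Gb/s link, `frame_interval_ps` gives 67200 ps (67.2 ns) and `frames_per_second` gives about 14.88 million. Under the same assumptions, `cores_for_line_rate` at 2.53 GHz suggests roughly 18 cores at ~3000 cycles per packet and 8 cores at ~1350, which is why the per-packet cycle count dominates the benchmarking discussion.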
  10. Guest Forwarding Performance
      • [Chart: Layer 3 forwarding, native versus virtualized, 2-port (1 core, 1 thread); packets per second (PPS) versus packet size (64, 128, 256, 512, 768, 1024, 1280, 1518 bytes).]
      • [Diagram: two Linux forwarding guests with VT-d on a vmm over a multi-core architecture (Core 0, Core 1), each with its own I/O.]
      • Single-threaded virtualized environments show promising performance:
        - Near-native performance for small packet sizes
        - Native performance for large packet sizes (>256B)
      • Limited performance penalty for consolidation; additional scaling tests in progress.
  11. Cisco Embedded Product Space
      Wide range of products in a number of market segments:
      • Service Provider: ASR 9000, CRS
      • Data Center: UCS, Nexus 7000, MDS 9222i (SAN)
      • Voice & Video: TelePresence, Unified Communications
      • Enterprise: ASR 1000
      • Security: Ironport, ASA 5500
      • Branch: 3900 ISR, 2800 ISR
      • Home: Flip Video, Valet
  12. Embedded Product Environment
      • Hardware Environment
        - General-purpose CPUs, SoCs, ASICs, FPGAs, custom processors, ixp, DSPs, …
        - From large multi-core, multi-blade, multi-chassis systems to small single/dual core devices
        - Terabit to Gigabit I/O
      • Software Environment
        - Multi-OS: IOS, IOS-XE, IOS-XR, NX-OS; proprietary (legacy), Linux, other …
        - Single threaded, multi-threaded, pipelined, flow-based, …
        - Multiple vm models: integrated services platform, distributed/load balancing, HA, control & data separation, …
        - Control plane, data plane, management plane, appliance and service engines, … e.g., routing, data, voice, video, deep packet inspection, firewall, security, etc.
        - Memory, processor, and I/O bandwidth requirements vary by application and network device location
  13. Embedded Development Requirements
      • We believe that xen is the right choice for an embedded hypervisor
        - Early support for prototype hardware required, in hypervisor and dom0
        - Open source xen and linux critical to this effort
        - It’s the right architecture and feature set for embedded development
      • RAS
        - High Availability (HA) for guests: non-disruptive stateful failover, non-disruptive in service software upgrade (ISSU)
        - Devices hot pluggable/removable (non-disruptive): shared & dedicated (including sr-iov)
        - dom0: separate device driver domains good, but not enough; all domains need to be restartable
      • Deterministic Performance
        - QoS control through configuration and scheduling
        - I/O linearly scalable across cores and vms
        - Low latency interrupts
  14. Embedded Development Requirements
      • Core allocation/Scheduling: vcpu to pcpu mapping
        - (pinned, non-shared): deterministic performance
        - (pinned, shared), (non-pinned, shared): scheduled
      • Results for pv IOS, I/O workload, 64-byte packets, 2 ports, bidirectional, 64-bit xen, NUMA on:
        - (pinned, non-shared), HT off: 100% line rate (1Gb) per core; <0.1% time spent in hypervisor
        - (non-pinned, shared), HT off: ~10% decreased throughput
        - (pinned, non-shared), NUMA-remote, HT off: ~8% decreased throughput
        - (pinned, non-shared), HT on, one on each thread on the core: 1.5x/1.7x (I/O/cpu) increase in throughput (aggregate); .75x/.85x (I/O/cpu) throughput per transaction, single thread
        - (pinned, non-shared), HT on, only one thread on the core in use: same as (pinned, non-shared), HT off
      • Guest Support
        - Both pv and hvm (hybrid!)
        - 32-bit & 64-bit
        - Virtual memory paged and non-paged (single, flat address space)
  15. Embedded Development Requirements
      • Debug and Performance Monitoring
        - Multi-guest, simultaneous 32-bit & 64-bit guests (minimum is gdbsx for both pv & hvm)
        - Performance monitoring tools (access to PMU data: Xenoprof & others)
        - Required in the field as well as during development
      • Trusted Systems: Secure Products
        - Trusted boot, TPM, Intel TXT/AMD-V
        - Trusted guests, sandboxed 3rd party guests, anti-counterfeiting, …
      • Manageable Power Management
        - Especially at the edge, branch, and consumer devices
        - Policy based, managed by hypervisor
        - Cases where guest should not be automatically power managed
      • “carrier class” xen Development Environment
        - Support for rapid prototyping
        - Support for production product environment
  16. HA Requirements
      • Rationale
        - HA & ISSU features are available on many platforms across our product space today
        - Cannot go to market without support in certain product spaces
        - Software fails much more often than hardware
        - Software-only HA/ISSU at much lower cost is very attractive
        - Natural fit on multi-core devices
      • High Availability (HA)
        - Active-Standby: stateful, “hot” Standby
        - Failure of Active causes non-disruptive failover to Standby
        - Reconciliation required on switchover
        - Standby progresses through state machine to Active state
        - I/O devices always belong to Active and switch to [new] Active without loss of state
        - Packet loss ok on switchover: higher level protocols recover
        - Downstream end of device connection must not see a “failure”
        - Switchover must take place in < 1 sec.
      • In Service Software Upgrade (ISSU)
        - Built on HA infrastructure
        - Automated software upgrade (or downgrade)
        - Non-disruptive: fallback if required or requested
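The Active-Standby switchover on slide 16 can be pictured as a small state machine with a time budget. The C sketch below is illustrative only: the slide says the Standby "progresses through state machine to Active state" without naming the states, so `HA_STANDBY_HOT`, `HA_RECONCILING`, `HA_ACTIVE`, the two events and all function names are assumptions. Only the 1-second budget comes from the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states and events; the slide does not enumerate them. */
enum ha_state { HA_STANDBY_HOT, HA_RECONCILING, HA_ACTIVE };
enum ha_event { EV_ACTIVE_FAILED, EV_RECONCILE_DONE };

#define SWITCHOVER_BUDGET_MS 1000u  /* from the slide: switchover in < 1 sec */

struct ha_peer {
    enum ha_state state;
    uint64_t failover_start_ms;     /* time at which EV_ACTIVE_FAILED was seen */
};

/* Advance the Standby's state in response to one event. */
static void ha_handle(struct ha_peer *p, enum ha_event ev, uint64_t now_ms)
{
    assert(p);
    switch (p->state) {
    case HA_STANDBY_HOT:
        if (ev == EV_ACTIVE_FAILED) {
            p->state = HA_RECONCILING;      /* reconciliation on switchover */
            p->failover_start_ms = now_ms;
        }
        break;
    case HA_RECONCILING:
        if (ev == EV_RECONCILE_DONE)
            p->state = HA_ACTIVE;           /* now owns the I/O devices */
        break;
    case HA_ACTIVE:
        break;                              /* nothing further to do */
    }
}

/* True while the elapsed switchover time is still under the budget. */
static bool ha_within_budget(const struct ha_peer *p, uint64_t now_ms)
{
    assert(p);
    return now_ms - p->failover_start_ms < SWITCHOVER_BUDGET_MS;
}
```

Events that do not apply in the current state are ignored, so a stray reconciliation message cannot promote a Standby whose Active is still healthy.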
  17. HA Requirements
      What is needed:
      • Reliable, fast failure detection mechanism
        - Current: hardware uses interrupt pin; backup is heart-beat mechanism (slow)
        - Need to emulate/implement fast, reliable failure detection mechanism in xen
      • Fail over device transparently from Active to Standby
        - No loss of [device] state
        - Packet traffic dropped until Standby transitions to Active
      • Interrupts redirected to new Active (old Standby) on failover
        - Interrupts dropped until Standby transitions to Active
        - [new] Active must be able to address outstanding interrupts without complete reset
      • Need to be able to run in redundant hardware configuration or on multi-core device
        - Drivers responsible for appropriate reconciliation protocols
      • Minimize the changes to xen kernel and dom0 code
        - Recovery decisions need to be in the domain of the guest driver
      • Support for direct assign devices (including sr-iov) and shared devices
      • Non-shared-memory solution for DMA target memory preferred
        - Requires ability to either pre-program and switch or reprogram and switch on failover
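Slide 17 calls the heart-beat backup "slow". A minimal C sketch of a heartbeat monitor shows where that latency comes from: a failure can only be declared after several expected beats have gone missing. The structure, field names and functions are hypothetical and do not correspond to any Xen or Cisco interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct hb_monitor {                 /* hypothetical monitor state */
    uint64_t interval_ms;           /* expected spacing between heartbeats */
    uint32_t max_missed;            /* beats that may be missed before failure */
    uint64_t last_seen_ms;          /* arrival time of the latest heartbeat */
};

/* Record a heartbeat received from the Active. */
static void hb_beat(struct hb_monitor *m, uint64_t now_ms)
{
    assert(m);
    m->last_seen_ms = now_ms;
}

/* Minimum silence, in milliseconds, before the peer is declared failed. */
static uint64_t hb_detection_threshold_ms(const struct hb_monitor *m)
{
    assert(m);
    return m->interval_ms * m->max_missed;
}

/* True once the peer has been silent for longer than the threshold. */
static bool hb_peer_failed(const struct hb_monitor *m, uint64_t now_ms)
{
    return now_ms - m->last_seen_ms > hb_detection_threshold_ms(m);
}
```

With an illustrative 100 ms interval and three tolerated misses, detection alone consumes 300 ms of a 1-second switchover budget, whereas an interrupt-style notification is not bound to a polling interval.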
  18. “carrier class” xen Development Environment
      Needs to support 2 different environments:
      • Rapid prototyping and development of new services
        - Work often requires unstable branch, pre-release/prototype hardware
        - Straightforward, and accessible to the non-xen expert
        - Interest is in getting the prototype/product up and running quickly rather than in xen infrastructure
        - Developer threads, blogs, etc. are not a substitute for up-to-date documentation
        - Product decisions (go/no go) based on prototype results
        - Failure/missed deadlines will eliminate a prototype as a possible solution
        - Corporate networks/labs are behind firewalls and use proxies: doesn’t work well with current git-based source control; requires exceptions to corporate IT policy
      • Production product
        - Uses stable release
        - Controlled access to performance & debug tools in customer environment
        - Documentation required in field as well
        - Auditing requires ability to reproduce image bit-for-bit from local build
  19. Summary
      • Embedded market provides a great growth opportunity
      • Deployment requires some unique features
      • Xen is well positioned but requires support for RAS features, debug and a “Carrier Class” release