vSphere 6.x Host Resource Deep Dive
Frank Denneman
Niels Hagoort
INF8430
#INF8430
Agenda
• Compute
• Storage
• Network
• Q&A
Introduction
www.cloudfix.nl
Niels Hagoort
• Independent Architect
• VMware VCDX #212
• VMware vExpert (NSX)
Frank Denneman
• Enjoying Summer 2016
• VMware VCDX #29
• VMware vExpert
www.frankdenneman.nl
Compute
(NUMA, NUMA, NUMA)
Insights In Virtual Data Centers
Modern dual-socket CPU servers are Non-Uniform Memory Access (NUMA) systems
Local and Remote Memory
NUMA Focus Points
• Caching Snoop modes
• DIMM configuration
• Size VM to match CPU topology (see the host check below)
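A minimal host check, sketched with standard esxcli namespaces (the prompt and host name are illustrative):
# Report physical memory and the number of NUMA nodes
[root@ESXi01:~] esxcli hardware memory get
# Report CPU packages, cores, threads and whether hyperthreading is enabled
[root@ESXi01:~] esxcli hardware cpu global get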
CPU Cache
(the forgotten hero)
CPU Architecture
Caching Snoop Modes
DIMM Configuration
(and why 384 GB is not an optimal configuration)
Memory Constructs
3 DPC – 384 GB – 2400 MHz DIMMs
DIMMs Per Channel
2 DPC – 384 GB – 2400 MHz DIMMs
Current Sweet Spot: 512 GB
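A worked sizing example behind the two configurations above, assuming a dual-socket Xeon E5 v4 host with four memory channels per socket (the exact speed step-down at 3 DPC is platform dependent):
2 sockets × 4 channels × 3 DPC × 16 GB DIMMs = 384 GB, but the third DIMM per channel typically forces the memory bus below its rated 2400 MHz
2 sockets × 4 channels × 2 DPC × 32 GB DIMMs = 512 GB, with every channel balanced at 2 DPC and running at the full rated speed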
Right Size your VM
(Alignment equals consistent performance)
ESXi NUMA focus points
• CPU scheduler allocates core or HT cycles
• NUMA scheduler handles initial placement (IP) + load balancing (LB)
• vCPU configuration impacts IP & LB
Scheduling constructs
12 vCPU On 20 Core System
Align To CPU Topology
• Resize vCPU configuration to match core count
• Use numa.vcpu.preferHT
• Use cores per socket (CORRECTLY)
• Attend INF8089 at 5 PM in this room
Prefer HT + 12 Cores Per Socket
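A minimal .vmx sketch of the configuration named above, assuming a dual-socket host with 10 cores (20 threads) per socket; numa.vcpu.preferHT lets the 12 vCPUs stay within one NUMA node by also counting HT threads, and cores per socket is set to match (change these only while the VM is powered off):
numvcpus = "12"
cpuid.coresPerSocket = "12"
numa.vcpu.preferHT = "TRUE"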
Storage
(How far away is your data?)
The Importance of Access Latency
Location of operands | CPU cycles | Perspective
CPU register | 1 | Brain (nanosecond)
L1/L3 cache | 10 | End of this room
Local memory | 100 | Entrance of building
Disk | 10^6 | New York
Every Layer = CPU Cycles & Latency
Industry Moves Toward NVMe
• SSD bandwidth capabilities exceed current controller bandwidth
• Protocol inefficiencies are a dominant contributor to access time
• NVMe is architected from the ground up for non-volatile memory
I/O Queue Per CPU
Driver Stack
Not All Drivers Are Created Equal
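A minimal sketch for checking which driver a storage adapter is actually using, assuming the inbox NVMe module on this host is named nvme (adapter and module names differ per system):
# List storage adapters and the driver module each one is bound to
[root@ESXi01:~] esxcli storage core adapter list
# Inspect the loaded module (version, build, parameters), as done for bnx2x later on
[root@ESXi01:~] vmkload_mod -s nvme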
Network
pNIC considerations for VXLAN
performance
• Additional layer of packet processing
• Consumes CPU cycles per packet for encapsulation/de-capsulation
• Some of the TCP-based offload capabilities of the NIC cannot be used
• VXLAN offloading (TSO / CSO) restores these offloads
VXLAN
[root@ESXi02:~] vmkload_mod -s bnx2x
vmkload_mod module information
input file: /usr/lib/vmware/vmkmod/bnx2x
Version: Version 1.78.80.v60.12, Build: 2494585, Interface: 9.2 Built on: Feb 5 2015
Build Type: release
License: GPL
Name-space: com.broadcom.bnx2x#9.2.3.0
Required name-spaces:
com.broadcom.cnic_register#9.2.3.0
com.vmware.driverAPI#9.2.3.0
com.vmware.vmkapi#v2_3_0_0
Parameters:
skb_mpool_max: int
Maximum attainable private socket buffer memory pool size for the driver.
skb_mpool_initial: int
Driver's minimum private socket buffer memory pool size.
heap_max: int
Maximum attainable heap size for the driver.
heap_initial: int
Initial heap size allocated for the driver.
disable_feat_preemptible: int
For debug purposes, disable FEAT_PREEMPTIBLE when set to value of 1
disable_rss_dyn: int
For debug purposes, disable RSS_DYN feature when set to value of 1
disable_fw_dmp: int
For debug purposes, disable firmware dump feature when set to value of 1
enable_vxlan_ofld: int
Allow vxlan TSO/CSO offload support.[Default is disabled, 1: enable vxlan offload, 0: disable vxlan offload]
debug_unhide_nics: int
Force the exposure of the vmnic interface for debugging purposes[Default is to hide the nics]1. In SRIOV mode expose the PF
enable_default_queue_filters: int
Allow filters on the default queue. [Default is disabled for non-NPAR mode, enabled by default on NPAR mode]
multi_rx_filters: int
Define the number of RX filters per NetQueue: (allowed values: -1 to Max # of RX filters per NetQueue, -1:
use the default number of RX filters; 0: Disable use of multiple RX filters; 1..Max # the number of RX filters
per NetQueue: will force the number of RX filters to use for NetQueue
........
[root@ESXi01:~] esxcli system module parameters list -m bnx2x
Name Type Value Description
---------------------------- ---- ----- -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------
RSS int Control the number of queues in an RSS pool. Max 4.
autogreeen uint Set autoGrEEEn (0:HW default; 1:force on; 2:force off)
debug uint Default debug msglevel
debug_unhide_nics int Force the exposure of the vmnic interface for debugging purposes[Default is to hide the nics]1. In SRIOV mode expose the PF
disable_feat_preemptible int For debug purposes, disable FEAT_PREEMPTIBLE when set to value of 1
disable_fw_dmp int For debug purposes, disable firmware dump feature when set to value of 1
disable_iscsi_ooo uint Disable iSCSI OOO support
disable_rss_dyn int For debug purposes, disable RSS_DYN feature when set to value of 1
disable_tpa uint Disable the TPA (LRO) feature
dropless_fc uint Pause on exhausted host ring
eee set EEE Tx LPI timer with this value; 0: HW default
enable_default_queue_filters int Allow filters on the default queue. [Default is disabled for non-NPAR mode, enabled by default on NPAR mode]
enable_vxlan_ofld int Allow vxlan TSO/CSO offload support.[Default is disabled, 1: enable vxlan offload, 0: disable vxlan offload]
gre_tunnel_mode uint Set GRE tunnel mode: 0 - NO_GRE_TUNNEL; 1 - NVGRE_TUNNEL; 2 - L2GRE_TUNNEL; 3 - IPGRE_TUNNEL
gre_tunnel_rss uint Set GRE tunnel RSS mode: 0 - GRE_OUTER_HEADERS_RSS; 1 - GRE_INNER_HEADERS_RSS; 2 - NVGRE_KEY_ENTROPY_RSS
heap_initial int Initial heap size allocated for the driver.
heap_max int Maximum attainable heap size for the driver.
int_mode uint Force interrupt mode other than MSI-X (1 INT#x; 2 MSI)
max_agg_size_param uint max aggregation size
mrrs int Force Max Read Req Size (0..3) (for debug)
multi_rx_filters int Define the number of RX filters per NetQueue: (allowed values: -1 to Max # of RX filters per NetQueue, -1: use the default number of RX filters; 0: Disable use of
multiple RX filters; 1..Max # the number of RX filters per NetQueue: will force the number of RX filters to use for NetQueue
native_eee uint
num_queues uint Set number of queues (default is as a number of CPUs)
num_rss_pools int Control the existence of a RSS pool. When 0,RSS pool is disabled. When 1, there will bea RSS pool (given that RSS > 0).
........
• Check the supported features of your pNIC
• Check the HCL for supported features in the driver module
• Check the driver module; does it require you to enable features?
• Is another async (vendor) driver available?
Driver Summary
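Sticking with the bnx2x listing above, a sketch of enabling a feature the driver exposes but leaves disabled by default; the new value only takes effect after the module is reloaded or the host is rebooted:
[root@ESXi01:~] esxcli system module parameters set -m bnx2x -p "enable_vxlan_ofld=1"
# Verify the stored value
[root@ESXi01:~] esxcli system module parameters list -m bnx2x | grep enable_vxlan_ofld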
RSS & NetQueue
• NIC support required (RSS / VMDq)
• VMDq is the hardware feature, NetQueue is the
feature baked into vSphere
• RSS & NetQueue similar in basic functionality
• RSS uses hashes based on IP/TCP port/MAC
• NetQueue uses MAC filters
Without RSS for VXLAN (1 thread per pNIC)
RSS enabled (>1 thread per pNIC)
How to enable RSS (Intel)
1. Unload the module: esxcfg-module -u ixgbe
2. Reload with RSS enabled
   Inbox driver: vmkload_mod ixgbe RSS="4,4"
   Async driver: vmkload_mod ixgbe RSS="1,1"
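vmkload_mod only affects the currently loaded module; a sketch of making the same RSS setting persistent across reboots through esxcli:
[root@ESXi01:~] esxcli system module parameters set -m ixgbe -p "RSS=4,4"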
Receive throughput with VXLAN using 10GbE
Intel examples:
Intel Ethernet product | RSS for VXLAN technology
Intel Ethernet X520/540 series | Scale RSS on VXLAN outer UDP information
Intel Ethernet X710 series | Scale RSS on VXLAN inner or outer header information
X710 series = better at balancing across queues and CPU threads
“What is the maximum performance of
the vSphere (D)vSwitch?”
• By default, one transmit (Tx) thread per VM
• By default, one receive (Netpoll) thread per pNIC
• Transmit (Tx) and receive (Netpoll) threads
consume CPU cycles
• Each additional thread provides capacity
(1 thread = 1 core)
Network IO CPU consumption
Netpoll Thread
%SYS is ± 100% during test. pNIC receives.
(This is the NETPOLL thread.)
NetQueue Scaling
{"name": "vmnic0", "switch": "DvsPortset-0", "id": 33554435, "mac": "38:ea:a7:36:78:8c", "rxmode": 0, "uplink": "true",
"txpps": 247, "txmbps": 9.4, "txsize": 4753, "txeps": 0.00, "rxpps": 624291, "rxmbps": 479.9, "rxsize": 96, "rxeps": 0.00,
"wdt": [
{"used": 0.00, "ready": 0.00, "wait": 41.12, "runct": 0, "remoteactct": 0, "migct": 0, "overrunct": 0, "afftype": "pcpu", "affval": 39, "name": "242.vmnic0-netpoll-10"},
{"used": 0.00, "ready": 0.00, "wait": 41.12, "runct": 0, "remoteactct": 0, "migct": 0, "overrunct": 0, "afftype": "pcpu", "affval": 39, "name": "243.vmnic0-netpoll-11"},
{"used": 82.56, "ready": 0.49, "wait": 16.95, "runct": 8118, "remoteactct": 1, "migct": 9, "overrunct": 33, "afftype": "pcpu", "affval": 45, "name": "244.vmnic0-netpoll-12"},
{"used": 18.71, "ready": 0.75, "wait": 80.54, "runct": 6494, "remoteactct": 0, "migct": 0, "overrunct": 0, "afftype": "vcpu", "affval": 19302041, "name": "245.vmnic0-netpoll-13"},
{"used": 55.64, "ready": 0.55, "wait": 43.81, "runct": 7491, "remoteactct": 0, "migct": 4, "overrunct": 5, "afftype": "vcpu", "affval": 19299346, "name": "246.vmnic0-netpoll-14"},
{"used": 0.14, "ready": 0.10, "wait": 99.48, "runct": 197, "remoteactct": 6, "migct": 6, "overrunct": 0, "afftype": "vcpu", "affval": 19290577, "name": "247.vmnic0-netpoll-15"},
{"used": 0.00, "ready": 0.00, "wait": 0.00, "runct": 0, "remoteactct": 0, "migct": 0, "overrunct": 0, "afftype": "pcpu", "affval": 45, "name": "1242.vmnic0-0-tx"},
{"used": 0.00, "ready": 0.00, "wait": 0.00, "runct": 0, "remoteactct": 0, "migct": 0, "overrunct": 0, "afftype": "pcpu", "affval": 22, "name": "1243.vmnic0-1-tx"},
{"used": 0.00, "ready": 0.00, "wait": 0.00, "runct": 0, "remoteactct": 0, "migct": 0, "overrunct": 0, "afftype": "pcpu", "affval": 24, "name": "1244.vmnic0-2-tx"},
{"used": 0.00, "ready": 0.00, "wait": 0.00, "runct": 0, "remoteactct": 0, "migct": 0, "overrunct": 0, "afftype": "pcpu", "affval": 39, "name": "1245.vmnic0-3-tx"} ],
3 Netpoll threads are used (3 worldlets).
Tx Thread
PKTGEN is polling, consuming near 100% CPU
%SYS = ± 100%
This is the Tx thread
• VMXNET3 is required!
• Example for vNIC2:
ethernet2.ctxPerDev = "1"
Additional Tx Thread
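A hedged .vmx sketch extending the setting to more than one vNIC; the VM needs a power cycle before the extra Tx threads appear:
ethernet1.ctxPerDev = "1"
ethernet2.ctxPerDev = "1"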
Additional Tx thread
%SYS = ± 200%
CPU threads in same NUMA node as VM
{"name": "pktgen_load_test21.eth0", "switch": "DvsPortset-0", "id": 33554619, "mac": "00:50:56:87:10:52", "rxmode": 0, "uplink": "false",
"txpps": 689401, "txmbps": 529.5, "txsize": 96, "txeps": 0.00, "rxpps": 609159, "rxmbps": 467.8, "rxsize": 96, "rxeps": 54.09,
"wdt": [
{"used": 99.81, "ready": 0.19, "wait": 0.00, "runct": 1176, "remoteactct": 0, "migct": 12, "overrunct": 1176, "afftype": "vcpu", "affval": 15691696, "name": "323.NetWdt-Async-15691696"},
{"used": 99.85, "ready": 0.15, "wait": 0.00, "runct": 2652, "remoteactct": 0, "migct": 12, "overrunct": 12, "afftype": "vcpu", "affval": 15691696, "name": "324.NetWorldlet-Async-33554619"} ],
2 worldlets
• Transmit (Tx) and receive (Netpoll) threads can be
scaled!
• Take the extra CPU cycles for network IO into
account!
Summary
Q&A
Keep an eye out for
our upcoming book!
@frankdenneman
@NHagoort