About the author: Priya Autee is a software engineer at Intel working on various leading-edge IA features and is an Intel(R) RDT expert. She focuses on prototyping and researching open source APIs such as DPDK and Intel(R) RDT to support NFV and compute-sensitive requirements on Intel Architecture. She holds a Master's degree in Computer Science from Arizona State University, Arizona.
2. 2
TRANSFORMING NETWORKING INFRASTRUCTURE
Intel® Resource Director Technology (Intel® RDT)
Building on a rich and growing portfolio of technologies embedded in Intel silicon
Cache Monitoring Technology (CMT)
§ Per-thread L3 occupancy monitoring
§ 4 Resource Monitoring IDs per logical thread
Memory Bandwidth Monitoring (MBM)
§ Per-thread memory bandwidth monitoring
§ Leverages RMID infrastructure
Cache Allocation Technology (CAT)
§ Per-thread L3 occupancy control
§ Code and Data Prioritization (CDP)
§ Intel® Xeon® processor E5 v4 introduces 16 Classes of Service
Supplements existing telemetry:
• Counters; Perfmon
• Intel® Node Manager
• Snap (open source project)
• Utilities in kernel and VMM; etc.
3. 3
Caching on IA
• 3 levels of cache (SNB, IVB, HSW, BDW processors)
• L1 cache – 32KB data and 32KB instruction caches
• L2 cache – 256KB – unified (holds code and data)
• L3 cache (LLC) – 25MB (IVB), 30MB (HSW) – common cache for all cores in a CPU socket
• L1 cache is the smallest and fastest.
• CPU tries to access data – not in L1 cache?
• Try L2 cache – not in L2 cache?
• Try L3 cache – not in L3 cache?
• Cache miss – need to access system memory (DRAM).
• L1 and L2 caches are per physical core (shared by the logical cores of that physical core)
• L3 cache is shared (per CPU socket)
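The lookup order on this slide can be sketched as a toy Python model. This is only an illustration, not a cycle-accurate simulation: the dictionaries standing in for each cache level and the `lookup` helper are hypothetical names introduced here.

```python
# Toy model of the L1 -> L2 -> L3 -> DRAM lookup order (illustrative only).
def lookup(address, l1, l2, l3, dram):
    """Return (level_name, value) for the first level that holds `address`."""
    for name, level in (("L1", l1), ("L2", l2), ("L3", l3)):
        if address in level:
            return name, level[address]
    # Missed in every cache level: fall back to system memory.
    return "DRAM", dram[address]


# Hypothetical contents: each level holds a subset of what is in DRAM.
dram = {0x10: "a", 0x20: "b", 0x30: "c"}
l3 = {0x10: "a", 0x20: "b"}
l2 = {0x10: "a"}
l1 = {}

for addr in (0x10, 0x20, 0x30):
    print(hex(addr), "served from", lookup(addr, l1, l2, l3, dram)[0])
```

The further down the chain a request travels, the more expensive it is, which is why occupancy of the shared L3 matters for the monitoring and allocation features that follow.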
4. 4
Intel® Resource Director Technology (Intel® RDT)
Cache Monitoring Technology (CMT)
• Identify misbehaving applications and reschedule according to priority
• Cache occupancy reported on a per Resource Monitoring ID (RMID) basis – advanced telemetry
Cache Allocation Technology (CAT)
• Last Level Cache partitioning mechanism enabling separation and prioritization of apps or VMs
• Misbehaving threads can be isolated to increase determinism
Memory Bandwidth Monitoring (MBM)
• Monitors memory bandwidth consumption on a per thread/core/app basis
• Shares common RMID architecture – telemetry
• Provides insight into second-order shared resource contention
[Figure: for each technology, apps running on cores that share the Last Level Cache and DRAM]
5. 5
Key Concepts: Resource Monitoring IDs (RMIDs)
§ Threads/apps/VMs are grouped into Resource Monitoring IDs (RMIDs)
§ Any thread, app, VM or a combination can be monitored with any RMID
§ Specify the RMID for a thread via the per-core IA32_PQR_ASSOC (“PQR”) MSR
Associate threads into RMIDs.
Hardware tracks resource utilization per RMID.
SW retrieves monitoring data periodically via event IDs for CMT, MBM and future features.
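Because software retrieves the data periodically, it has to turn successive counter readings into a rate. A minimal Python sketch of that step follows. The 24-bit counter width and the scaling factor are assumptions for illustration; on real hardware they should come from the platform's enumeration rather than be hard-coded, and `bandwidth_mb_per_s` is a hypothetical helper, not part of any library.

```python
def bandwidth_mb_per_s(prev, curr, interval_s, scale=1, counter_bits=24):
    """Estimate bandwidth from two raw counter samples.

    prev, curr   -- raw counter readings taken interval_s seconds apart
    scale        -- bytes represented by one counter unit (assumed value)
    counter_bits -- counter width in bits (assumed value)
    """
    modulus = 1 << counter_bits
    # Modular subtraction tolerates a single counter wrap between samples.
    delta = (curr - prev) % modulus
    return delta * scale / interval_s / 1e6


# Hypothetical samples: 1000 counter units in one second, 64 KiB per unit.
print(bandwidth_mb_per_s(100, 1100, 1.0, scale=65536))
```

If the sampling interval is long enough for the counter to wrap more than once, the estimate is too low, so the polling period has to be chosen with the counter width in mind.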
6. 6
Key Concepts: Classes of Service (CLOS)
§ Threads/apps/VMs are grouped into Classes of Service (CLOS) for resource allocation
§ Resource usage of any thread, app, VM or a combination is controlled with a CLOS
§ Specify the CLOS for a thread via the per-core IA32_PQR_ASSOC (“PQR”) MSR
§ Configure resource guidelines per CLOS.
§ Associate threads into CLOS.
§ Hardware manages resource allocation.
§ Extensible to other shared resources
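Both the RMID and the CLOS of a thread are carried in the same PQR register. The Python sketch below packs and unpacks such a value, assuming the field layout described in the Intel Software Developer's Manual (RMID in the low bits, CLOS in the upper 32 bits). The 10-bit RMID width is an assumption to verify against the manual for a given processor, and the helper names are hypothetical.

```python
RMID_MASK = (1 << 10) - 1   # assumed RMID field: bits 9:0
CLOS_SHIFT = 32             # assumed CLOS field: bits 63:32


def pack_pqr_assoc(rmid, clos):
    """Build a 64-bit PQR value from an RMID and a CLOS."""
    if not 0 <= rmid <= RMID_MASK:
        raise ValueError("RMID out of range for the assumed field width")
    if not 0 <= clos < (1 << 32):
        raise ValueError("CLOS out of range")
    return (clos << CLOS_SHIFT) | rmid


def unpack_pqr_assoc(value):
    """Split a 64-bit PQR value back into (rmid, clos)."""
    return value & RMID_MASK, value >> CLOS_SHIFT


value = pack_pqr_assoc(rmid=5, clos=2)
print(hex(value), unpack_pqr_assoc(value))
```

The sketch only manipulates integers; actually writing the register requires privileged MSR access, which the kernel or the PQoS library performs.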
7. 7
PQoS Kernel Implementation
[Figure: software stack for Intel® RDT support]
User space: threads; user interface through resctrl fs (/sys/fs/resctrl) for allocation configuration and perf for reading monitored data; standalone PQoS library using the MSR driver
Kernel space: kernel QoS support for cache allocation and for cache and memory bandwidth monitoring; configures the bitmask per CLOS, sets the CLOS/RMID for a thread during context switch, reads event counters
Hardware: Intel Xeon with Intel® RDT support; shared L3 cache
8. 8
PQoS Library and Utility
PQoS static library (Intel IP, BSD license, 01.org, GitHub):
- Provides applications with a simple C API
- Requires the C and pthreads libraries (the GNU C library on Linux implements both)
- Uses the MSR and CPUID Linux kernel drivers through the standard file I/O API
PQoS utility:
- Links to the PQoS static library
- Simple, easy-to-use command line interface
- Enables customers to evaluate CMT, MBM, CAT and CDP
[Figure: the software package (PQoS utility and PQoS library) runs in Linux user space and uses file I/O to reach the standard Linux kernel modules (MSR driver, CPUID driver) in Linux kernel space]
9. 9
Options
#1 PQoS integration through PQoS library linking
#2 PQoS library development in an application
#3 Using the perf system call and resctrl fs for scheduler-based RDT support
10. 10
Download & Installation
• Download v0.1.5 & unpack
wget https://github.com/01org/intel-cmt-cat/archive/v0.1.5.tar.gz
tar xzf v0.1.5.tar.gz
note: tip of the master branch is available as zip here at
https://github.com/01org/intel-cmt-cat/archive/master.zip
• Compile & Install
cd intel-cmt-cat-0.1.5
make
sudo make install
• Uninstall with “sudo make uninstall”
6/20/17
11. 11
Other Download & Installation Options
• Ubuntu / Debian based
sudo apt-get install intel-cmt-cat
• Fedora / RedHat based
sudo yum install intel-cmt-cat
sudo dnf install intel-cmt-cat
Note:
OS packages are typically a bit behind the GitHub code.
For the latest features, refer to github.com/01org/intel-cmt-cat
12. 12
Common Installation Problems
• “/usr/local/bin” not in the PATH
- Update the profile or sudoers file with the “/usr/local/bin” path
• “/usr/local/lib” not in the dynamic linker (LD) search path
- Update the LD configuration to include the “/usr/local/lib” path
echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/libpqos.conf
sudo ldconfig
13. 13
Package Details
• ‘libpqos’ shared library
  • Provides APIs to:
    • detect and enumerate Intel® RDT features on the platform
    • monitor resources on a hardware thread basis
    • manage resources on a hardware thread basis
• ‘pqos’ tool
  • Detects and shows the Intel® RDT configuration
  • Monitors resources
  • Manages resources
• ‘rdtset’ tool
  • Aims to simplify Intel® RDT resource management
  • Like ‘taskset’, pins the application to cores
  • Then configures classes to satisfy command line requirements
14. 14
Package Details
• Other bits and pieces in the package
  • Perl shim for the library
  • C example code
  • Perl example code
  • Net-SNMP agent providing SNMP access to CAT, CDP and CMT
• For the FAQ and other usage examples, please have a look at the project wiki:
https://github.com/01org/intel-cmt-cat/wiki
15. 15
Bitmask/Capacity Calculation
• Total cache size: 55 MB
• Number of ways: 20
Calculation:
Capacity per bitmask bit (one cache way) = total cache size / number of ways
For our lab systems:
55 MB / 20 = 2.75 MB
Example:
Mask: 0x00001 means 2.75 MB (one cache way)
Mask: 0x00003 means 5.5 MB (two cache ways)
Mask: 0x00007 means 8.25 MB (three cache ways)
Mask: 0x0000F means 11 MB (four cache ways)
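The same arithmetic can be written as a short Python sketch, using the lab-system figures from this slide (55 MB, 20 ways) as defaults. The helper names are hypothetical. The contiguity check reflects the requirement, as far as I know from Intel's CAT documentation for this processor generation, that the set bits of a mask form one unbroken run; treat it as an assumption and confirm it for your platform.

```python
def mask_capacity_mb(mask, total_cache_mb=55.0, num_ways=20):
    """Cache capacity covered by a way bitmask: set bits x way size."""
    way_mb = total_cache_mb / num_ways
    return bin(mask).count("1") * way_mb


def is_contiguous(mask):
    """True if the set bits of `mask` form a single unbroken run."""
    if mask == 0:
        return False
    # Drop trailing zero bits, then test that what remains is 0b11...1.
    stripped = mask >> ((mask & -mask).bit_length() - 1)
    return (stripped & (stripped + 1)) == 0


for mask in (0x1, 0x3, 0x7, 0xF):
    print(hex(mask), mask_capacity_mb(mask), "MB", is_contiguous(mask))
```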
17. 17
RDT enumeration
-bash-4.3$ sudo pqos -s -v
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
INFO: CACHE: type 1, level 1, max id sharing this cache 2 (1 bits)
INFO: CACHE: type 2, level 1, max id sharing this cache 2 (1 bits)
INFO: CACHE: type 3, level 2, max id sharing this cache 2 (1 bits)
INFO: CACHE: type 3, level 3, max id sharing this cache 64 (6 bits)
INFO: Monitoring capability detected
INFO: CPUID.0x7.0: L3 CAT supported
INFO: CDP is disabled
INFO: L3 CAT details: CDP support=1, CDP on=0, #COS=16, #ways=20, ways contention bit-mask 0xc0000
INFO: L3 CAT details: cache size 57671680 bytes, way size 2883584 bytes
INFO: L3CA capability detected
INFO: CPUID 0x10.0: L2 CAT not supported!
INFO: L2CA capability not detected
INFO: CPUID 0x10.0: MBA not supported!
INFO: MBA capability not detected
INFO: resctrl not detected. Kernel version 4.10 or higher required
INFO: OS support for CMT detected
INFO: OS support for L3 CAT not detected
...
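The cache and way sizes in this output match the 55 MB / 20 ways lab system used in the bitmask calculation. A small Python sketch that extracts them from the log line is shown below. It assumes the exact line format printed above, which may differ between pqos versions, and `parse_l3_details` is a hypothetical helper.

```python
import re

# Line copied from the enumeration output on this slide.
LINE = ("INFO: L3 CAT details: cache size 57671680 bytes, "
        "way size 2883584 bytes")


def parse_l3_details(line):
    """Extract cache size, way size and the derived way count from the line."""
    match = re.search(r"cache size (\d+) bytes, way size (\d+) bytes", line)
    if match is None:
        raise ValueError("not an L3 CAT size line")
    cache, way = int(match.group(1)), int(match.group(2))
    return {"cache_bytes": cache, "way_bytes": way, "ways": cache // way}


info = parse_l3_details(LINE)
mib = 2 ** 20
print(info["ways"], info["cache_bytes"] / mib, info["way_bytes"] / mib)
```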
19. 19
Monitoring (CMT, MBM)
# monitor all cores and all events
-bash-4.3$ sudo pqos
# monitor cores 0 to 11 and all events
-bash-4.3$ sudo pqos -m all:0-11
# monitor LLC occupancy on cores 0, 1, 4 and 6, local memory
# bandwidth on cores 8 to 11 and remote memory bandwidth on cores 12 to 14
-bash-4.3$ sudo pqos -m "llc:0,1,4,6" -m "mbl:8-11" -m "mbr:12-14"
# reset monitoring infrastructure; reclaims in-use RMIDs
-bash-4.3$ sudo pqos -r
Note: Use Ctrl-C to stop monitoring
20. 20
Monitoring
# monitor groups of cores together (aggregate statistics):
# cores 0 to 7 – group 1
# cores 8 to 11 – group 2
# cores 12 to 15 – group 3
# groups can represent applications or VMs
-bash-4.3$ sudo pqos -m "all:[0-7][8-11][12-15]"
# cores 0 to 11 – group 1 [all events]
# cores 12 to 14 – group 2 [LLC occupancy]
# cores 15 to 17 and 20 – group 3 [local memory BW]
# groups can represent applications or VMs
-bash-4.3$ sudo pqos -m "all:[0-11];llc:[12,13,14];mbl:[15-17,20]"
Note: Use Ctrl-C to stop monitoring
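To check which cores a group such as `15-17,20` actually covers, the list can be expanded in a few lines of Python. This is an illustrative sketch only: `expand_cores` is a hypothetical helper and is not the parser used by the pqos tool.

```python
def expand_cores(spec):
    """Expand a core list such as "15-17,20" into [15, 16, 17, 20]."""
    cores = []
    for part in spec.split(","):
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            # Accept ranges written in either order.
            low, high = min(first, last), max(first, last)
            cores.extend(range(low, high + 1))
        else:
            cores.append(int(part))
    return cores


print(expand_cores("0-11"))
print(expand_cores("12,13,14"))
print(expand_cores("15-17,20"))
```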
21. 21
Allocation LLC (Define COS)
# All sockets:
# - set COS1 to 4 ways
# - set COS2 to 8 ways
-bash-4.3$ sudo pqos -e "llc:1=0xf;llc:2=0xff0;"
# Set COS1 to 4 ways on socket 0
# Set COS1 to 8 ways on socket 1
-bash-4.3$ sudo pqos -e "llc@0:1=0xf;llc@1:1=0xff0;"
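The masks in these commands are a run of set bits for the number of ways, shifted to the first way used. A Python sketch of how they could be derived follows. Both helper names are hypothetical, and the formatted string simply mirrors the all-sockets argument syntax shown in the first command above.

```python
def ways_mask(num_ways, first_way=0):
    """Contiguous bitmask covering `num_ways` ways, starting at `first_way`."""
    return ((1 << num_ways) - 1) << first_way


def allocation_arg(cos_masks):
    """Format {cos: mask} as an "llc:COS=MASK;" argument string."""
    return "".join(
        f"llc:{cos}={mask:#x};" for cos, mask in sorted(cos_masks.items())
    )


# COS1: 4 ways from way 0; COS2: 8 ways from way 4 (no overlap with COS1).
masks = {1: ways_mask(4), 2: ways_mask(8, first_way=4)}
print(allocation_arg(masks))
```

Placing COS2 at way 4 keeps the two classes in disjoint ways, which is what isolates them from each other.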
22. 22
Allocation LLC (Associate Core with COS)
# associate core 1 with COS1
-bash-4.3$ sudo pqos -a "llc:1=1"
# associate:
# - cores 0 to 2 with COS1
# - cores 3 to 5 with COS2
# - cores 6 to 8 with COS3
-bash-4.3$ sudo pqos -a "llc:1=0-2;llc:2=3,4,5;llc:3=8-6"
# use rdtset to run sleep on core 2 with access to 2 LLC ways
-bash-4.3$ sudo rdtset -t "l3=0x3;cpu=2" -c 2 sleep 60
23. 23
Allocation LLC (CDP & RESET)
# reset & keep current CDP config. Sets all COS to default (fill into all
# ways) and associates all cores with COS 0.
-bash-4.3$ sudo pqos -R
# reset & turn on CDP
-bash-4.3$ sudo pqos -R l3cdp-on
# use current L3 CDP settings and set COS 1 code and data bitmasks
-bash-4.3$ sudo pqos -e "llc:1d=0xfff;llc:1c=0xfff00;"
# reset & turn off CDP
-bash-4.3$ sudo pqos -R l3cdp-off
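With CDP, a class has separate data and code masks, and the two may overlap. The Python sketch below summarizes the masks from the example command. `cdp_summary` is a hypothetical helper, and the 2.75 MB way size is the lab-system value from the bitmask calculation slide, not a universal constant.

```python
def cdp_summary(data_mask, code_mask, way_mb=2.75):
    """Summarize how a COS's data and code bitmasks split the cache ways."""
    def count(mask):
        return bin(mask).count("1")

    shared = data_mask & code_mask   # ways usable by both code and data
    return {
        "data_ways": count(data_mask),
        "code_ways": count(code_mask),
        "shared_ways": count(shared),
        "shared_mb": count(shared) * way_mb,
    }


# Masks from the example: data = 0xfff, code = 0xfff00.
print(cdp_summary(0xfff, 0xfff00))
```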
24. 24
Orchestration Proposal for RDT in the Datacenter: OpenStack Integration (CMT Example)
[Figure: OpenStack components Ceilometer (data store and per-host agents), AODH (alarm engine), Nova (scheduler, orchestrator) and Congress (policy engine); Host-A and Host-C with CMT, Host-B without CMT; VMs reporting cache usage, CPU usage and other usage metrics; collectd, PQoS library, perf and MSR (if Linux sched integrated) on the host; cache allocations]
Vision: Per-node resource controls directed by a datacenter-level orchestration framework
29. 29
Intel® Resource Director Technology (Intel® RDT) Collateral
• Intel Resource Director Technology landing page
  • http://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html
  • Includes links to blogs and many other resources
• Intel Software Developer’s Manual
  • http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
    (Vol. 3B, Chapters 17.15 and 17.16, covers CMT, CAT, MBM and CDP)
• NPG Product Literature
  • http://www.intel.com/content/www/us/en/communications/nfv-packet-processing-brief.html
• Academic Research Papers
  • Numerous prior works are available from multiple researchers and organizations
30. 30
Intel® Resource Director Technology Collateral – Software Enabling
• Software Enabling: non-operating-system-integrated options
  • Standalone tool to monitor and control allocation functionality: https://01.org/packet-processing/cache-monitoring-allocation-technology
• Software Enabling: operating system scheduler-enabled options
  • Linux* perf patches for monitoring (mainstream since kernel v4.1, v4.6): https://lkml.kernel.org/r/1422038748-21397-1-git-send-email-matt@codeblueprint.co.uk
  • Linux resctrl for allocation support: v4.10
• Introduction to CMT: https://software.intel.com/en-us/blogs/2014/06/18/benefit-of-cache-monitoring
• Discussion of RMIDs and CMT Software Interfaces: https://software.intel.com/en-us/blogs/2014/12/11/intel-s-cache-monitoring-technology-software-visible-interfaces
• Use Models and Example Data: https://software.intel.com/en-us/blogs/2014/12/11/intels-cache-monitoring-technology-use-models-and-data
• Software Support and Tools: https://software.intel.com/en-us/blogs/2014/12/11/intels-cache-monitoring-technology-software-support-and-tools
31. 31
Perf CMT and MBM Implementation
• RMID recycling
• Cache monitoring and memory bandwidth monitoring on a per-PID/TID basis
# tools/perf/perf list | grep intel_cqm
intel_cqm/llc_occupancy/ [Kernel PMU event]
intel_cqm/local_bytes/ [Kernel PMU event]
intel_cqm/total_bytes/ [Kernel PMU event]
Command:
# tools/perf/perf stat -e intel_cqm/llc_occupancy/ -e intel_cqm/local_bytes/ -e intel_cqm/total_bytes/ -p <pid>
• CMT and MBM support in applications to track PID/TID.
• Libvirt enables the CMT event for monitoring VMs using the perf_event_open system call:
int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags);
(Please refer to lib/perf.c for the detailed implementation)
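For scripting, the perf command shown on this slide can be assembled programmatically. The Python sketch below only builds the argument list and does not run perf. `perf_stat_cmd` is a hypothetical helper, the default perf path is the one used on the slide, and the `intel_cqm` events are available only on kernels that expose that PMU.

```python
def perf_stat_cmd(pid,
                  events=("llc_occupancy", "local_bytes", "total_bytes"),
                  perf="tools/perf/perf"):
    """Build the argv list for `perf stat` on the intel_cqm events of one PID."""
    cmd = [perf, "stat"]
    for event in events:
        cmd += ["-e", f"intel_cqm/{event}/"]
    cmd += ["-p", str(pid)]
    return cmd


# Example with a made-up PID; pass the list to subprocess.run() to execute it.
print(" ".join(perf_stat_cmd(1234)))
```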
32. 32
Libvirt CMT support
Libvirt enabling commands:
• To enable/disable the CMT perf event for a domain:
$ virsh perf <domain> --enable <event_name>
For example: $ virsh perf guest01 --enable cmt
• To get the perf events list:
$ virsh perf <domain>
• To print statistics for the perf events:
$ virsh domstats <domain>
Patches available: https://www.redhat.com/archives/libvir-list/2016-January/msg01264.html