PCP Introduction1
GETTING STARTED WITH
PERFORMANCE CO-PILOT
Paul V. Novarese
pvn@redhat.com
Strategic Customer Engagement
26 March 2015
Burbank, CA
AGENDA
● Overview
● Exploring PCP
● Latest Developments
● Demo
OVERVIEW
What is PCP?
● Open source toolkit
● System-level analysis
● Live and historical
● Extensible (monitors, collectors)
● Distributed
● R&D project, started ~20 years ago!
ARCHITECTURE (System-Level)
ARCHITECTURE (Datacenter-Level)
METRICS
● pminfo --desc -tT --fetch disk.dev.read
disk.dev.read [per-disk read operations]
Data Type: 32-bit unsigned int InDom: 60.1
Semantics: counter Units: count
Help: Cumulative number of disk read operations since boot time
Values:
inst [0 or "sda"] value 3382299
inst [1 or "sdb"] value 178421
● pmprobe -v mem.util.shmem xfs.log.niclogs nvidia.memused
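Because disk.dev.read has counter semantics, the raw values above are cumulative since boot and only become useful as a rate between two samples. The following is a minimal Python sketch of that conversion, assuming the `inst [...] value N` line format shown above. The second sample and the 10-second interval are made-up illustration values, and `parse_instances` and `rates` are hypothetical helper names, not part of PCP.

```python
import re

# Matches lines such as: inst [0 or "sda"] value 3382299
INST_RE = re.compile(r'inst \[(\d+) or "([^"]+)"\] value (\d+)')

def parse_instances(text):
    """Return {instance name: value} from pminfo --fetch style output."""
    return {name: int(value) for _, name, value in INST_RE.findall(text)}

def rates(first, second, interval_seconds):
    """Per-second rate of a counter metric between two samples."""
    return {name: (second[name] - first[name]) / interval_seconds
            for name in first if name in second}

# First sample: the values shown on the slide.
sample_1 = parse_instances('''
    inst [0 or "sda"] value 3382299
    inst [1 or "sdb"] value 178421
''')
# Hypothetical second sample, assumed to be taken 10 seconds later.
sample_2 = parse_instances('''
    inst [0 or "sda"] value 3382799
    inst [1 or "sdb"] value 178421
''')

print(rates(sample_1, sample_2, 10.0))
```

The sketch ignores counter wrap-around; tools such as pmval perform this rate conversion for counter metrics themselves.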
METRICS NAMESPACES
COLLECTOR TOOLKIT OVERVIEW
● pmcd, pmproxy, pmwebd
● Agents:
● Kernels (linux, mac, win, solaris, bsd, bonding, kvm,
xfs, jbd2, gfs2, gluster, zswap, dmcache, ...)
● Services (samba, elasticsearch, apache, nginx,
memcache, postfix,...)
● Databases (mysql, postgresql, sqlserver, dbping)
● Misc (cisco, shping, zimbra, mmv, ...)
● pcp(1)
CONSUMER TOOLKIT OVERVIEW
● Logging tools
● pmlogsummary, pmlogextract, pmlogger, ...
● Console tools
● pmval, pminfo, pmstat, pmdumptext, pmatop, ...
● Most tools share command line arguments
● Source (host, archive)
● Sampling (interval, count)
● Time windows, timezone
● PCPIntro(1)
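The shared command line arguments can be illustrated with a small Python helper that builds one option list and reuses it across tools. This is a sketch only: it assumes the common option letters documented in PCPIntro(1) (-h host, -a archive, -t interval, -s samples, -S start, -T end, -Z timezone), which should be checked against the installed version. `pcp_args`, the host name, and the archive path are hypothetical.

```python
def pcp_args(host=None, archive=None, interval=None, samples=None,
             start=None, end=None, timezone=None):
    """Build the option list shared by most PCP console tools."""
    if host and archive:
        raise ValueError("choose a live host or an archive, not both")
    args = []
    # Source: live host or archive
    if host:
        args += ["-h", host]
    if archive:
        args += ["-a", archive]
    # Sampling: interval and count
    if interval:
        args += ["-t", str(interval)]
    if samples is not None:
        args += ["-s", str(samples)]
    # Time window and timezone
    if start:
        args += ["-S", start]
    if end:
        args += ["-T", end]
    if timezone:
        args += ["-Z", timezone]
    return args

# The same options apply to different tools (hypothetical host and archive).
live = pcp_args(host="db1.example.com", interval="5sec", samples=12)
replay = pcp_args(archive="/tmp/example-archive", timezone="UTC")
print(["pmval"] + live + ["kernel.all.load"])
print(["pmstat"] + replay)
```

Either list can be handed to `subprocess.run` on a machine where PCP is installed.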
CLIENT TOOLKIT: pmchart
● Arbitrary charts
● Load / Save views
● VCR-style playback
CLIENT TOOLKIT: pmie
● “Inference Engine”
● Rules → Actions
ruleset
    kernel.all.load #'1 minute' > 10 * hinv.ncpu ->
        print "extreme load average %v"
else kernel.all.load #'1 minute' > 2 * hinv.ncpu ->
        print "moderate load average %v"
unknown ->
        print "load average unavailable"
otherwise ->
        print "load average OK"
;
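The evaluation order of the ruleset above can be sketched in Python: rules are tried top to bottom, the first true one fires, the unknown action covers missing data, and otherwise is the fallback. This only mirrors the decision logic and is not how pmie is implemented. `classify_load` is a hypothetical name and the inputs are made-up values.

```python
def classify_load(load_1min, ncpu):
    """Mirror the ruleset: first true rule wins, then unknown, then otherwise."""
    # Both rules depend on the same two metrics, so if either value is
    # unavailable, neither rule can be evaluated (the "unknown" case).
    if load_1min is None or ncpu is None:
        return "load average unavailable"
    if load_1min > 10 * ncpu:
        return f"extreme load average {load_1min}"
    if load_1min > 2 * ncpu:
        return f"moderate load average {load_1min}"
    # No rule was true (the "otherwise" case).
    return "load average OK"

# Made-up samples for a 4-CPU host.
for load in (50, 10, 1, None):
    print(classify_load(load, 4))
```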
CONTAINER AWARENESS
GOALS
● Zero installation inside containers required
● Allow targeting of individual containers
● Simplify your life (dev_t auto-mapping)
● Data reduction (proc.*, cgroup.*)
KERNEL INSTRUMENTATION
● cgroup accounting
● [subsys].stat files below /sys/fs/cgroup
● blkio
● IOPs/bytes, service/wait time – aggregate/per-dev
● Split up by read/write, sync/async
● cpuacct
● Processor use per-cgroup - aggregate/per-CPU
● memory
● mapped anon pages, page cache, writeback, swap,
active/inactive LRU state
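The flat `key value` layout of files such as memory.stat is straightforward to read. Below is a minimal Python sketch, assuming that layout. The sample text uses made-up numbers, `parse_stat` is a hypothetical helper, and the blkio files use a different multi-column layout that this sketch does not cover.

```python
def parse_stat(text):
    """Parse 'key value' lines as found in flat cgroup *.stat files."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats

# Illustrative content only; real values come from a file below /sys/fs/cgroup.
SAMPLE_MEMORY_STAT = """\
cache 1048576
rss 4194304
mapped_file 262144
swap 0
writeback 0
inactive_anon 1048576
active_anon 3145728
inactive_file 524288
active_file 524288
"""

stats = parse_stat(SAMPLE_MEMORY_STAT)
anon_bytes = stats["inactive_anon"] + stats["active_anon"]
print(anon_bytes)
```

On a live system the same function could be fed the contents of a cgroup's memory.stat file; the exact path depends on the cgroup hierarchy in use.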
NAMESPACES
● E.g.: cat /proc/net/dev
● Contents differ inside vs outside a container
● Processes (e.g. cat) in containers run in different
network, ipc, process, uts, mount namespaces
● Namespaces are inherited across fork/clone
● Processes within a container share common view
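To make the /proc/net/dev example concrete, here is a minimal Python sketch that parses two views of that file and compares them. Both views are made-up sample text standing in for what a process outside and inside a container might read. `parse_net_dev` is a hypothetical helper, and the interface names and counters are illustrative.

```python
def parse_net_dev(text):
    """Return {interface: {"rx_bytes": int, "tx_bytes": int}} from /proc/net/dev text."""
    interfaces = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        interfaces[name.strip()] = {
            "rx_bytes": int(fields[0]),     # first receive column
            "tx_bytes": int(fields[8]),     # first transmit column
        }
    return interfaces

HEADER = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast|"
    "bytes packets errs drop fifo colls carrier compressed\n"
)
# Made-up view from the host's network namespace.
HOST_VIEW = HEADER + (
    "    lo: 1000 10 0 0 0 0 0 0 1000 10 0 0 0 0 0 0\n"
    "  eth0: 98765 120 0 0 0 0 0 0 54321 80 0 0 0 0 0 0\n"
    "docker0: 4000 40 0 0 0 0 0 0 6000 60 0 0 0 0 0 0\n"
)
# Made-up view from inside a container's network namespace.
CONTAINER_VIEW = HEADER + (
    "    lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n"
    "  eth0: 6000 60 0 0 0 0 0 0 4000 40 0 0 0 0 0 0\n"
)

host = parse_net_dev(HOST_VIEW)
container = parse_net_dev(CONTAINER_VIEW)
print(sorted(set(host) - set(container)))   # interfaces visible only on the host
print(host["eth0"]["rx_bytes"], container["eth0"]["rx_bytes"])
```

The same file path yields different interfaces and counters depending on the network namespace of the reading process, which is why a collector outside the container has to target the container's namespace explicitly.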
BONUS MATERIAL
CURRENT DEVELOPMENT
● Browser interfaces / APIs
● JSON APIs
● http://grafana.org
● Python APIs
● Clients and agents
● http://2014.pycon-au.org
● Containers, discovery
● New metrics (cgroups, libpfm, GPUs,...)
RESOURCES
● http://pcp.io/
● http://developerblog.redhat.com/
● Supported in RHEL 7, RHEL 6.6
● Tech Brief: Getting Started w/ PCP (similar to demo)
● https://access.redhat.com/articles/1216303
● RHEL 7 cheatsheets for RHEL 6 admins
● https://access.redhat.com/articles/1190233
● https://access.redhat.com/system/files/private_discussion_files/rhel_5_6_7_cheatsheet_27x36_1014_jcs_web.pdf
QUICK COMPARISON
https://en.wikipedia.org/wiki/Comparison_of_network_monitoring_systems
Name                           | License    | Data Storage Method                    | Trending | Trend Prediction | Auto Discovery | SNMP       | Distributed Monitoring | Platform                      | IPv6 | Access Control | WebApp
HP Network Node Manager (NNMi) | Commercial | PostgreSQL, Oracle                     | Yes      | Yes              | Yes            | Yes        | Yes                    | Unknown                       | Yes  | Yes            | Full Control
IBM Tivoli Network Manager     | Commercial | MySQL, Oracle, DB2                     | Yes      | Yes              | Yes            | Yes        | Yes                    | Unknown                       | Yes  | Yes            | Yes
Nagios                         | GPL        | Flat file, SQL                         | Yes      | No               | Via plugin     | Via plugin | Yes                    | C, PHP                        | Yes  | Yes            | Yes
Performance Co-Pilot           | GPL, LGPL  | Flat file                              | Yes      | No               | Yes            | Yes        | Yes                    | C, Perl, Python, POSIX, MinGW | Yes  | Yes            | Viewing
Zabbix                         | GPL        | Oracle, MySQL, PostgreSQL, DB2, SQLite | Yes      | No               | Yes            | Yes        | Yes                    | C, PHP                        | Yes  | Yes            | Full Control