The power of linux advanced tracer [POUG18]

Mahmoud Hatem, Senior Oracle DBA [OCM - ACE Associate]
THE POWER OF LINUX
ADVANCED TRACER
HATEM MAHMOUD
HTTPS://MAHMOUDHATEM.WORDPRESS.COM
HIGH FIVE POUG
2
WHO AM I
Oracle DBA
Oracle experience: 7 years
Located in TUNISIA
Oracle Certified Master
Oracle geek
https://mahmoudhatem.wordpress.com
3
TAKEAWAYS
• Better understanding of the Linux tracing landscape
• Getting an idea of what can be done.
As someone else said:
“Knowing what can be done is more important than knowing how to do it - you can
always google that”
4
AGENDA
1. Linux tracing landscape
2. Static tracing
3. Dynamic tracing
4. Monkey patching
5. Deeper look at CPU utilization
5
LINUX TRACING
LANDSCAPE
6
LINUX TRACING TIMELINE
http://www.brendangregg.com/Slides/SCALE2017_perf_analysis_eBPF.pdf
7
LINUX TRACING LANDSCAPE
eBPF
kprobe
uprobe
tracepoints
software events
hardware events
systemtap
perf_events
bcc/ebpf
USDT
Ftrace
8
LINUX TRACING SYSTEMS
Front-end tools :
• systemtap, perf, bcc, pmu-tools, etc.
Mechanisms for extracting data :
• stap module, eBPF, perf_events (perf_event_open syscall), ftrace (/sys/kernel/debug/tracing), etc.
Event sources :
• kprobes and uprobes (dynamic tracing)
• tracepoints, software events and USDT (static tracing)
• PMCs (hardware counters)
• etc.
Breakdown as suggested by Brendan Gregg and Julia Evans
https://jvns.ca/blog/2017/07/05/linux-tracing-systems/
12
STATIC TRACING
13
STATIC TRACING
Tracepoints :
• Predefined kernel trace probes
• Inserted by kernel developers at important locations in
the code (system calls, disk I/O, etc.)
User Statically-Defined Tracing (USDT) :
• Predefined application trace probes
• Inserted by application developers at important
locations in the code
Software Events :
• Kernel counters (CPU migrations, minor faults, major
faults, etc.)
http://www.brendangregg.com/perf.html
14
KERNEL STATIC TRACEPOINT
15
BCC/TOOLS : BIOLATENCY SUMMARIZE BLOCK DEVICE I/O
LATENCY AS A HISTOGRAM
https://github.com/iovisor/bcc/blob/master/tools/biolatency_example.txt
• Traditional tools such as iostat and
sar show average latency, which
can be misleading (it hides latency
outliers)
• Need to study the full distribution
• biolatency instruments kernel
block I/O functions (blk_start_request,
blk_account_io_completion, etc.)
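To illustrate why an average can hide outliers, here is a minimal sketch in Python. It is not biolatency itself: the sample latencies are invented, and the power-of-two bucketing only mimics the style of histogram that the bcc tools print.

```python
from collections import Counter


def log2_histogram(latencies_us):
    """Bucket latencies into power-of-two ranges [2**k, 2**(k+1) - 1]."""
    buckets = Counter()
    for value in latencies_us:
        # bit_length() - 1 is floor(log2(value)); values below 1 go to bucket 0.
        index = max(int(value), 1).bit_length() - 1
        buckets[index] += 1
    return dict(sorted(buckets.items()))


# Hypothetical sample: 98 fast I/Os and 2 slow outliers (microseconds).
samples = [100] * 98 + [50_000] * 2
average = sum(samples) / len(samples)
histogram = log2_histogram(samples)

print(f"average: {average:.0f} us")
for index, count in histogram.items():
    low, high = 2 ** index, 2 ** (index + 1) - 1
    print(f"{low:>8} -> {high:<8} : {count}")
```

The average lands between the two groups and describes neither of them, while the histogram shows both the fast majority and the slow tail.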
16
BCC/TOOLS : EXT4SLOWER TRACE SLOW EXT4 OPERATIONS.
https://github.com/iovisor/bcc/blob/master/tools/ext4slower_example.txt
• A better measure of the latency
suffered by applications reading
from the file system.
• The measured latency spans :
• block device I/O (disk I/O)
• file system CPU cycles
• file system locks
• run queue latency (a great CPU
saturation metric !)
• etc.
17
BCC/TOOLS : RUNQLAT: RUN QUEUE (SCHEDULER)
LATENCY AS A HISTOGRAM
https://github.com/iovisor/bcc/blob/master/tools/runqlat_example.txt
• The best CPU saturation metrics
are measures of run queue (or
scheduler) latency.
• This is the time a task spends waiting
on a run queue for a turn on-CPU.
• Better than the run queue length
metric for estimating the
magnitude of CPU saturation !
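As a rough complement to runqlat, a per-process view of run queue wait time can be sketched from /proc/<pid>/schedstat. This assumes the kernel exposes scheduler statistics there as three fields: time on CPU, time waiting on a run queue (both in nanoseconds on current kernels, as far as I know) and the number of timeslices. The sample strings below are invented. Unlike runqlat, this yields only an average, not a distribution.

```python
def parse_schedstat(text):
    """Parse the three whitespace-separated fields of /proc/<pid>/schedstat."""
    on_cpu_ns, run_delay_ns, timeslices = (int(field) for field in text.split())
    return {
        "on_cpu_ns": on_cpu_ns,
        "run_delay_ns": run_delay_ns,
        "timeslices": timeslices,
    }


def average_run_queue_latency_us(before, after):
    """Average wait per timeslice between two snapshots, in microseconds."""
    slices = after["timeslices"] - before["timeslices"]
    if slices <= 0:
        return 0.0
    waited_ns = after["run_delay_ns"] - before["run_delay_ns"]
    return waited_ns / slices / 1000.0


# Hypothetical snapshots taken some seconds apart for the same process.
before = parse_schedstat("1000000 200000 10")
after = parse_schedstat("5000000 1200000 30")
avg_us = average_run_queue_latency_us(before, after)
print(f"average run queue latency: {avg_us:.1f} us per timeslice")
```

On a live system the two snapshots would come from reading the file twice, for example with open(f"/proc/{pid}/schedstat").read().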
19
https://github.com/brendangregg/systemtap-lwtools
https://github.com/iovisor/bcc
https://github.com/brendangregg/perf-tools
20
SYSTEMTAP : SCHEDTIMES_WSI.STP : TRACK TIME
PROCESSES SPEND IN VARIOUS STATES
https://mahmoudhatem.wordpress.com/2017/02/06/extending-systemtap-scripts-with-oracle-session-info/
• Bring application context to your monitoring tools !
21
USERSPACE STATIC TRACEPOINT/USDT
22
BCC/TOOLS : DBSLOWER: TRACE MYSQL/POSTGRESQL
QUERIES SLOWER THAN A THRESHOLD
https://github.com/iovisor/bcc/blob/master/tools/dbslower_example.txt
• dbslower is based on USDT probes
(requires MySQL or PostgreSQL to be
built with USDT (DTrace) support)
23
ORACLE database doesn’t have USDT support
24
DYNAMIC TRACING
25
DYNAMIC TRACING
• Dynamically instrumenting (creating events
in) any software location.
• kprobes: kernel dynamic tracing
• uprobes: user-level dynamic tracing
• No need to modify the probed process's
binaries or restart the program.
26
DYNAMIC TRACING (UPROBE)
• Function prologue of “kskthewt” (called at the end of an Oracle wait event) before inserting
the probe point :
• After inserting a probe point at the function entry : the original opcode is replaced with
int3 (the x86 breakpoint instruction, a software interrupt).
https://mahmoudhatem.wordpress.com/2017/03/21/uprobes-issue-with-oracle-12c/
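The byte-level effect of inserting a probe can be sketched as a small simulation. This is only an illustration of the idea, not how the kernel implements uprobes (which, as far as I understand, also keeps a copy of the original instruction and executes it out of line). The prologue bytes below are a generic x86-64 example, not taken from the Oracle binary.

```python
INT3 = 0xCC  # x86 one-byte breakpoint opcode


def insert_breakpoint(code, offset=0):
    """Return a patched copy of code plus the original byte that was overwritten."""
    saved = code[offset]
    patched = bytearray(code)
    patched[offset] = INT3
    return bytes(patched), saved


def remove_breakpoint(code, offset, saved):
    """Restore the original byte at offset."""
    restored = bytearray(code)
    restored[offset] = saved
    return bytes(restored)


# Illustrative prologue: push %rbp (0x55); mov %rsp,%rbp (0x48 0x89 0xe5).
prologue = bytes.fromhex("554889e5")
patched, saved = insert_breakpoint(prologue)
restored = remove_breakpoint(patched, 0, saved)

print("before :", prologue.hex())
print("probed :", patched.hex())
print("removed:", restored.hex())
```

Only the first byte changes, which is why the probed process needs neither a rebuild nor a restart.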
27
TRACING ORACLE KERNEL FUNCTIONS
kcbgtcr
kcbgcur
kcbzib
kskthbwt
kskthewt
kcbzgb
kcbzvb
opiexe
delrow
qerdlFetch
kpoal8
28
SYSTEMTAP : TRACING ORACLE WAIT EVENTS
https://externaltable.blogspot.com/2014/09/systemtap-into-oracle-for-fun-and-profit.html
29
SYSTEMTAP : AGGREGATIONS AND FILTERING OF
WAIT EVENT DATA
https://externaltable.blogspot.com/2014/09/systemtap-into-oracle-for-fun-and-profit.html
Collect and display microsecond-precision histograms for all Oracle versions (note: 12.1.0.2 has V$EVENT_HISTOGRAM_MICRO)
What are this wait event and the
other I/O wait events really
measuring ?
30
SYSTEMTAP : WHAT ARE THE I/O-RELATED WAIT EVENTS
REALLY MEASURING? [TRACING LOGICAL AND PHYSICAL I/O ]
https://externaltable.blogspot.com/2014/11/life-of-oracle-io-tracing-logical-and.html
The elapsed time for the wait event
"direct path read" does not
accurately reflect I/O latency
31
TRACING BEYOND FUNCTION BOUNDARY
PROBE AT SPECIFIC ORACLE KERNEL FUNCTION OFFSET
32
SYSTEMTAP : A SIMPLE USER/PASSWORD SNIFFER
https://mahmoudhatem.wordpress.com/2018/03/23/systemtap-probe-at-specific-oracle-function-offset-bonus/
• Powerful and scary at the same time !
33
TRACING PL/SQL
34
SYSTEMTAP : TRACING PL/SQL WITH LINE NUMBER
https://mahmoudhatem.wordpress.com/2017/09/15/geeky-plsql-tracerprofiler-first-step/
35
SYSTEMTAP : TRACING PL/SQL SUBPROGRAM CALLS WITH
PARAMETERS VALUES
https://mahmoudhatem.wordpress.com/2017/11/29/tracing-pl-sql-subprogram-calls-with-parameters-values-dynamic-tracing/
36
SYSTEMTAP : FROM MEMORY REQUEST TO PL/SQL SOURCE LINE
https://mahmoudhatem.wordpress.com/2018/01/15/from-memory-request-to-pl-sql-source-line/
Based on v$process_memory_detail
37
MONKEY PATCHING
ACTIVE MANIPULATIONS OF STATE
38
SYSTEMTAP : A MINI ORACLE DB FIREWALL [LIVE PATCHING]
https://mahmoudhatem.wordpress.com/2016/04/18/systemtap-a-mini-oracle-db-firewall/
https://externaltable.blogspot.com/2016/03/systemtap-guru-mode-and-oracle-sql.html
39
SYSTEMTAP : PLAYING WITH ORACLE DB 18C ON-PREMISES BEFORE
OFFICIAL RELEASE
https://mahmoudhatem.wordpress.com/2018/03/01/playing-with-oracle-db-18c-on-premises-before-official-release/
40
DEEPER LOOK AT CPU
UTILIZATION
• Which code paths are causing high CPU usage ?
• What’s my CPU bottleneck ?
• How much are my CPUs stalled ? On what resource ?
41
CPU PROFILING
• Linux advanced tracing tools are capable of lightweight profiling of CPU usage by stack
sampling, such as :
• SystemTap
• perf
• bcc
• To quickly understand CPU usage, the collected profiling data can be visualized using
flame graphs.
http://www.brendangregg.com/flamegraphs.html
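Flame graph tooling typically consumes "folded" stacks: one line per unique stack, with frames joined by semicolons (root first) and followed by a sample count. A minimal sketch of that folding step is shown below; the function names in the samples are hypothetical placeholders, not real profiler output.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse stack samples (root-first lists of frames) into folded lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {count}" for stack, count in sorted(counts.items())]


# Hypothetical stack samples, as a sampling profiler might collect them.
samples = [
    ["main", "execute", "get_block"],
    ["main", "execute", "get_block"],
    ["main", "execute", "fetch_row"],
    ["main", "parse"],
]
folded = fold_stacks(samples)
for line in folded:
    print(line)
```

The width of each frame in the resulting flame graph is proportional to these counts, so the widest towers are the code paths that were on-CPU most often.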
42
FLAMEGRAPH
https://fr.slideshare.net/ennael/kernel-recipes-2017-using-linux-perf-at-netflix-brendan-gregg
43
EXTENDED FLAMEGRAPH : WAIT EVENTS
https://mahmoudhatem.wordpress.com/2016/09/23/perf_events-offonmixed-cpu-flamegraph-extended-with-oracle-wait-events/
https://db-blog.web.cern.ch/blog/luca-canali/2015-11-oracle-wait-events-investigated-extended-stack-profiling-and-flame-graphs
44
EXTENDED FLAMEGRAPH : PL/SQL PROGRAM AND LINE NUMBER
https://mahmoudhatem.wordpress.com/2017/09/22/geeky-plsql-tracerprofiler-another-step/
45
BUT WHAT WERE THOSE FUNCTIONS DOING WHEN
THEY WERE ON-CPU ? RUNNING OR STALLED ?
46
CPU UTILIZATION IS WRONG
http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
47
WHEN THE CPU UTILIZATION DOES NOT TELL YOU
THE UTILIZATION OF THE CPU
PERFORMANCE MONITORING COUNTERS - A BETTER WAY TO MEASURE CPU UTILIZATION
*The next sections cover Intel platforms only
48
HARDWARE EVENTS (PMC)
• PMCs instrument low-level processor activity
• Can be used to understand how efficiently a workload uses processor resources (CPU caches,
MMU, memory buses, CPU interconnects, execution units, etc.)
• PMCs :
• Core : values measured on a single core only
• Uncore : shared, socket-wide values
49
HARDWARE EVENTS
https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3b-part-2-m
• PMCs are documented in the Intel Software Developer's Manual Volume 3B: System Programming
Guide, Part 2
50
HARDWARE EVENTS
• Not all of them are listed when using perf list !
51
HIGH-LEVEL METRICS (IPC A GENERAL EFFICIENCY METRIC )
• Events can be observed and combined to create useful high-level metrics such as Instructions per
Cycle (IPC)
* Modern superscalar processors can issue multiple instructions per cycle
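As a small worked example, IPC is simply retired instructions divided by CPU cycles over the same interval, and CPI is its inverse. The counter values below are invented; in practice they would come from something like the instructions and cycles events reported by perf stat.

```python
def instructions_per_cycle(instructions, cycles):
    """IPC = retired instructions / unhalted CPU cycles."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


# Hypothetical counter values for one measurement interval.
ipc = instructions_per_cycle(instructions=1_200_000_000, cycles=2_000_000_000)
cpi = 1.0 / ipc
print(f"IPC = {ipc:.2f}, CPI = {cpi:.2f}")
```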
52
CPI FLAME GRAPH
• The color now shows what that function was doing when it was on-CPU : running or stalled
• Highest CPI : blue (slowest instructions)
• Lowest CPI : red (fastest instructions)
• Visualization of CPU efficiency by function.
https://mahmoudhatem.wordpress.com/2017/10/26/deeper-look-at-cpu-utilization-the-power-of-pmu-events/
get consistent read
53
IPC INTERPRETATION AND ACTIONABLE ITEMS
http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
• A good starting point for identifying what the CPU is really doing is IPC (Instructions per cycle)
54
WHERE ARE WE REALLY WASTING OUR PRECIOUS CPU CYCLES ?
False data sharing
Split Stores
Loads Blocked by Store Forwarding
4K Aliasing
DTLB miss
Microcode assists
Memory Bandwidth
Memory Latency
Bad speculation
Port Utilization
L1 miss
L2 miss
Vectorization
Remote DRAM
55
PMC-CLOUD-TOOLS/PMCARCH
https://github.com/brendangregg/pmc-cloud-tools
56
PMC-CLOUD-TOOLS/TLBSTAT
https://github.com/brendangregg/pmc-cloud-tools
57
PMC-CLOUD-TOOLS/CPUCACHE
https://github.com/brendangregg/pmc-cloud-tools
58
MEASURING IPC IS A GOOD STARTING POINT, BUT HOW
TO DRILL DOWN FURTHER ?
A specific microarchitecture may expose hundreds of events through its PMU !
Which events are useful for detecting the true bottleneck ?
Answering that requires in-depth knowledge of both the microarchitecture design and the PMU specifications !
“Analysis without a methodology can become a fishing expedition, where
metrics are examined ad hoc, until the issue is found –if it is at all.”
Source: Brendan D. Gregg,
http://www.brendangregg.com/methodology.html
59
TOP-DOWN MICRO-ARCHITECTURE ANALYSIS
METHOD [ TMAM ]
• Systematically finds the true bottleneck (eliminates guesswork)
• Provides a hierarchical breakdown of execution cycles (CPI breakdown)
• Avoids the steep µ-arch learning curve
• Correctly characterizes all workloads
• Frequent performance bottlenecks are organized in a hierarchical structure
https://software.intel.com/en-us/vtune-amplifier-help-tuning-applications-using-a-top-down-microarchitecture-analysis-method
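To make the top level of the hierarchy concrete, here is a sketch of the Level-1 breakdown as I understand it from the published top-down method. It assumes a 4-wide pipeline and the event names used on several Intel Core generations; both vary by microarchitecture. The counter values are invented.

```python
PIPELINE_WIDTH = 4  # assumed issue slots per cycle; differs between microarchitectures


def topdown_level1(cycles, uops_not_delivered, uops_issued,
                   uops_retired_slots, recovery_cycles):
    """Split pipeline slots into the four Level-1 top-down categories.

    Assumed event mapping (check it against your CPU's documentation):
      cycles             -> CPU_CLK_UNHALTED.THREAD
      uops_not_delivered -> IDQ_UOPS_NOT_DELIVERED.CORE
      uops_issued        -> UOPS_ISSUED.ANY
      uops_retired_slots -> UOPS_RETIRED.RETIRE_SLOTS
      recovery_cycles    -> INT_MISC.RECOVERY_CYCLES
    """
    slots = PIPELINE_WIDTH * cycles
    frontend_bound = uops_not_delivered / slots
    bad_speculation = (uops_issued - uops_retired_slots
                       + PIPELINE_WIDTH * recovery_cycles) / slots
    retiring = uops_retired_slots / slots
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


# Hypothetical counter values for one interval.
breakdown = topdown_level1(cycles=1000, uops_not_delivered=800,
                           uops_issued=1700, uops_retired_slots=1600,
                           recovery_cycles=25)
for name, share in breakdown.items():
    print(f"{name:16s} {share:6.1%}")
```

The four shares sum to 1 by construction, so the largest one indicates which branch of the hierarchy to drill into next.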
60
THE TMAM HIERARCHY
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
61
PERF
http://cs.haifa.ac.il/~yosi/PARC/yasin.pdf
Linux perf has supported TopDown Level-1 metrics since Linux kernel 4.8
62
PMU-TOOLS : TOPLEV.PY
https://github.com/andikleen/pmu-tools/wiki/toplev-manual
• toplev is a tool, part of pmu-tools, that implements TopDown on top of Linux perf
63
INTEL VTUNE : GENERAL EXPLORATION
https://software.intel.com/en-us/intel-vtune-amplifier-xe
64
INTEL VTUNE : GROUPING BY FUNCTION/CALL STACK
https://software.intel.com/en-us/intel-vtune-amplifier-xe
get consistent read
kernel data scan table full
65
TMAM EXAMPLE
TEST env : ORACLE 12.2.0.1/OEL 7.0 /kernel-3.10 /Processor i5-6500 /2*DDR3-1600 (4GB*2)
Testing the impact of huge pages with SLOB LIO test & intel vtune
66
SLOB CONF
67
WITHOUT HUGEPAGES : LIOPS 3 099 420
DTLB overhead was measured using the following formula
68
WITH HUGEPAGES : LIOPS 3 415 969 (about 10% improvement)
Workload Characterization
How much ??
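The "about 10%" figure can be checked directly from the two LIOPS values reported for the runs without and with huge pages (3 099 420 and 3 415 969):

```python
def percent_improvement(baseline, improved):
    """Relative gain of improved over baseline, in percent."""
    return (improved - baseline) / baseline * 100.0


# LIOPS values reported on the slides for the two SLOB runs.
gain = percent_improvement(baseline=3_099_420, improved=3_415_969)
print(f"improvement with huge pages: {gain:.1f}%")
```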
69
MEASURING MEMORY THROUGHPUT
https://github.com/LucaCanali/Miscellaneous/blob/master/Spark_Notes/Tools_Linux_Memory_Perf_Measure.md
• Other tools that can be used to measure memory throughput and many other metrics (QPI utilisation,
power consumption, local and remote memory bandwidth, etc.) :
• Intel Processor Counter Monitor (PCM)
• Likwid
• pmu-tools
• perf (e.g. MEM_BW_READS = CAS_COUNT.RD * 64, where 64 is the cache line size in bytes)
https://yunmingzhang.wordpress.com/2015/07/22/measure-memory-bandwidth-using-uncore-counters/
High memory bandwidth
utilization can have an impact
on main memory latency !
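The CAS_COUNT formula mentioned for perf can be sketched as follows: each counted CAS command moves one 64-byte cache line, so bandwidth is count times 64 divided by the sampling interval. The counter value below is invented, and the helper name is only illustrative.

```python
CACHE_LINE_BYTES = 64  # cache line size assumed by the formula on the slide


def memory_bandwidth_gbs(cas_count, interval_s, line_bytes=CACHE_LINE_BYTES):
    """Convert an uncore CAS count over an interval into GB/s (10**9 bytes/s)."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return cas_count * line_bytes / interval_s / 1e9


# Hypothetical CAS_COUNT.RD delta summed over all memory channels in 1 second.
read_gbs = memory_bandwidth_gbs(cas_count=150_000_000, interval_s=1.0)
print(f"read bandwidth: {read_gbs:.1f} GB/s")
```

The same conversion applies to the write counter, and adding both gives total memory traffic.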
70
MEMORY BANDWIDTH VS LATENCY RESPONSE CURVE
• Even though these two concepts are often described independently, they are inherently interrelated.
• According to Bruce Jacob in “The memory system: you can’t avoid it, you can’t ignore it, you can’t
fake it”, the bandwidth vs latency response curve for a system has three regions :
• Constant region: the latency response is fairly constant for the first 40% of the sustained bandwidth.
• Linear region: between 40% and 80% of the sustained bandwidth, the latency response increases almost linearly with
the bandwidth demand of the system, due to contention overhead from numerous memory requests.
• Exponential region: between 80% and 100% of the sustained bandwidth, the memory latency is dominated by the
contention latency, which can be as much as twice the idle latency or more.
• Maximum sustained bandwidth : 65% to 75% of the theoretical maximum bandwidth.
https://mahmoudhatem.wordpress.com/2017/11/07/memory-bandwidth-vs-latency-response-curve/
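The three regions can be turned into a rough rule of thumb. The sketch below applies the 40% and 80% thresholds and the 65% to 75% sustained-bandwidth estimate quoted above. The 25.6 GB/s theoretical peak is an assumption for a dual-channel DDR3-1600 configuration (2 x 12.8 GB/s), and the measured bandwidth is invented.

```python
def sustained_bandwidth_range(theoretical_max_gbs):
    """Estimated sustained bandwidth: 65% to 75% of the theoretical maximum."""
    return 0.65 * theoretical_max_gbs, 0.75 * theoretical_max_gbs


def latency_region(bandwidth_used_gbs, sustained_gbs):
    """Classify a bandwidth demand into the constant, linear or exponential region."""
    utilization = bandwidth_used_gbs / sustained_gbs
    if utilization < 0.40:
        return "constant"
    if utilization < 0.80:
        return "linear"
    return "exponential"


# Assumed theoretical peak for dual-channel DDR3-1600: 2 x 12.8 GB/s.
low, high = sustained_bandwidth_range(25.6)
region = latency_region(bandwidth_used_gbs=10.0, sustained_gbs=low)
print(f"sustained bandwidth estimate: {low:.2f} to {high:.2f} GB/s")
print(f"10.0 GB/s falls in the {region} region")
```

A measured value from a tool such as Intel MLC would be a better input for sustained_gbs than this estimate.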
71
MEMORY BANDWIDTH VS LATENCY RESPONSE CURVE
• Visualization of how memory latency is affected by the increase of the memory bandwidth
consumption.
• Armed with Intel Memory Latency Checker (MLC) let’s check our current system !
https://mahmoudhatem.wordpress.com/2017/11/07/memory-bandwidth-vs-latency-response-curve/
72
“PMCS ARE CRUCIAL FOR ANALYZING A (IF NOT THE)
MODERN SYSTEM BOTTLENECK: MEMORY I/O.”
http://www.brendangregg.com/blog/2017-05-04/the-pmcs-of-ec2.html
Brendan Gregg
73
THANK YOU FOR YOUR
ATTENTION
https://mahmoudhatem.wordpress.com
@Hatem__Mahmoud
https://linkedin.com/in/mahmoudhatemoracle
1 of 73

Recommended

Performance Wins with BPF: Getting Started by
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedBrendan Gregg
2K views24 slides
Oracle events hunting [POUG19] by
Oracle events hunting [POUG19]Oracle events hunting [POUG19]
Oracle events hunting [POUG19]Mahmoud Hatem
834 views22 slides
Understanding DPDK algorithmics by
Understanding DPDK algorithmicsUnderstanding DPDK algorithmics
Understanding DPDK algorithmicsDenys Haryachyy
10.6K views17 slides
OSNoise Tracer: Who Is Stealing My CPU Time? by
OSNoise Tracer: Who Is Stealing My CPU Time?OSNoise Tracer: Who Is Stealing My CPU Time?
OSNoise Tracer: Who Is Stealing My CPU Time?ScyllaDB
947 views29 slides
Linux 4.x Tracing Tools: Using BPF Superpowers by
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
210.2K views68 slides
Performance Analysis Tools for Linux Kernel by
Performance Analysis Tools for Linux KernelPerformance Analysis Tools for Linux Kernel
Performance Analysis Tools for Linux Kernellcplcp1
1.6K views59 slides

More Related Content

What's hot

IntelON 2021 Processor Benchmarking by
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingBrendan Gregg
1K views17 slides
Network Programming: Data Plane Development Kit (DPDK) by
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Andriy Berestovskyy
2.3K views65 slides
USENIX ATC 2017: Visualizing Performance with Flame Graphs by
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame GraphsBrendan Gregg
672.3K views66 slides
Velocity 2015 linux perf tools by
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf toolsBrendan Gregg
1.1M views142 slides
A Percona Support Engineer Walkthrough on pt-stalk by
A Percona Support Engineer Walkthrough on pt-stalkA Percona Support Engineer Walkthrough on pt-stalk
A Percona Support Engineer Walkthrough on pt-stalkMarcelo Altmann
552 views30 slides
Linux BPF Superpowers by
Linux BPF SuperpowersLinux BPF Superpowers
Linux BPF SuperpowersBrendan Gregg
422.7K views60 slides

What's hot(20)

IntelON 2021 Processor Benchmarking by Brendan Gregg
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor Benchmarking
Brendan Gregg1K views
Network Programming: Data Plane Development Kit (DPDK) by Andriy Berestovskyy
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
Andriy Berestovskyy2.3K views
USENIX ATC 2017: Visualizing Performance with Flame Graphs by Brendan Gregg
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame Graphs
Brendan Gregg672.3K views
Velocity 2015 linux perf tools by Brendan Gregg
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
Brendan Gregg1.1M views
A Percona Support Engineer Walkthrough on pt-stalk by Marcelo Altmann
A Percona Support Engineer Walkthrough on pt-stalkA Percona Support Engineer Walkthrough on pt-stalk
A Percona Support Engineer Walkthrough on pt-stalk
Marcelo Altmann552 views
Linux BPF Superpowers by Brendan Gregg
Linux BPF SuperpowersLinux BPF Superpowers
Linux BPF Superpowers
Brendan Gregg422.7K views
How to Use EXAchk Effectively to Manage Exadata Environments by Sandesh Rao
How to Use EXAchk Effectively to Manage Exadata EnvironmentsHow to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata Environments
Sandesh Rao753 views
Apache httpd ( 아파치 웹서버 ) 설치 가이드 by Opennaru, inc.
Apache httpd ( 아파치 웹서버 ) 설치 가이드Apache httpd ( 아파치 웹서버 ) 설치 가이드
Apache httpd ( 아파치 웹서버 ) 설치 가이드
Opennaru, inc. 205 views
Intel DPDK Step by Step instructions by Hisaki Ohara
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
Hisaki Ohara56.9K views
New Ways to Find Latency in Linux Using Tracing by ScyllaDB
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
ScyllaDB2.3K views
Memory access tracing [poug17] by Mahmoud Hatem
Memory access tracing [poug17]Memory access tracing [poug17]
Memory access tracing [poug17]
Mahmoud Hatem3.9K views
Linux Kernel vs DPDK: HTTP Performance Showdown by ScyllaDB
Linux Kernel vs DPDK: HTTP Performance ShowdownLinux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance Showdown
ScyllaDB1.6K views
Linux Performance Analysis: New Tools and Old Secrets by Brendan Gregg
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
Brendan Gregg603.9K views
Top 5 mistakes when writing Spark applications by hadooparchbook
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
hadooparchbook14.6K views
SparkSQL: A Compiler from Queries to RDDs by Databricks
SparkSQL: A Compiler from Queries to RDDsSparkSQL: A Compiler from Queries to RDDs
SparkSQL: A Compiler from Queries to RDDs
Databricks6.1K views
Physical Memory Models.pdf by Adrian Huang
Physical Memory Models.pdfPhysical Memory Models.pdf
Physical Memory Models.pdf
Adrian Huang472 views
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2 by Tanel Poder
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder12.6K views
From DTrace to Linux by Brendan Gregg
From DTrace to LinuxFrom DTrace to Linux
From DTrace to Linux
Brendan Gregg23.3K views

Similar to The power of linux advanced tracer [POUG18]

Using VPP and SRIO-V with Clear Containers by
Using VPP and SRIO-V with Clear ContainersUsing VPP and SRIO-V with Clear Containers
Using VPP and SRIO-V with Clear ContainersMichelle Holley
1.7K views33 slides
eBPF Basics by
eBPF BasicseBPF Basics
eBPF BasicsMichael Kehoe
2.7K views63 slides
Dataplane programming with eBPF: architecture and tools by
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsStefano Salsano
209 views102 slides
Modern Linux Tracing Landscape by
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing LandscapeSasha Goldshtein
1.9K views30 slides
Install FD.IO VPP On Intel(r) Architecture & Test with Trex* by
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Michelle Holley
1.4K views21 slides
PaaSTA: Autoscaling at Yelp by
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpNathan Handler
872 views54 slides

Similar to The power of linux advanced tracer [POUG18](20)

Using VPP and SRIO-V with Clear Containers by Michelle Holley
Using VPP and SRIO-V with Clear ContainersUsing VPP and SRIO-V with Clear Containers
Using VPP and SRIO-V with Clear Containers
Michelle Holley1.7K views
Dataplane programming with eBPF: architecture and tools by Stefano Salsano
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and tools
Stefano Salsano209 views
Install FD.IO VPP On Intel(r) Architecture & Test with Trex* by Michelle Holley
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Michelle Holley1.4K views
Kubecon seattle 2018 workshop slides by Weaveworks
Kubecon seattle 2018 workshop slidesKubecon seattle 2018 workshop slides
Kubecon seattle 2018 workshop slides
Weaveworks969 views
Blackhat USA 2016 - What's the DFIRence for ICS? by Chris Sistrunk
Blackhat USA 2016 - What's the DFIRence for ICS?Blackhat USA 2016 - What's the DFIRence for ICS?
Blackhat USA 2016 - What's the DFIRence for ICS?
Chris Sistrunk1.7K views
Native Support of Prometheus Monitoring in Apache Spark 3.0 by Databricks
Native Support of Prometheus Monitoring in Apache Spark 3.0Native Support of Prometheus Monitoring in Apache Spark 3.0
Native Support of Prometheus Monitoring in Apache Spark 3.0
Databricks2.4K views
Unmasking Careto through Memory Forensics (video in description) by Andrew Case
Unmasking Careto through Memory Forensics (video in description)Unmasking Careto through Memory Forensics (video in description)
Unmasking Careto through Memory Forensics (video in description)
Andrew Case1.7K views
KSCOPE 2013: Exadata Consolidation Success Story by Kristofferson A
KSCOPE 2013: Exadata Consolidation Success StoryKSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success Story
Kristofferson A1.8K views
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming by Apache Apex
Intro to Apache Apex (next gen Hadoop) & comparison to Spark StreamingIntro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Intro to Apache Apex (next gen Hadoop) & comparison to Spark Streaming
Apache Apex2.2K views
Intro to open source telemetry linux con 2016 by Matthew Broberg
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
Matthew Broberg2.1K views
RMOUG 2013 - Where did my CPU go? by Kristofferson A
RMOUG 2013 - Where did my CPU go?RMOUG 2013 - Where did my CPU go?
RMOUG 2013 - Where did my CPU go?
Kristofferson A983 views
Where Did My CPU Go? by Enkitec
Where Did My CPU Go?Where Did My CPU Go?
Where Did My CPU Go?
Enkitec461 views
Rmoug13 - where did my CPU go? by Enkitec
Rmoug13 - where did my CPU go?Rmoug13 - where did my CPU go?
Rmoug13 - where did my CPU go?
Enkitec321 views
Introduction to architecture exploration by Deepak Shankar
Introduction to architecture explorationIntroduction to architecture exploration
Introduction to architecture exploration
Deepak Shankar139 views
Grabbing the PostgreSQL Elephant by the Trunk by Harold Giménez
Grabbing the PostgreSQL Elephant by the TrunkGrabbing the PostgreSQL Elephant by the Trunk
Grabbing the PostgreSQL Elephant by the Trunk
Harold Giménez2.2K views
Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016 by Zabbix
Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016
Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016
Zabbix1.1K views

Recently uploaded

Special_edition_innovator_2023.pdf by
Special_edition_innovator_2023.pdfSpecial_edition_innovator_2023.pdf
Special_edition_innovator_2023.pdfWillDavies22
16 views6 slides
DALI Basics Course 2023 by
DALI Basics Course  2023DALI Basics Course  2023
DALI Basics Course 2023Ivory Egg
14 views12 slides
Attacking IoT Devices from a Web Perspective - Linux Day by
Attacking IoT Devices from a Web Perspective - Linux Day Attacking IoT Devices from a Web Perspective - Linux Day
Attacking IoT Devices from a Web Perspective - Linux Day Simone Onofri
15 views68 slides
Future of Learning - Khoong Chan Meng by
Future of Learning - Khoong Chan MengFuture of Learning - Khoong Chan Meng
Future of Learning - Khoong Chan MengNUS-ISS
33 views7 slides
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV by
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTVSplunk
88 views20 slides
Igniting Next Level Productivity with AI-Infused Data Integration Workflows by
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Safe Software
225 views86 slides

Recently uploaded(20)

Special_edition_innovator_2023.pdf by WillDavies22
Special_edition_innovator_2023.pdfSpecial_edition_innovator_2023.pdf
Special_edition_innovator_2023.pdf
WillDavies2216 views
DALI Basics Course 2023 by Ivory Egg
DALI Basics Course  2023DALI Basics Course  2023
DALI Basics Course 2023
Ivory Egg14 views
Attacking IoT Devices from a Web Perspective - Linux Day by Simone Onofri
Attacking IoT Devices from a Web Perspective - Linux Day Attacking IoT Devices from a Web Perspective - Linux Day
Attacking IoT Devices from a Web Perspective - Linux Day
Simone Onofri15 views
Future of Learning - Khoong Chan Meng by NUS-ISS
Future of Learning - Khoong Chan MengFuture of Learning - Khoong Chan Meng
Future of Learning - Khoong Chan Meng
NUS-ISS33 views
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV by Splunk
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
Splunk88 views
Igniting Next Level Productivity with AI-Infused Data Integration Workflows by Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software225 views
[2023] Putting the R! in R&D.pdf by Eleanor McHugh
[2023] Putting the R! in R&D.pdf[2023] Putting the R! in R&D.pdf
[2023] Putting the R! in R&D.pdf
Eleanor McHugh38 views
PharoJS - Zürich Smalltalk Group Meetup November 2023 by Noury Bouraqadi
PharoJS - Zürich Smalltalk Group Meetup November 2023PharoJS - Zürich Smalltalk Group Meetup November 2023
PharoJS - Zürich Smalltalk Group Meetup November 2023
Noury Bouraqadi120 views
Future of Learning - Yap Aye Wee.pdf by NUS-ISS

The power of linux advanced tracer [POUG18]

  • 1. THE POWER OF LINUX ADVANCED TRACER HATEM MAHMOUD HTTPS://MAHMOUDHATEM.WORDPRESS.COM HIGH FIVE POUG
  • 2. 2 WHO AM I Oracle DBA Oracle experience: 7 years Located in TUNISIA Oracle Certified Master Oracle geek https://mahmoudhatem.wordpress.com
  • 3. 3 TAKE AWAYS • Better understanding of Linux tracing landscape • Getting an idea of what can be done. As someone else said : “Knowing what can be done is more important than knowing how to do it - you can always google that”
  • 4. 4 AGENDA 1. Linux tracing landscape 2. Static tracing 3. Dynamic tracing 4. Monkey patching 5. Deeper look at CPU utilization
  • 7. 7 LINUX TRACING LANDSCAPE eBPF kprobe uprobe tracepoints software events hardware events systemtap perf_events bcc/ebpf USDT Ftrace
  • 8. 8 LINUX TRACING SYSTEMS • Front-end tools: systemtap, perf, bcc, pmu-tools, etc. • Mechanisms for extracting data: stap module, eBPF, perf_events (perf_event_open syscall), ftrace (/sys/kernel/debug/tracing), etc. • Event sources: kprobes and uprobes (dynamic tracing); tracepoints, software events and USDT (static tracing); PMCs (hardware counters); etc. Breakdown as suggested by Brendan Gregg and Julia Evans: https://jvns.ca/blog/2017/07/05/linux-tracing-systems/
  • 13. 13 STATIC TRACING Tracepoints: • Predefined kernel trace probes • Inserted by kernel developers at important locations in the code (system calls, disk I/O, etc.) User Statically-Defined Tracing (USDT): • Predefined application trace probes • Inserted by application developers at important locations in the code Software events: • Kernel counters (CPU migrations, minor faults, major faults, etc.) http://www.brendangregg.com/perf.html
  • 15. 15 BCC/TOOLS : BIOLATENCY SUMMARIZE BLOCK DEVICE I/O LATENCY AS A HISTOGRAM https://github.com/iovisor/bcc/blob/master/tools/biolatency_example.txt • Traditional tools such as iostat and sar show average latency, which can be misleading because it hides latency outliers • The full latency distribution needs to be studied • biolatency is based on kernel block-layer probes (blk_start_request, blk_account_io_completion, etc.)
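The point about averages hiding outliers can be illustrated with a small sketch. The sample latencies below are made up for illustration, and the power-of-two bucketing only imitates the style of a biolatency histogram; it is not the tool's code.

```python
from collections import Counter


def log2_histogram(latencies_us):
    """Bucket latencies into power-of-two ranges (histogram style only)."""
    buckets = Counter()
    for value in latencies_us:
        # bit_length() maps 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 3, and so on.
        buckets[int(value).bit_length()] += 1
    return dict(sorted(buckets.items()))


def bucket_range(index):
    """Return the inclusive (low, high) range covered by a bucket index."""
    if index == 0:
        return 0, 0
    return 1 << (index - 1), (1 << index) - 1


# Hypothetical sample: 98 fast I/Os at 100 us and 2 outliers at 50 ms.
samples = [100] * 98 + [50_000] * 2
average = sum(samples) / len(samples)
histogram = log2_histogram(samples)

print(f"average latency: {average:.0f} us")
for index, count in histogram.items():
    low, high = bucket_range(index)
    print(f"{low:>8} -> {high:<8} : {count}")
```

The average of this sample lands far from both the fast I/Os and the outliers, while the histogram shows the two populations separately.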
  • 16. 16 BCC/TOOLS : EXT4SLOWER TRACE SLOW EXT4 OPERATIONS. https://github.com/iovisor/bcc/blob/master/tools/ext4slower_example.txt • A better measure of the latency suffered by applications reading from the file system. • The measured latency spans block device I/O (disk I/O), file system CPU cycles, file system locks, run queue latency, etc. • Run queue latency: a great CPU saturation metric!
  • 17. 17 BCC/TOOLS : RUNQLAT: RUN QUEUE (SCHEDULER) LATENCY AS A HISTOGRAM https://github.com/iovisor/bcc/blob/master/tools/runqlat_example.txt • The best CPU saturation metrics are measures of run queue (or scheduler) latency. • This is the time a task spends waiting on a run queue for its turn on-CPU. • Better than the run queue length metric for estimating the magnitude of CPU saturation!
  • 18. 18 BCC/TOOLS : RUNQLAT: RUN QUEUE (SCHEDULER) LATENCY AS A HISTOGRAM https://github.com/iovisor/bcc/blob/master/tools/runqlat_example.txt
  • 20. 20 SYSTEMTAP : SCHEDTIMES_WSI.STP : TRACK TIME PROCESSES SPEND IN VARIOUS STATES https://mahmoudhatem.wordpress.com/2017/02/06/extending-systemtap-scripts-with-oracle-session-info/ • Bring application context to your monitoring tools !
  • 22. 22 BCC/TOOLS : DBSLOWER: TRACE MYSQL/POSTGRESQL QUERIES SLOWER THAN A THRESHOLD https://github.com/iovisor/bcc/blob/master/tools/dbslower_example.txt • dbslower is based on USDT probes (needs MySQL or PostgreSQL built with USDT (DTrace) support).
  • 23. 23 ORACLE database doesn’t have USDT support
  • 25. 25 DYNAMIC TRACING • Dynamically instrumenting (creating events in) any software location. • kprobes: kernel dynamic tracing • uprobes: user-level dynamic tracing • No need to modify the probed process's binaries or restart the program.
  • 26. 26 DYNAMIC TRACING (UPROBE) • Function prologue of “kskthewt” (called at the end of an Oracle wait event) before inserting the probe point: • After inserting a probe point at the function entry: the original opcode is replaced with int3 (software breakpoint interrupt). https://mahmoudhatem.wordpress.com/2017/03/21/uprobes-issue-with-oracle-12c/
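The byte swap described on this slide can be modelled with a toy sketch. It only simulates the idea on an in-memory buffer: the prologue bytes are a hypothetical `push %rbp; mov %rsp,%rbp` sequence, not the real `kskthewt` code, and the kernel's actual uprobe machinery (trap handling, executing the displaced instruction) is not represented.

```python
INT3 = 0xCC  # x86 one-byte breakpoint instruction


def insert_probe(code: bytearray, offset: int) -> int:
    """Overwrite one byte with int3 and return the original byte."""
    original = code[offset]
    code[offset] = INT3
    return original


def remove_probe(code: bytearray, offset: int, original: int) -> None:
    """Restore the byte that was saved when the probe was inserted."""
    code[offset] = original


# Hypothetical function prologue: push %rbp (55); mov %rsp,%rbp (48 89 e5).
prologue = bytearray([0x55, 0x48, 0x89, 0xE5])

saved = insert_probe(prologue, 0)
patched = bytes(prologue)          # what the CPU would now fetch
remove_probe(prologue, 0, saved)   # probe removed, code is intact again

print("patched :", patched.hex(" "))
print("restored:", bytes(prologue).hex(" "))
```

Because only the first byte changes and the original is saved, the binary on disk is untouched and the process does not need a restart, which matches the properties listed for dynamic tracing.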
  • 27. 27 TRACING ORACLE KERNEL FUNCTIONS kcbgtcr kcbgcur kcbzib kskthbwt kskthewt kcbzgb kcbzvb opiexe delrow qerdlFetch kpoal8
  • 28. 28 SYSTEMTAP : TRACING ORACLE WAIT EVENTS https://externaltable.blogspot.com/2014/09/systemtap-into-oracle-for-fun-and-profit.html
  • 29. 29 SYSTEMTAP : AGGREGATIONS AND FILTERING OF WAIT EVENT DATA https://externaltable.blogspot.com/2014/09/systemtap-into-oracle-for-fun-and-profit.html Collect and display microsecond-precision histograms for all Oracle versions (note: 12.1.0.2 has V$EVENT_HISTOGRAM_MICRO). What are this wait event and the other I/O wait events really measuring?
  • 30. 30 SYSTEMTAP : WHAT ARE THE I/O-RELATED WAIT EVENTS REALLY MEASURING? [TRACING LOGICAL AND PHYSICAL I/O ] https://externaltable.blogspot.com/2014/11/life-of-oracle-io-tracing-logical-and.html The elapsed time for the wait event "direct path read" does not accurately reflect I/O latency
  • 31. 31 TRACING BEYOND FUNCTION BOUNDARY PROBE AT SPECIFIC ORACLE KERNEL FUNCTION OFFSET
  • 32. 32 SYSTEMTAP : A SIMPLE USER/PASSWORD SNIFFER https://mahmoudhatem.wordpress.com/2018/03/23/systemtap-probe-at-specific-oracle-function-offset-bonus/ • Powerful and scary at the same time !
  • 34. 34 SYSTEMTAP : TRACING PL/SQL WITH LINE NUMBER https://mahmoudhatem.wordpress.com/2017/09/15/geeky-plsql-tracerprofiler-first-step/
  • 35. 35 SYSTEMTAP : TRACING PL/SQL SUBPROGRAM CALLS WITH PARAMETERS VALUES https://mahmoudhatem.wordpress.com/2017/11/29/tracing-pl-sql-subprogram-calls-with-parameters-values-dynamic-tracing/
  • 36. 36 SYSTEMTAP : FROM MEMORY REQUEST TO PL/SQL SOURCE LINE https://mahmoudhatem.wordpress.com/2018/01/15/from-memory-request-to-pl-sql-source-line/ Based on v$process_memory_detail
  • 38. 38 SYSTEMTAP : A MINI ORACLE DB FIREWALL [LIVE PATCHING] https://mahmoudhatem.wordpress.com/2016/04/18/systemtap-a-mini-oracle-db-firewall/ https://externaltable.blogspot.com/2016/03/systemtap-guru-mode-and-oracle-sql.html
  • 39. 39 SYSTEMTAP : PLAYING WITH ORACLE DB 18C ON-PREMISES BEFORE OFFICIAL RELEASE https://mahmoudhatem.wordpress.com/2018/03/01/playing-with-oracle-db-18c-on-premises-before-official-release/
  • 40. 40 DEEPER LOOK AT CPU UTILIZATION • Which code paths are causing high CPU usage? • What is my CPU bottleneck? • How much are my CPUs stalled, and on what resource?
  • 41. 41 CPU PROFILING • Linux advanced tracing tools such as SystemTap, perf and bcc are capable of lightweight CPU profiling by stack sampling. • To understand CPU usage quickly, the collected profiling data can be visualized as flame graphs. http://www.brendangregg.com/flamegraphs.html
  • 43. 43 EXTENDED FLAMEGRAPH : WAIT EVENTS https://mahmoudhatem.wordpress.com/2016/09/23/perf_events-offonmixed-cpu-flamegraph-extended-with-oracle-wait-events/ https://db-blog.web.cern.ch/blog/luca-canali/2015-11-oracle-wait-events-investigated-extended-stack-profiling-and-flame-graphs
  • 44. 44 EXTENDED FLAMEGRAPH : PL/SQL PROGRAM AND LINE NUMBER https://mahmoudhatem.wordpress.com/2017/09/22/geeky-plsql-tracerprofiler-another-step/
  • 45. 45 BUT WHAT WERE THOSE FUNCTIONS DOING WHEN THEY WERE ON-CPU? RUNNING OR STALLED?
  • 46. 46 CPU UTILIZATION IS WRONG http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
  • 47. 47 WHEN THE CPU UTILIZATION DOES NOT TELL YOU THE UTILIZATION OF THE CPU • PERFORMANCE MONITORING COUNTERS: A BETTER WAY TO MEASURE CPU UTILIZATION • *The next sections cover Intel platforms only
  • 48. 48 HARDWARE EVENTS (PMC) • PMCs instrument low-level processor activity • They can be used to understand how efficiently a workload uses processor resources (CPU caches, MMU, memory buses, CPU interconnects, execution units, etc.) • PMCs come in two kinds: • Core: measure values on a single core only • Uncore: measure shared, socket-wide values
  • 49. 49 HARDWARE EVENTS https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3b-part-2-m • PMCs are documented in the Intel Software Developer's Manual Volume 3B: System Programming Guide, Part 2
  • 50. 50 HARDWARE EVENTS • Not all of them are listed when using perf list !
  • 51. 51 HIGH-LEVEL METRICS (IPC: A GENERAL EFFICIENCY METRIC) • Events can be observed and combined to create useful high-level metrics such as Instructions per Cycle (IPC) * Modern superscalar processors can issue multiple instructions per cycle
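As a minimal sketch of combining two events into a high-level metric, IPC is the retired instruction count divided by the unhalted cycle count, and CPI is its inverse. The counter values below are hypothetical and only illustrate the arithmetic.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle: higher means the core retires more work per cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def cpi(instructions: int, cycles: int) -> float:
    """Cycles per instruction: the inverse of IPC."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


# Hypothetical counter readings for one measurement interval.
instructions_retired = 1_200_000_000
unhalted_cycles = 2_000_000_000

print(f"IPC = {ipc(instructions_retired, unhalted_cycles):.2f}")
print(f"CPI = {cpi(instructions_retired, unhalted_cycles):.2f}")
```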
  • 52. 52 CPI FLAME GRAPH • The color now shows what each function was doing when it was on-CPU: running or stalled • Highest CPI: blue (slowest instructions) • Lowest CPI: red (fastest instructions) • Visualization of CPU efficiency by function. https://mahmoudhatem.wordpress.com/2017/10/26/deeper-look-at-cpu-utilization-the-power-of-pmu-events/ get consistent read
  • 53. 53 IPC INTERPRETATION AND ACTIONABLE ITEMS http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html • A good starting point for identifying what the CPU is really doing is IPC (instructions per cycle)
  • 54. 54 WHERE ARE WE REALLY WASTING OUR PRECIOUS CPU CYCLES ? False data sharing Split Stores Loads Blocked by Store Forwarding 4K Aliasing DTLB miss Microcode assists Memory Bandwidth Memory Latency Bad speculation Port Utilization L1 miss L2 miss Vectorization Remote DRAM
  • 58. 58 MEASURING IPC IS A GOOD STARTING POINT, BUT HOW TO DRILL DOWN FURTHER? • A specific microarchitecture may make hundreds of events available through its PMU! • Which events are useful in detecting the true bottleneck? • Answering that requires in-depth knowledge of both the microarchitecture design and the PMU specifications! “Analysis without a methodology can become a fishing expedition, where metrics are examined ad hoc, until the issue is found –if it is at all.” Source: Brendan D. Gregg, http://www.brendangregg.com/methodology.html
  • 59. 59 TOP-DOWN MICRO-ARCHITECTURE ANALYSIS METHOD [ TMAM ] • Systematically finds the true bottleneck (eliminates guesswork) • Provides a hierarchical breakdown of execution cycles (CPI breakdown) • Avoids the steep µ-arch learning curve • Correctly characterizes all workloads • Frequent performance bottlenecks are organized in a hierarchical structure https://software.intel.com/en-us/vtune-amplifier-help-tuning-applications-using-a-top-down-microarchitecture-analysis-method
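The first level of the hierarchy can be sketched as below. The formulas follow the level-1 breakdown as published in A. Yasin's Top-Down paper (linked on slide 61), and the event names are the Intel events used there. The pipeline width of 4 and all counter values are assumptions for illustration; tools such as toplev apply model-specific variants.

```python
from dataclasses import dataclass


@dataclass
class TopDownLevel1:
    frontend_bound: float
    bad_speculation: float
    retiring: float
    backend_bound: float


def topdown_level1(
    cycles: int,                 # CPU_CLK_UNHALTED.THREAD
    uops_not_delivered: int,     # IDQ_UOPS_NOT_DELIVERED.CORE
    uops_issued: int,            # UOPS_ISSUED.ANY
    uops_retired_slots: int,     # UOPS_RETIRED.RETIRE_SLOTS
    recovery_cycles: int,        # INT_MISC.RECOVERY_CYCLES
    pipeline_width: int = 4,     # assumed issue width of the core
) -> TopDownLevel1:
    """Split all pipeline slots into the four level-1 categories."""
    slots = pipeline_width * cycles
    frontend = uops_not_delivered / slots
    bad_spec = (
        uops_issued - uops_retired_slots + pipeline_width * recovery_cycles
    ) / slots
    retiring = uops_retired_slots / slots
    # Whatever is left over is attributed to the back end.
    backend = 1.0 - (frontend + bad_spec + retiring)
    return TopDownLevel1(frontend, bad_spec, retiring, backend)


# Hypothetical counter values, chosen only to make the arithmetic easy to follow.
result = topdown_level1(
    cycles=1000,
    uops_not_delivered=800,
    uops_issued=1300,
    uops_retired_slots=1200,
    recovery_cycles=25,
)
print(result)
```

The four fractions sum to 1 by construction, so the category with the largest share indicates where to drill down next.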
  • 61. 61 PERF http://cs.haifa.ac.il/~yosi/PARC/yasin.pdf Linux perf supports TopDown Level-1 metrics since Linux kernel 4.8
  • 62. 62 PMU-TOOLS : TOPLEV.PY https://github.com/andikleen/pmu-tools/wiki/toplev-manual • toplev is a tool, part of pmu-tools, that implements TopDown on top of Linux perf
  • 63. 63 INTEL VTUNE : GENERAL EXPLORATION https://software.intel.com/en-us/intel-vtune-amplifier-xe
  • 64. 64 INTEL VTUNE : GROUPING BY FUNCTION/CALL STACK https://software.intel.com/en-us/intel-vtune-amplifier-xe get consistent read kernel data scan table full
  • 65. 65 TMAM EXAMPLE TEST env : ORACLE 12.2.0.1/OEL 7.0 /kernel-3.10 /Processor i5-6500 /2*DDR3-1600 (4GB*2) Testing the impact of huge pages with SLOB LIO test & intel vtune
  • 67. 67 WITHOUT HUGEPAGES : LIOPS 3 099 420 DTLB overhead was measured using the following formula
  • 68. 68 WITH HUGEPAGES : LIOPS 3 415 969 About 10% improvement Workload Characterization How much ??
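The "about 10%" figure follows directly from the two LIOPS values reported on slides 67 and 68; a short check of the arithmetic:

```python
def improvement_percent(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


liops_without_hugepages = 3_099_420  # slide 67
liops_with_hugepages = 3_415_969     # slide 68

gain = improvement_percent(liops_without_hugepages, liops_with_hugepages)
print(f"LIOPS improvement with huge pages: {gain:.1f}%")
```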
  • 69. 69 MEASURING MEMORY THROUGHPUT https://github.com/LucaCanali/Miscellaneous/blob/master/Spark_Notes/Tools_Linux_Memory_Perf_Measure.md • Other tools that can be used to measure memory throughput and many other metrics (QPI utilization, power consumption, local and remote memory bandwidth, etc.): • Intel Processor Counter Monitor (PCM) • Likwid • pmu-tools • perf (e.g. MEM_BW_READS = CAS_COUNT.RD * 64, where 64 bytes is the cache line size) https://yunmingzhang.wordpress.com/2015/07/22/measure-memory-bandwidth-using-uncore-counters/ • High memory bandwidth utilization can have an impact on main memory latency!
  • 70. 70 MEMORY BANDWIDTH VS LATENCY RESPONSE CURVE • Even though these two concepts are often described independently, they are inherently interrelated. • According to Bruce Jacob in “The memory system: you can’t avoid it, you can’t ignore it, you can’t fake it”, the bandwidth vs latency response curve for a system has three regions: • Constant region: the latency response is fairly constant for the first 40% of the sustained bandwidth. • Linear region: between 40% and 80% of the sustained bandwidth, the latency response increases almost linearly with the bandwidth demand of the system, due to contention overhead from numerous memory requests. • Exponential region: between 80% and 100% of the sustained bandwidth, the memory latency is dominated by the contention latency, which can be as much as twice the idle latency or more. • Maximum sustained bandwidth: 65% to 75% of the theoretical maximum bandwidth. https://mahmoudhatem.wordpress.com/2017/11/07/memory-bandwidth-vs-latency-response-curve/
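The perf formula on this slide turns an uncore CAS count into bytes by multiplying by the cache line size. A minimal sketch of the conversion to bandwidth, assuming the counts were collected over a known interval (the counter values are hypothetical):

```python
CACHE_LINE_BYTES = 64  # each CAS command transfers one cache line


def memory_bandwidth_bytes_per_s(cas_count: int, interval_s: float) -> float:
    """Convert a CAS_COUNT.RD or CAS_COUNT.WR reading into bytes per second."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return cas_count * CACHE_LINE_BYTES / interval_s


# Hypothetical readings summed over all memory channels for a 1-second interval.
cas_count_rd = 50_000_000
read_bw = memory_bandwidth_bytes_per_s(cas_count_rd, interval_s=1.0)

print(f"read bandwidth: {read_bw / 1e9:.2f} GB/s")
```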
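The three regions and the sustained-bandwidth rule of thumb can be turned into a small helper. The 40% and 80% thresholds and the 65% to 75% range come from the slide; the 25.6 GB/s theoretical peak is an assumption (dual-channel DDR3-1600 at 8 bytes per transfer), and the measured bandwidth values are hypothetical.

```python
def sustained_bandwidth_range(theoretical_max: float) -> tuple[float, float]:
    """Expected maximum sustained bandwidth: 65% to 75% of the theoretical peak."""
    return 0.65 * theoretical_max, 0.75 * theoretical_max


def latency_region(bandwidth: float, sustained_max: float) -> str:
    """Classify the current bandwidth into one of the three latency regions."""
    if bandwidth < 0 or sustained_max <= 0:
        raise ValueError("bandwidth must be >= 0 and sustained_max > 0")
    utilization = bandwidth / sustained_max
    if utilization <= 0.40:
        return "constant"
    if utilization <= 0.80:
        return "linear"
    return "exponential"


# Assumed theoretical peak in GB/s: 2 channels * 1600 MT/s * 8 bytes.
theoretical_peak = 25.6
low, high = sustained_bandwidth_range(theoretical_peak)
print(f"expected sustained bandwidth: {low:.2f} to {high:.2f} GB/s")

# Hypothetical measured bandwidths against a 20 GB/s sustained maximum.
for measured in (5.0, 12.0, 18.0):
    print(f"{measured:>5.1f} GB/s -> {latency_region(measured, 20.0)} region")
```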
  • 71. 71 MEMORY BANDWIDTH VS LATENCY RESPONSE CURVE • Visualization of how memory latency is affected by the increase of the memory bandwidth consumption. • Armed with Intel Memory Latency Checker (MLC) let’s check our current system ! https://mahmoudhatem.wordpress.com/2017/11/07/memory-bandwidth-vs-latency-response-curve/
  • 72. 72 “PMCS ARE CRUCIAL FOR ANALYZING A (IF NOT THE) MODERN SYSTEM BOTTLENECK: MEMORY I/O.” http://www.brendangregg.com/blog/2017-05-04/the-pmcs-of-ec2.html Brendan Gregg
  • 73. 73 THANK YOU FOR YOUR ATTENTION https://mahmoudhatem.wordpress.com @Hatem__Mahmoud https://linkedin.com/in/mahmoudhatemoracle