System administrators today have a wide choice of tools for Linux performance profiling and monitoring. The challenge is to pick the appropriate tools and interpret their results correctly.
This talk is a chance to take a tour through various performance profiling and benchmarking tools, focusing on their benefit for every sysadmin.
More than 25 different tools are presented, ranging from well-known tools like strace, iostat, tcpdump, or vmstat to newer features like Linux tracepoints or perf_events. You will also learn which tools can be monitored by Icinga and which monitoring plugins are already available for that.
The goal is to leave with reference points to check whenever you are faced with performance problems.
Take the chance to close your knowledge gaps and learn how to get the most out of your system.
9. _ Multi-Core: 8, 10, or 12 Cores
_ 8x Simultaneous Multi-Threading
_ 64, 80, or 96 Threads per CPU!
POWER8 CPU
POWERful Cores
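The thread counts on this slide follow directly from cores times SMT ways. A minimal sketch of that arithmetic (the function name is illustrative, not from the talk):

```python
# Hardware threads per POWER8 CPU = cores x SMT ways (SMT-8 on the slide).
SMT_WAYS = 8


def threads_per_cpu(cores: int, smt_ways: int = SMT_WAYS) -> int:
    """Return the number of hardware threads for a given core count."""
    return cores * smt_ways


for cores in (8, 10, 12):
    print(f"{cores} cores x SMT-{SMT_WAYS} = {threads_per_cpu(cores)} threads")
```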
10. _ 8x Simultaneous Multi-Threading
_ POWER8 can run in SMT-1, SMT-2, SMT-4, or SMT-8 mode
_ Each Core has dedicated resources for
up to 4 threads!
_ SMT-4 sometimes superior to SMT-8
POWER8 CPU
POWERful Cores
15. POWER8 CPU _ 1) per Chip: „2nd gen Nest Accelerator
Complex (NX)“
_ Symmetric Crypto
_ Compression Engine
_ Random Number Generator
_ 2) per Core
_ Symmetric Crypto
_ Cyclic Redundancy Check
POWERful Accelerators
16. _ 1) per Chip: NX
_ Symmetric Crypto
_ 2x 842 (De-)Compression Engine
_ Random Number Generator
_ → Available via Bare-metal Linux
Source: http://de.slideshare.net/sebastienchabrolles/enabling-power-8-advanced-features-on-linux
POWERful Accelerators
17. POWER8 CPU _ 2) per Core
_ Symmetric Crypto
_ Cyclic Redundancy Check
POWERful Accelerators
24. _ CPU address and data paths are protected
_ by parity or error-correcting codes (ECC)
_ Example: ECC error on an L2 cache line
→ line gets deactivated
„Everything has checksums“
25. „Processor Instruction Retry“
_ Soft error events might occur within a CPU core
_ If the event is detected before the failing
instruction is completed
_ → CPU can retry the operation
_ Detected faults on the memory bus (CPU ↔
DIMM) can be retried as well
26. „Enhanced PCIe error handling“
_ Outage/Crash of a PCIe card
→ Recovery Mechanism
_ Re-Init during runtime
_ Server Reboot NOT necessary
27. _ Architecture
_ Overview
_ RAS – Reliability, Availability, Serviceability
_ Performance
_ Open Source Firmware
OpenPOWER for the data center
30. _ SMT-8 enables more parallel transactions
_ large caches minimize latency:
_ CPU can store data and do operations
before it begins writing to the memory
subsystem
_ high memory bandwidth (up to 230 GB/s
per CPU)
_ optimizations for the POWER Platform
http://de.slideshare.net/MariaDB/ibm-linux-onpowerch (Slide 17)
40. Conclusions
_ „OpenPOWER is a very interesting platform to build an open
source stack with PostgreSQL starting with the firmware“
_ „POWER servers can consolidate Linux and Unix platform
solutions with high speed PostgreSQL“
_ „[Power]KVM as a virtualization alternative“
45. _ 22.11.2013 & 04.12.2013: Mitre informs Intel and US CERT, Case# 14-2221
_ Intel contacts IBVs* & OEMs**, updates reference code in May 2014
_ 22.07.2014: CERT VU#552286, note to vendors
_ 07.08.2014: CERT VU#552286 published
_ 21.10.2014: press reports
_ 05.11.2014: still no update from many vendors
→ 1 Year (!)
*) Independent BIOS Vendors
**) Original Equipment Manufacturers
https://www.mitre.org/publications/technical-papers/presentation-extreme-privilege-escalation-on-windows-8uefi-systems
http://www.kb.cert.org/vuls/id/552286
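The "1 Year (!)" span can be checked from the first and last dates on this timeline. A minimal sketch, using only the dates shown on the slide (the variable names are illustrative):

```python
from datetime import date

first_report = date(2013, 11, 22)   # Mitre informs Intel and US CERT
status_check = date(2014, 11, 5)    # still no update from many vendors

# Subtracting two dates yields a timedelta; .days is the whole-day span.
elapsed_days = (status_check - first_report).days
print(f"{elapsed_days} days, about {elapsed_days / 365:.2f} years")
```

The result is a little under a full calendar year, which matches the rounded figure on the slide.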
47. … “no comment to this topic” ...
“... they were surprised by the request or could not answer so far ...”
61. _ Self Boot Engine (SBE, ISTEPs 1-4)
_ HostBoot (ISTEPs 5-21), e.g. clearing ECC
_ microkernel, has userspace
_ CPU bus init, memory init, core init
_ On Chip Controller (OCC) – hard hw limits
_ PowerPC 405 core, has its own real-time OS
_ SkiBoot (OPAL)
_ Linux / Petitboot (Bootloader)
_ Operating system (Linux)
(ISTEP = IPL Step)
(IPL = Initial Program Load)
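The ISTEP ranges above can be written down as a small lookup. This is only an illustrative model of the slide, not firmware code; the names `SLIDE_ORDER`, `ISTEP_RANGES`, and `stage_for_istep` are assumptions made for this sketch.

```python
# Components in the order the slide lists them.
SLIDE_ORDER = [
    "Self Boot Engine (SBE)",
    "Hostboot",
    "On Chip Controller (OCC)",
    "SkiBoot (OPAL)",
    "Petitboot (bootloader)",
    "Operating system (Linux)",
]

# ISTEP ranges taken from the slide: SBE runs ISTEPs 1-4, Hostboot 5-21.
ISTEP_RANGES = {
    "Self Boot Engine (SBE)": range(1, 5),
    "Hostboot": range(5, 22),
}


def stage_for_istep(istep: int) -> str:
    """Return the component that executes the given IPL step."""
    for stage, steps in ISTEP_RANGES.items():
        if istep in steps:
            return stage
    raise ValueError(f"ISTEP {istep} is outside the ranges given on the slide")


for istep in (1, 4, 5, 21):
    print(f"ISTEP {istep}: {stage_for_istep(istep)}")
```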
62. WOW - That's a lot
_ Yes, it's a lot.
_ ~600k unique LOC
_ ~24 million LOC from elsewhere (e.g. Linux, toolchain, libc,
ncurses, lvm, busybox etc.)
_ A LOT of things happen before your computer is a computer
_ before the OS runs
_ before the screen works
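To put the two line counts in proportion, here is a quick calculation with the approximate figures from this slide (both are rounded, so the result is only a rough share):

```python
unique_loc = 600_000        # ~600k unique LOC (approximate, from the slide)
upstream_loc = 24_000_000   # ~24 million LOC taken from elsewhere

total_loc = unique_loc + upstream_loc
unique_share = unique_loc / total_loc
print(f"unique code: {unique_share:.1%} of {total_loc:,} LOC")
```

Only a few percent of the stack is project-specific code, which is the point of the slides that follow: maintain as little as possible yourself.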
66. How can they maintain this?
_ Only maintain what they HAVE TO
_ Take everything else from upstream
_ e.g. „Interface to choose the boot media“
_ Option 1: develop own interface,
own method to detect discs, PCI, ...
→ a lot of work!
_ Option 2: Linux is already there,
it has disc drivers, display drivers, ...
67. What they HAVE TO maintain
_ POWER specific
_ Hostboot
_ OCC
_ Skiboot (OPAL)
_ Generic
_ Petitboot
_ Op-build (build infrastructure) → buildroot → openwrt
_ Flash Manipulating Utilities
68. What they don't have to maintain
_ Linux (→ use upstream)
_ Userspace for Petitboot (→ taken from buildroot)
_ Build tooling (→ use buildroot)
_ Contributions go UPSTREAM first
_ local patch only when they HAVE TO
_ e.g. 8 patches on top of upstream Linux
(including „-openpower“ for the version string ;-)
69. Development process
_ Hostboot
_ github issues / pull requests
_ most dev done internally before chip exists
_ OCC
_ dev done internally (IBM)
_ Skiboot
_ mailing lists / github
_ Petitboot
_ mailing lists / github
_ op-build
_ github issues / pull requests
70. Interaction with UPSTREAM
_ Linux
_ Buildroot
_ Toolchain
_ POWER specific user space
_ other userspace components
86. [Figure: IPMI block diagram. Labels include: chassis board, motherboard, processor board, memory board, and redundant power board (each with FRU and temperature sensor); Baseboard Management Controller (BMC); system bus and system interface; non-volatile storage (SDR, SEL, FRU); chassis management (satellite controller); sensors & controls (fan sensor, temperature sensor, power control, reset control, …); private management busses; IPMB and auxiliary IPMB connector; ICMB bridge; serial controllers, serial port sharing, serial/modem interface, and serial connector; LAN interface, network (LAN) controller, and LAN connector; PCI management bus; remote management card (KVM over IP, ...). Annotations: "Access as root user" and "Access with username & password (UDP port 623)".]
90. _ update BMC firmware
_ use separate LAN – dedicated NIC or at least VLAN
_ deactivate unused services
_ use secure passwords AND usernames for IPMI
_ monitor with user rights ONLY
_ configure BMC firewall (if present)
_ EOL: flash BMC firmware or destroy mainboard
7 Tips for Tip 2 „Secure BMC“
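Several of these tips can be checked mechanically. The sketch below is a hypothetical audit over a hand-written description of a BMC; the config keys, the list of default usernames, and the function name are all assumptions for illustration and do not correspond to any real IPMI tool or vendor API.

```python
# Hypothetical default account names; real defaults vary by vendor.
DEFAULT_USERNAMES = {"admin", "administrator", "root"}


def audit_bmc(config: dict) -> list[str]:
    """Return findings for a BMC described by a hypothetical config dict."""
    findings = []
    if not config.get("dedicated_lan", False):
        findings.append("use a separate LAN (dedicated NIC or at least a VLAN)")
    required = set(config.get("required_services", []))
    for service in config.get("enabled_services", []):
        if service not in required:
            findings.append(f"deactivate unused service: {service}")
    for name in config.get("usernames", []):
        if name.lower() in DEFAULT_USERNAMES:
            findings.append(f"replace default username: {name}")
    if config.get("monitoring_privilege") != "user":
        findings.append("monitor with user rights only")
    # The slide says "if present", so only flag a firewall that exists but is off.
    if config.get("firewall_present") and not config.get("firewall_enabled"):
        findings.append("configure the BMC firewall")
    return findings


example = {
    "dedicated_lan": True,
    "required_services": ["ipmi"],
    "enabled_services": ["ipmi", "telnet"],
    "usernames": ["ADMIN", "monitoring"],
    "monitoring_privilege": "administrator",
    "firewall_present": False,
}

for finding in audit_bmc(example):
    print("-", finding)
```

Firmware updates and end-of-life handling are left out because they cannot be judged from a static description like this.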
93. 7 Tips for Tip 2 „Secure BMC“
https://www.thomas-krenn.com/de/tkmag/expertentipps/7-punkte-fuer-mehr-sicherheit-im-umgang-mit-ipmi/
94. OpenBMC – a Free Software BMC
_ REST-like API for automation
_ supports SSH to host console
_ runs „latest stable“ Linux Kernel:
_ Security fixes
_ current drivers
_ new Kernel functions
_ project started by Facebook
95. [Figure: OpenBMC software stack. Labels: systemd; Flash, LEDs, Sensors, Host Control, SSH; Python; Rocket; u-boot + Linux Kernel; Phosphor.]
Source: „OpenBMC: Boot your server with Python“, Joel Stanley, IBM OzLabs Adelaide, PyCon Australia 2016
https://2016.pycon-au.org/schedule/87/view_talk
https://www.youtube.com/watch?v=XrFaLnjOxQA
96. OpenBMC
_ Modern Kernel
_ Up-to-date Userspace
_ security patches
_ Make it „look good“
_ Make it „work well“ (REST, SSH)
98. Future plans
_ „Support the Server you have at Home“
_ Web Interface
_ Secure Boot, Trusted Boot, security features
(lock-down Userspace)
_ Upstream of all components (Linux, u-boot, ...)
_ More Hardware with OpenBMC support
_ Redfish Support (check my last year's talk ;-)
_ more projects → see Raptor Engineering
106. _ Open Source code on GitHub
_ You can compile firmware yourself
→ Fast updates in case of security issues
_ Open Source beginning from the 1st init up to DB
_ Direct contact to developers
109. Tombola: Win a Low Energy Server / SSD / Laptop bag
Drawing at 4:30 PM (last coffee break)