Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

.NET Fest 2018. Sasha Goldshtein. Profiling and Debugging .NET Core Apps on Linux


Published on

OK, great, you can now run your favorite ASP.NET app or console application on Linux using .NET Core. What happens next? In fact, what happens straight after deployment, when you discover memory leaks, surprising crashes, performance problems, and other woes -- in the production environment? The tools and profilers you're used to from Windows don't work on Linux, and a lot of them don't have easy-to-use alternatives. In this talk, we will review the current landscape of profiling and debugging .NET Core apps on Linux. We will talk about performance investigations using `perf`, the Linux performance multi-tool, which can collect stack traces of CPU samples and other interesting events from .NET Core applications. We will review LTTng and how it's used by .NET Core as a replacement for Windows ETW events, and see how to collect LTTng traces efficiently and extract useful insights from them. We will also talk about core dumps, how to generate them on Linux, and how to get .NET-specific information out of a core dump using lldb and SOS. Prepare for a bumpy ride of half-baked tools and command-line magic, with -- hopefully -- a rewarding end!

Published in: Education
  • Be the first to comment

  • Be the first to like this

.NET Fest 2018. Sasha Goldshtein. Profiling and Debugging .NET Core Apps on Linux

  1. 1. Debugging And Profiling .NET Core Apps on Linux Sasha Goldshtein @goldshtn
  2. 2. The Plan • This is a talk on debugging and profiling .NET Core apps on Linux— yes, it’s pretty crazy that we got that far! • You’ll learn: To profile CPU activity in .NET Core apps To visualize stack traces (e.g. of CPU samples) using flame graphs To use Linux tracing tools with .NET Core processes To capture .NET Core runtime events using LTTng To generate and analyze core dumps of .NET Core apps
  3. 3. 🎓 Disclaimer • A lot of this stuff has changed, and is changing • The tools described here mostly work with .NET Core 2.x, but your mileage may vary • Some of this relies on scripts I hacked together, and will hopefully* be officially supported in the future *I’ve been saying this for almost two years now
  4. 4. Tools And Operating Systems Supported Linux Windows macOS CPU sampling perf, BCC ETW Instruments, dtrace Dynamic tracing perf, SystemTap, BCC ⛔️ dtrace Static tracing LTTng ETW ⛔️ Dump generation core_pattern, gcore Procdump, WER kern.corefile, gcore Dump analysis lldb Visual Studio, WinDbg lldb This talk
  5. 5. ⚠️ Mind The Overhead • Any observation can change the state of the system, but some observations are worse than others • Diagnostic tools have overhead • Check the docs • Try on a test system first • Measure degradation introduced by the tool OVERHEAD This traces various kernel page cache functions and maintains in-kernel counts, which are asynchronously copied to user-space. While the rate of operations can be very high (>1G/sec) we can have up to 34% overhead, this is still a relatively efficient way to trace these events, and so the overhead is expected to be small for normal workloads. Measure in a test environment. —man cachestat (from BCC)
  6. 6. Sampling vs. Tracing • Sampling works by getting a snapshot or a call stack every N occurrences of an interesting event • For most events, implemented in the PMU using overflow counters and interrupts • Tracing works by getting a message or a call stack at every occurrence of an interesting event CPU timepid 121 pid 121 pid 408 pid 188 system timepid 121 pid 408 CPU sample disk write
  7. 7. .NET Core on Linux Tracing Architecture kernel user .NET Core app CoreCLR OS libraries CPU EventSource events CLR events (LTTng) Probes, USDT Tracepoints (scheduler, block I/O, networking) “Software events” (core migrations, page faults) PMU events (clock cycles, branches, LLC misses)
  8. 8. The Official Story: perfcollect and PerfView 1. Download perfcollect 2. Install prerequisites: ./perfcollect install 3. Run collection: ./perfcollect collect mytrace 4. Copy the file to a Windows machine 😳 5. Download PerfView 😁 6. Open the trace in PerfView 😱
  9. 9. perf • perf is a Linux multi-tool for performance investigations • Capable of both tracing and sampling • Developed in the kernel tree, must match running kernel’s version • Debian-based: apt install linux-tools-common • RedHat-based: yum install perf
  10. 10. perf_events Architecture UserKernel tcp_msgsend Page fault LLC miss sched_switch perf_events libc:malloc mmap buffer mmap buffer Real-time analysis perf script… file
  11. 11. Five Things That Will Happen To You If You Don’t Have Symbolic Debug Information The stuff we care about
  12. 12. Getting Debug Information Type Debug information source SyS_write Kernel /proc/kallsyms __write Native Debuginfo package System.IO.SyncText… Managed (AOT) Crossgen or COMPlus_ZapDisable=1 System.Console.WriteLine Managed (AOT) Crossgen or COMPlus_ZapDisable=1 MyApp.Program.Foo Managed (JIT) /tmp/perf-$ MyApp.Program.Main Managed (JIT) /tmp/perf-$ ExecuteAssembly Native (CLR) dotnet symbols or source build CorExeMain Native (CLR) dotnet symbols or source build __libc_start_main Native Debuginfo package
  13. 13. 🔥 Flame Graphs • A visualization method (adjacency graph), very useful for stack traces, invented by Brendan Gregg • • Turns thousands of stack trace pages into a single interactive graph • Example scenarios: • Identify CPU hotspots on the system/application • Show stacks that perform heavy disk accesses • Find threads that block for a long time and the stack where they do it
  14. 14. Reading a Flame Graph • Each rectangle is a function • Y-axis: caller-callee • X-axis: sorted stacks (not time) • Wider frames are more common • Supports zoom, find • Filter with grep 😎
  15. 15. $ COMPlus_PerfMapEnabled=1 ./myapp # perf record -g -p $(pidof Buggy) -F 97
  16. 16. # perf report
  17. 17. # perf script | | > flame.svg
  18. 18. The Old Way And The New Way: BPF kernel u{,ret}probe: malloc perf_events user perf | awk | … report bytes # distribution 0 - 1 … |@@@@ | 1 – 2 … |@ | 2 - 4 … |@@@@@@@@ | kernel u{,ret}probe: malloc BPF program BPF map user control program report bytes # distribution 0 - 1 … |@@@@ | 1 – 2 … |@ | 2 - 4 … |@@@@@@@@ | user libc:malloc user libc:malloc
  19. 19. The BCC BPF Front-End • • BPF Compiler Collection (BCC) is a BPF frontend library and a massive collection of performance tools • Contributors from Facebook, PLUMgrid, Netflix, Sela • Helps build BPF-based tools in high- level languages • Python, Lua, C++ • Up-and-coming competitor: bpftrace kernel user BCC tool BCC tool … BCC compiler frontend Clang + LLVM BCC loader library BPF runtime event sources
  20. 20. Managed runtimes Syscall interface Block I/O Ethernet Scheduler Mem Device drivers Filesystem TCP/IP CPU Applications System libraries profile llcstat hardirqs softirqs ttysnoop runqlat cpudist offcputime offwaketime cpuunclaimed memleak oomkill slabratetoptcptop tcplife tcpconnect tcpaccept biotop biolatency biosnoop bitesize filetop filelife fileslower vfscount vfsstat cachestat cachetop mountsnoop *fsslower *fsdist dcstat dcsnoop mdflush execsnoop opensnoop killsnoop statsnoop syncsnoop setuidsnoop mysqld_qslower bashreadline dbslower dbstat mysqlsniff memleak sslsniff gethostlatency deadlock_detector ustat ugc uthreads ucalls uflow argdist trace funccount funclatency stackcount
  21. 21. # syscount -c -p $(pidof Buggy) # opensnoop -p $(pidof Buggy)
  22. 22. # funccount -p $(pidof Buggy) …/*GarbageCollect*
  23. 23. # stackcount -p $(pidof Buggy) …/*GarbageCollect*
  24. 24. LTTng Architecture UserKernel SyS_write SyS_read LLTng kernel module myapp:myprobe buffer lttng --live babeltraceCTF file buffer
  25. 25. $ COMPlus_EnableEventLog=1 ./myapp # lttng create my-trace # lttng enable-event --userspace --tracepoint DotNetRuntime:* # lttng start
  26. 26. Plotting data from GCStats events, original work by Aleksei Vereshchagin at DotNext Saint Petersburg
  27. 27. 🚮 Core Dumps • A core dump is a memory snapshot of a running process • Can be generated on crash or on demand dump file
  28. 28. Generating Core Dumps • /proc/sys/kernel/core_pattern configures the core file name or application to process the crash • ulimit -c controls maximum core file size (often 0 by default) • gcore (part of gdb) can create a core dump on demand • CoreCLR ships with a createdump utility • There’s also Procdump for Linux 🙀😱
  29. 29. Analyzing .NET Core Dumps $ lldb /usr/bin/dotnet -c core.1788 (lldb) bt The stuff we care about
  30. 30. (lldb) plugin load …/ (lldb) setclrpath … DumpObj DumpArray DumpHeap GCRoot PrintException Threads ClrStack EEHeap DumpDomain DumpAssembly
  31. 31. Checklist: Preparing Your Environment export COMPlus_PerfMapEnabled=1 AOT perf map with crossgen, or COMPlus_ZapDisable=1 Debuginfo package for libcoreclr, libc Install perf/BCC tools export COMPlus_EnableEventLog=1 ulimit -c unlimited (or managed by system) Install lldb-3.x
  32. 32. Summary • We have learned: To profile CPU activity in .NET Core apps To visualize stack traces (e.g. of CPU samples) using flame graphs To use Linux tracing tools with .NET Core processes To capture .NET Core runtime events using LTTng To generate and analyze core dumps of .NET Core apps
  33. 33. References • perf and flame graphs • • • • .NET Core diagnostics docs • cumentation/project-docs/linux-performance- • cumentation/building/ • My blog posts • yzing-a-net-core-core-dump-on-linux/ • ling-a-net-core-application-on-linux/ • ng-runtime-events-in-net-core-on-linux/ • BCC tutorials • • •
  34. 34. Thank You! @goldshtn