Introduction to .NET Performance Measurement


Published on

A two hour talk on performance measurement tools for .NET applications, including performance counters, Visual Studio profiler, and ETW with PerfView.

Published in: Technology
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Introduction to .NET Performance Measurement

  1. 1. © Copyright SELA software & Education Labs Ltd. 14-18 Baruch Hirsch St.Bnei Brak 51202 Israel
  2. 2. • Execution time – CPU time, wall-clock time, kernel time vs. user time • I/O requests – Number of disk operations, number of bytes transferred across the wire, number of files accessed • Database access – Sessions opened, transactions committed, grouping of execution time by SQL statement • OS and hardware – System calls, page faults, cache misses, TLB misses
  3. 3. • A set of numeric data exposed by Windows or by individual applications that can be sampled programmatically – Organized hierarchically into Categories, Instances, and Counters • Accessed using System.Diagnostics: – PerformanceCounter, PerformanceCounterCategory – Can expose your own counters as well • Read with the built-in Performance Monitor MMC snap-in (perfmon.exe)
  4. 4. • Supports managed and unmanaged code – Part of Visual Studio 2010/2012 Premium/Ultimate – Can be run stand-alone from the command-line • Operation modes: Sampling •CPU-bound apps, very low overhead •Full program stacks (including all system DLLs) •Tier interactions Instrumentation •I/O-bound apps, CPU-bound apps, higher overhead •More detailed timing data, limited stacks (just my code) Allocations •Details on who allocated and what •Managed code only Concurrency
  5. 5. • Periodically interrupt the application – Timer (default = 10,000,000 clock cycles) – Page faults – System calls – CPU performance counters (cache misses, branch mispredictions, etc.) • Walk the application’s stack – Record the frames, no symbol resolution yet – Very fast, small intrusion, little effect on profiled app
  6. 6. • Exclusive samples – Function was on the top of the stack – Function is doing a lot of individual work • Inclusive samples – Function was on the stack (but not the top) – Function causes a lot of work to be done Bar Foo Main Top +1 Inc +1 Exc +1 Inc +1 Inc
  7. 7. • Samples ≠ Time – Blocked functions don’t get samples – There may be statistical errors (an “evasive” function that never shows up during a sample) • Very long runs are not necessary – Long runs = more noise = less clarity • Make sure you have debugging symbols – Use the Microsoft symbol server,
  8. 8. • The profiler instruments the binary before it’s launched – Emits markers that record function execution times and counts – In other profilers, can work at the line-level as well – but very expensive void foo() { FUNC_ENTER(foo); // do some work CALL_ENTER(ExtCall); // call another function ExtCall(); CALL_EXIT(ExtCall); // do some more work FUNC_EXIT(foo); }
  9. 9. • More detailed performance data – Number of calls – Actual Time (probe overhead is subtracted) • Elapsed time – Raw time spent in the function (wall clock time) • Application time – Probes are marked when kernel transitions occur between two probes – That time is discounted in Application time
  10. 10. • Memory allocations incur a significant cost – The allocations are cheap, but the GC isn’t! – You won’t always see the cost at the source, because the allocating function runs quickly • Profiling an application for excessive allocations may be more important than CPU time – Another aspect is diagnosing memory leaks or sources of excess memory consumption
  11. 11. • Identifies the locations making the most allocations, and lists the types and allocation counts
  12. 12. • Analyze the application’s concurrency characteristics – CPU utilization – are all CPU cores active? – Thread migration between cores – Thread blocking patterns – why are threads blocked/unblocked, preempted, executing? – Resource contention – which threads are competing for the same resources? • In-depth analysis is very difficult – lots of information in a very short time
  13. 13. • Common Patterns for Poorly-Behaved Multithreaded Applications
  14. 14. • To get a quick result, an idea of where to focus • To analyze sources of cache misses, page faults, and other environmental factors • To profile a running process (e.g. Web server) that can’t be restarted easily Sampling • To get more accurate results, function call counts • To get wall-clock time information including block and wait timesInstrumentation • To get a general idea of CPU utilization and thread migration • To understand why threads are blocked and unblockedConcurrency
  15. 15. • xperf.exe: Command-line tool for ETW capturing and processing • xperfview.exe: Visual trace analysis tool • xbootmgr.exe: On/off transition state capture tool • PerfView.exe: ETW capture tool for managed apps • Works on Windows Vista SP1 and above
  16. 16. • Turn tracing on: xperf -on <PROVIDER> • Perform activities • Capture a log: xperf -d <LOG_FILE_NAME> • Analyze it: xperf <LOG_FILE_NAME>
  17. 17. Performance Counters Visual Studio Profiler Event Tracing for Windows