Your SlideShare is downloading. ×
0
PmcTools: Whole-system, low-overhead             performance measurement in FreeBSD                                   Jose...
Outline1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      API      Design Issues    ...
IntroductionGoals Of This Talk      Introduce FreeBSD/PmcTools.      Introduce BSD culture.Joseph Koshy (FreeBSD Developer...
IntroductionAbout FreeBSD      http://www.freebsd.org/      Popular among appliance makers, ISPs, web hosting providers:  ...
IntroductionAbout jkoshy@FreeBSD.org      FreeBSD developer since 1998.      Technical interests:              Performance...
IntroductionThe Three Big Questions in Performance Analysis  1   What is the system doing?  2   Where in the code does the...
IntroductionQuestion 1: What is the system doing?      System behaviour:              Traditional UNIX tools: vmstat, iost...
IntroductionQuestion 2: Which portion of the system isresponsible?      Which subsystems are involved?      Where specifica...
Introduction   Introducing PmcTools1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview    ...
Introduction   Introducing PmcToolsPmcTools Project Goals                                                Low−overheads    ...
Introduction   Introducing PmcToolsPerformance Analysis: Conventional vs. PmcTools  Description                           ...
Introduction   Introducing PmcToolsRelated open-source projects         Linux Many projects related to PMCs:              ...
PmcTools   Architectural Overview1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      ...
PmcTools                   Architectural OverviewOverview: Architecture                                                   ...
PmcTools   Architectural OverviewComponents      hwpmc kernel bitskernel changes (see later)       libpmc application APIp...
PmcTools    Architectural OverviewPMC Scopes                                                                          #0  ...
PmcTools   Architectural OverviewCounting vs. Sampling                       Process−scope,                    System−scop...
PmcTools      Architectural OverviewUsing system-scope PMCs                                                    process    ...
PmcTools       Architectural OverviewUsing process-scope PMCs                                                  owner      ...
PmcTools   API1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      API      Design Iss...
PmcTools   APIAPI OverviewCategories:      Administration (2).      Convenience Functions (8).      Initialization (1).   ...
PmcTools     APIExample: API Usage                                                     attach                             ...
PmcTools   Design Issues1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      API      ...
PmcTools   Design IssuesPMCs Vary A Lot    AMD Athlon XP                  4 PMCs, 48 bits wide.    AMD Athlon64           ...
PmcTools   Design IssuesAPI Design IssuesIssues:      Designing an extensible programming interface for application use.  ...
PmcTools   Profiling1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      API      Desig...
PmcTools   ProfilingConventional Statistical Profiling      Needs specially compiled binaries (cc -pg).      Sampling runs o...
PmcTools   ProfilingPmcTools’ Statistical Profiling      Sets up PMCs to interrupt the CPU on overflow.      Uses an NMI to d...
PmcTools         ProfilingProfiling with NMIs                                 HWPMC(4)                                      ...
PmcTools       ProfilingProfiling Workflow             System/application             under investigation               hwpmc...
PmcTools        ProfilingProfiling of Shared Objects                 0                                                      ...
PmcTools      ProfilingRemote Profiling                                          log data                        system unde...
PmcTools   Implementation1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      API     ...
PmcTools   ImplementationImplementation Information      Module                                     Comments      sys/dev/...
PmcTools   ImplementationImpact on Base KernelSpace Requirements 2 bits (P HWPMC, TDP CALLCHAIN). Uses free           bits...
PmcTools   ImplementationPortabilityApplication Portability High.Portability of libpmc Moderate. Requires a POSIX-like sys...
Status & Future Work   Status1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      API ...
Status & Future Work   StatusCurrent State      Proof-of-concept application pmcstat is the current “user      interface”....
Status & Future Work   StatusSupport load      Volunteer project. Initial hardware bought from pocket, or on loan.      Cu...
Status & Future Work   Future Projects1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview ...
Status & Future Work   Future ProjectsProfiling the Cloud                                                                  ...
Status & Future Work   Future ProjectsOther project ideas      A graphical visualizer “console”.      Enhance gprof, or wr...
Status & Future Work   Research Topics1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview ...
Status & Future Work    Research TopicsProfile guided system layout                                       applications     ...
Status & Future Work   Research TopicsDetection of SMP data structure layout bugsstruct shared  ...  char sh foo;  int sh ...
Status & Future Work   Research TopicsProfiling for power use      What part of the system consumes power?      Where in th...
Conclusion1   Introduction       Introducing PmcTools2   PmcTools      Architectural Overview      API      Design Issues ...
ConclusionTalk Summary      FreeBSD/PmcTools was introduced.      The design & implementation of PmcTools was looked at.  ...
Upcoming SlideShare
Loading in...5
×

PmcTools: Whole-System, Low-Overhead Performance Measurement in FreeBSD

334

Published on

An overview of the PmcTools toolkit in FreeBSD, presented at ACM Bangalore.
The presentation briefly touches upon: the motivations and goals of the project, the programming APIs, some aspects of the implementation and on possible future work.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
334
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
3
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Transcript of "PmcTools: Whole-System, Low-Overhead Performance Measurement in FreeBSD"

  1. 1. PmcTools: Whole-system, low-overhead performance measurement in FreeBSD Joseph Koshy FreeBSD Developer 18 April 2009Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 1 / 48
  2. 2. Outline1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 2 / 48
  3. 3. IntroductionGoals Of This Talk Introduce FreeBSD/PmcTools. Introduce BSD culture.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 3 / 48
  4. 4. IntroductionAbout FreeBSD http://www.freebsd.org/ Popular among appliance makers, ISPs, web hosting providers: Fast, stable, high-quality code, liberal license. FreeBSD culture in one sentence: “Shut up and code”.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 4 / 48
  5. 5. IntroductionAbout jkoshy@FreeBSD.org FreeBSD developer since 1998. Technical interests: Performance analysis; the design of high performance software. Low power computing. Higher order, typed languages. Writing clean, well-designed code.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 5 / 48
  6. 6. IntroductionThe Three Big Questions in Performance Analysis 1 What is the system doing? 2 Where in the code does the behaviour arise? 3 What is to be done about it?Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 6 / 48
  7. 7. IntroductionQuestion 1: What is the system doing? System behaviour: Traditional UNIX tools: vmstat, iostat, top, systat, ktrace, truss. Counters under the sysctl hierarchy. Compile time options such as LOCK PROFILING. New tools like dtrace. Machine behaviour: Modern CPUs have in-CPU hardware counters measuring hardware behaviour: bus utilization, cache operations, instructions decoded and executed, branch behaviour, floating point and vector operations, speculative execution, . . . Near zero overheads, good precision.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 7 / 48
  8. 8. IntroductionQuestion 2: Which portion of the system isresponsible? Which subsystems are involved? Where specifically in the code is the problem arising? System performance is a “global” property. “Local” inspection of code not always sufficient. As a community we are still exploring the domain of performance analysis tools: Which data to collect. Collecting it with low-overheads. Making sense of the information collected.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 8 / 48
  9. 9. Introduction Introducing PmcTools1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 9 / 48
  10. 10. Introduction Introducing PmcToolsPmcTools Project Goals Low−overheads Whole−system Measurements FreeBSD Tier−1 Architectures PmcTools Use in production Platform for Have Fun! Interesting toolsJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 10 / 48
  11. 11. Introduction Introducing PmcToolsPerformance Analysis: Conventional vs. PmcTools Description Conventional PmcTools Need special binaries Yes No Dynamically loaded objects No Yes Profiling scope Executable Process & System Need restart Yes No Measurement overheads High Low Profiling tick Time Many options Profile inside critical sections No Yes (x86) Cross-architecture analysis No Yes Distributed profiling No Yes Production use No YesJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 11 / 48
  12. 12. Introduction Introducing PmcToolsRelated open-source projects Linux Many projects related to PMCs: Oprofile: http://oprofile.sourceforge.net/ Perfmon: http://perfmon2.sourceforge.net/ Perfctr: http://sourceforge.net/projects/perfctr/ Rabbit: http://www.scl.ameslab.gov/Projects/Rabbit/ Solaris CPC(3) library. NetBSD A pmc(3) API.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 12 / 48
  13. 13. PmcTools Architectural Overview1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 13 / 48
  14. 14. PmcTools Architectural OverviewOverview: Architecture process process process log libpmc kernel hwpmc CPU CPU CPU CPU CPU CPU A platform to build tools that use PMC data.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 14 / 48
  15. 15. PmcTools Architectural OverviewComponents hwpmc kernel bitskernel changes (see later) libpmc application APIpmccontrol management tool pmcstat proof-of-concept applicationpmcannotate contributed tool etc... others in the futureJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 15 / 48
  16. 16. PmcTools Architectural OverviewPMC Scopes #0 #1 #2 #3 #4 CPU 0 CPU 1 CPU 2 CPU 3 Process−scope PMC System−scope PMC1 process-scope PMC & 2 system-scope PMCs simultaneously active.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 16 / 48
  17. 17. PmcTools Architectural OverviewCounting vs. Sampling Process−scope, System−scope, Counting Counting Process−scope, System−scope, Sampling SamplingJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 17 / 48
  18. 18. PmcTools Architectural OverviewUsing system-scope PMCs process process kernel Three system scope PMCs, on three CPUs. Measure behaviour of the system as a whole.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 18 / 48
  19. 19. PmcTools Architectural OverviewUsing process-scope PMCs owner target 2: attach 1: allocate kernel A process-scope PMC is allocated & attached to a target process. Entire “row” of PMCs reserved across CPUs.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 19 / 48
  20. 20. PmcTools API1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 20 / 48
  21. 21. PmcTools APIAPI OverviewCategories: Administration (2). Convenience Functions (8). Initialization (1). Log file handling (3). PMC Management (10). Queries (7). Arch-specific functions (1).32 functions documented in 15 manual pages. 10 manual pages forsupported hardware events.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 21 / 48
  22. 22. PmcTools APIExample: API Usage attach allocate start release stop read pmc allocate() Allocate a PMC; returns a handle. pmc attach() Attach a PMC to a target process. pmc start() Start a PMC. pmc read() Read PMC values. pmc stop() Stop a PMC. pmc release() Release resources.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 22 / 48
  23. 23. PmcTools Design Issues1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 23 / 48
  24. 24. PmcTools Design IssuesPMCs Vary A Lot AMD Athlon XP 4 PMCs, 48 bits wide. AMD Athlon64 4 PMCs, Different set of hardware events. Intel Pentium MMX 2 PMCs. 40 bits wide. Counting only. Intel Pentium Pro 2 PMCs, 40 bits for reads, 32 bits for writes. Intel Pentium IV 18 PMCs shared across logical CPUs. En- tirely different programming model. Intel Core Number of PMCs and widths vary. Has programmable & fixed-function PMCs. Intel Core/i7 As above, but also has per-package PMCs. PMCs are closely tied to CPU micro-architecture. PMC capabilities, supported events, access methods, programming constraints can vary across CPU generations.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 24 / 48
  25. 25. PmcTools Design IssuesAPI Design IssuesIssues: Designing an extensible programming interface for application use. Allowing knowledgeable applications to make full use of hardware.PmcTools philosophy: Make simple things easy to do. Make complex things possible.Current “UI” uses name=value pairs:% pmcstat -p k8-bu-fill-request-l2-miss, mask=dc-fill+ic-fill,usrJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 25 / 48
  26. 26. PmcTools Profiling1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 26 / 48
  27. 27. PmcTools ProfilingConventional Statistical Profiling Needs specially compiled binaries (cc -pg). Sampling runs off the clock tick. Cannot profile inside kernel critical sections. “In-place” record keeping. Call graph is approximated.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 27 / 48
  28. 28. PmcTools ProfilingPmcTools’ Statistical Profiling Sets up PMCs to interrupt the CPU on overflow. Uses an NMI to drive sampling (on x86): Can profile inside kernel critical sections. Needs lock-free implementation techniques. Separates record keeping from data collection. Captures the exact callchain at the point of the sample.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 28 / 48
  29. 29. PmcTools ProfilingProfiling with NMIs HWPMC(4) hwpmc_hook() machine dependent trap handler trap() low level trap handler log file NMI #0 #1 #2 #3 hardclock() hwpmc helper thread record sample lock−less ring per−owner CPU 0 CPU 1 CPU 2 CPU 3 buffers log bufferJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 29 / 48
  30. 30. PmcTools ProfilingProfiling Workflow System/application under investigation hwpmc(4) logfile pmcstat pmcstat gprof(1) data gprof gprof(1) output Uses gprof(1) to do user reports (currently): Needs to be redone: gprof(1) limitations. Call chains are captured and used to generate call graphs.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 30 / 48
  31. 31. PmcTools ProfilingProfiling of Shared Objects 0 0xFFFFFFFF text data bss heap mmap stack Kernel unmapped rtld command line arguments data data data text text Each executable object in the system gets its own gmon.out file: kernel kernel modules executables shared libraries run time loaderJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 31 / 48
  32. 32. PmcTools ProfilingRemote Profiling log data system under analysis measurement console pmc configure log() takes a file descriptor. Can log to a disk file, a pipe, or to a network socket. Events in log file carry timestamps for disambiguation.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 32 / 48
  33. 33. PmcTools Implementation1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 33 / 48
  34. 34. PmcTools ImplementationImplementation Information Module Comments sys/dev/hwpmc, 31K LoC, i386&amd64 sys/sys/pmc*.h lib/libpmc 3.3K LoC usr.sbin/* 5.4K LoC documentation 29 manual pages, 11K LoD All public APIs have manual pages. All hardware events, and their modifiers are documented. The internal API between libpmc and hwpmc is also documented.See also: “Beautiful Code Exists, If You Know Where To Look”, CACM,July 2008.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 34 / 48
  35. 35. PmcTools ImplementationImpact on Base KernelSpace Requirements 2 bits (P HWPMC, TDP CALLCHAIN). Uses free bits in existing flags words.Kernel Changes Clock handling, kernel linker, MD code, process handling, scheduler, VM (options HWPMC HOOKS).Kernel Callbacks scheduler low−level (assembler) kernel linker trap() VM (mmap) hwpmc exec clock handlingJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 35 / 48
  36. 36. PmcTools ImplementationPortabilityApplication Portability High.Portability of libpmc Moderate. Requires a POSIX-like system.Adding support for new PMC hardware Moderate. Kernel bits Low.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 36 / 48
  37. 37. Status & Future Work Status1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 37 / 48
  38. 38. Status & Future Work StatusCurrent State Proof-of-concept application pmcstat is the current “user interface”. Crufty. Low overheads (design goal: 5%) and tunable. In production use. Being shipped on customer boxes by appliance vendors. Support load on the rise (esp. requests for support of new hardware).Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 38 / 48
  39. 39. Status & Future Work StatusSupport load Volunteer project. Initial hardware bought from pocket, or on loan. Current hardware support: 12 combinations of {PMC hardware × 32/64 bit OS variants} × 9 OS versions [FreeBSD 6.0 · · · 6.4, 7.0 · · · 7.2, 8.0] = 108 combinations! Need a hardware lab to manage testing and bug reports. Need an automated test suite that is run continuously. Also useful for detecting OS & application performance regressions early. Email support load is on the rise: Rise in FreeBSD adoption. FreeBSD users and developers worldwide now chipping in with features, bug fixes, offering tutorials and spreading the word.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 39 / 48
  40. 40. Status & Future Work Future Projects1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 40 / 48
  41. 41. Status & Future Work Future ProjectsProfiling the Cloud traffic Control/Data Analysis ConsoleJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 41 / 48
  42. 42. Status & Future Work Future ProjectsOther project ideas A graphical visualizer “console”. Enhance gprof, or write a report generator afresh. Link up with existing profile based optimization frameworks. Allow performance analysis of non-native architectures. Support non-x86 PMCs. Integrate PmcTools and DTrace. Port to other BSDs and/or OpenSolaris.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 42 / 48
  43. 43. Status & Future Work Research Topics1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 43 / 48
  44. 44. Status & Future Work Research TopicsProfile guided system layout applications kernel L1/L2/L3 Cache Lay out the whole system to help “hot” portions remain in cache. Would require an augmented toolchain (http://elftoolchain.sourceforge.net/) & enhancements to hwpmc(4). Useful for low end devices using direct-mapped caches.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 44 / 48
  45. 45. Status & Future Work Research TopicsDetection of SMP data structure layout bugsstruct shared ... char sh foo; int sh bar; char sh buzz; ...; Would use a combination of static analysis & hwpmc(4) data. Detection of the poor cache line layout behaviour. Cache line ping-ponging between CPUs.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 45 / 48
  46. 46. Status & Future Work Research TopicsProfiling for power use What part of the system consumes power? Where in the code is power being spent?Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 46 / 48
  47. 47. Conclusion1 Introduction Introducing PmcTools2 PmcTools Architectural Overview API Design Issues Profiling Implementation3 Status & Future Work Status Future Projects Research Topics4 ConclusionJoseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 47 / 48
  48. 48. ConclusionTalk Summary FreeBSD/PmcTools was introduced. The design & implementation of PmcTools was looked at. Possible future development and research directions for the project were touched upon.Joseph Koshy (FreeBSD Developer) FreeBSD/PmcTools 18 April 2009 48 / 48
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×