What is CPU profiling?
A profile is a set of statistics that
describes how often and for how long
various parts of the program executed.
See the output sample below.
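For example, a minimal sketch using the standard library's cProfile (the function and the timings shown are illustrative, not real measurements):

import cProfile

def work():
    total = 0
    for i in range(10_000):
        total += i * i
    return total

cProfile.run("work()")
# Prints a table along these lines (illustrative):
#          4 function calls in 0.002 seconds
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    0.002    0.002    0.002    0.002 example.py:3(work)
#    ...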
● Most profilers run inside your Python process.
● If you’re inside a Python program, you generally have pretty easy access to its stack.
● There are two types of profilers that differ in how they are triggered:
○ Tracing profilers - triggered on every function call or line executed
○ Sampling profilers - triggered on a time interval
● Python lets you specify a callback that gets run when various interpreter events (like “calling a function” or “executing a line of code”) happen.
● When the callback gets called, it records the stack for later analysis.
● You can set up that callback with:
○ PyEval_SetProfile - triggered only when a function is called
○ PyEval_SetTrace - triggered when a function is called or a
line of code is executed
● cProfile uses PyEval_SetProfile
● line_profiler uses PyEval_SetTrace
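At the Python level, sys.setprofile and sys.settrace are the wrappers around these two hooks. A minimal sketch of a tracing callback (the printed format is just for illustration):

import sys

def on_event(frame, event, arg):
    # The interpreter invokes this on 'call', 'return', 'c_call', ... events.
    if event == "call":
        code = frame.f_code
        print(f"call: {code.co_name} ({code.co_filename}:{frame.f_lineno})")

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

sys.setprofile(on_event)  # installed via PyEval_SetProfile under the hood
fib(3)
sys.setprofile(None)      # uninstall the callback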
The price of tracing
● The main disadvantage of tracing profilers implemented
in this way is that they introduce a fixed amount of
latency for every function call / line of code executed.
● See the example after this list.
● The documentation for cProfile says:
○ “The interpreted nature of Python tends to add so much
overhead to execution, that deterministic profiling tends to
only add small processing overhead in typical applications”
● Makes sense, since typical programs do not have that many function calls.
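A quick way to see that fixed per-event cost (a sketch; the actual numbers depend on your machine):

import cProfile
import time

def busy():
    # ~1M generator resumptions -> ~1M profiler events when traced
    return sum(i * i for i in range(1_000_000))

t0 = time.perf_counter()
busy()
plain = time.perf_counter() - t0

prof = cProfile.Profile()
t0 = time.perf_counter()
prof.runcall(busy)
traced = time.perf_counter() - t0

print(f"without profiler: {plain:.3f}s, with cProfile: {traced:.3f}s")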
Well – let’s say you want to get a snapshot of a program’s
stack 50 times a second. A way to do that is:
● Ask the Linux kernel to send you a signal every 20
milliseconds (using the setitimer system call)
● Register a signal handler to record the stack every time
you get a signal.
● When you’re done profiling, print the output!
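Putting those steps together, a minimal sketch (not the 61-LOC profiler referenced below, but the same idea; ITIMER_PROF/SIGPROF would sample CPU time instead of wall-clock time):

import collections
import signal
import traceback

stack_counts = collections.Counter()

def sample(signum, frame):
    # Record the interrupted stack as one semicolon-separated line
    # (the "folded" format, which is handy for flame graphs later).
    stack = traceback.extract_stack(frame)
    stack_counts[";".join(f.name for f in stack)] += 1

def start_profiling():
    signal.signal(signal.SIGALRM, sample)
    signal.setitimer(signal.ITIMER_REAL, 0.02, 0.02)  # fire every 20 ms

def stop_profiling():
    signal.setitimer(signal.ITIMER_REAL, 0, 0)
    for stack, count in stack_counts.most_common(10):
        print(count, stack)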
Comparison
● A sample profiler in 61 LOC
● A demo using it.
● A comparison of sampling vs. tracing
● Real Projects:
Flame graphs add a nice visual touch that helps in understanding the profiler output. See python-flamegraph and the FlameGraph project.
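For instance, the folded stacks collected by the sampler sketch above can be written out and rendered with Brendan Gregg's flamegraph.pl:

# Dump the samples in the "folded" format flamegraph.pl reads, then run:
#   flamegraph.pl profile.folded > profile.svg
with open("profile.folded", "w") as f:
    for stack, count in stack_counts.items():
        f.write(f"{stack} {count}\n")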
Considerations in design
● Consider performance at design time; not every business/API requirement can be answered as-is, and some compromises might need to be made.
● Do your best to understand the performance impact, but
no more (beware of analysis paralysis). Invest in a
testable, monitored environment instead.
After design and implementation, we can check performance:
● CI and test suite (see the sketch at the end of this section):
○ Expose speed degradations
○ Provide a base to run profiling on.
○ Notice! Test fakers might generate different data than the actual data in production.
● Staging environment:
○ Test scenarios on the same data as in production.
○ There might be some actions we won’t know to test that turn out to have significant performance issues in production.
● Application Performance Monitoring (APM) tools such as New Relic or Datadog allow you to:
○ Set alerts on certain metrics
○ Analyze real transactions
○ Add custom instrumentation
● Users are the best QA 😱
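For the CI bullet above, a sketch of a speed-regression test, assuming the suite uses pytest with the pytest-benchmark plugin (myapp.expensive_query is a hypothetical function under test):

# test_perf.py
# Compare against a saved baseline and fail the build on regression with:
#   pytest --benchmark-compare --benchmark-compare-fail=mean:10%
from myapp import expensive_query  # hypothetical function under test

def test_expensive_query_speed(benchmark):
    result = benchmark(expensive_query)  # runs it repeatedly, records stats
    assert result is not None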