Performance engineering for children and grownups
Alex Chistyakov, Principal Engineer @ Git in Sky
23.04.2016
Piter Py
Who is Mr. Chistyakov?
● Hi, I’m Alex
● I’m a Go programmer...
● ...and became a Mac user recently
● So you may wonder what I am doing here
● Don’t know for sure, maybe just IT
● I realized I never used Cthulhu in my slides => challenge accepted!
A word on Cthulhu
● Cthulhu is horrendous, but...
● ...an operations engineer’s most dreaded nightmare is not Cthulhu...
● ...but a phone call at 3:40 AM
● “The app is down!”
● Enough on Cthulhu then
So we need a plan
● Blame Python!
● Seriously, just use Go
● Python is slow as a snail
● It has the GIL, etc.
● ...
● Looks like a plan!
● Measure then optimize!
Measure then optimize!
● If you know Latin, please translate this for me and I will get it as a tattoo
● Okay, we probably know how to optimize
● (Just rewrite everything)
● But what to measure?
● And how to measure in a production environment?
The basic idea seems to be quite old
● http://poormansprofiler.org/
● 1) Periodically take CPU stack traces
● 2) Aggregate raw samples somehow
● 3) Analyze
● 4) ???????
● 5) PROFIT!
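Step 2 ("aggregate raw samples somehow") can be sketched in a few lines. This is only an illustration, not the poor man's profiler itself: the sample data and the function names aggregate and to_folded are made up here, and each sample is assumed to already be a list of frame names ordered from root to leaf.

```python
from collections import Counter


def aggregate(samples):
    """Count identical stacks; each sample is a list of frame names, root first."""
    return Counter(";".join(stack) for stack in samples)


def to_folded(counts):
    """Render counts as 'frame;frame;frame N' lines, most frequent first."""
    return [f"{stack} {n}" for stack, n in counts.most_common()]


# Hypothetical raw samples, as if taken periodically from a running process.
samples = [
    ["main", "poll", "select"],
    ["main", "poll", "select"],
    ["main", "handle", "parse"],
]

folded = to_folded(aggregate(samples))
print("\n".join(folded))
```

The "one stack per line, followed by a count" layout is the folded format that the flamegraph tooling consumes, so the aggregated result can be passed on without further conversion.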
Let’s profile something!
● We are in 2016, so Docker is inevitable
● https://hub.docker.com/r/gitinsky/deluge/
● Deluge is a BitTorrent client with a web interface
● (I use BitTorrent for educational purposes only)
● So I have 282 torrents in the list, and most of them are inactive
WTF did we just see?
● Respect the screen resolution of a Mac!
● But even a Retina display is not enough
● Frankly speaking, PMP (poor man’s profiler) results are a mess
Meet flamegraphs!
● http://www.brendangregg.com/flamegraphs.html
● https://github.com/brendangregg/FlameGraph
● https://github.com/brendangregg/FlameGraph/blob/master/stackcollapse-gdb.pl
● (The last one is in Perl, sorry)
● We can pipe PMP output into this script
Nice try, but WTF again?
● Color intensity means nothing (colors are used just to differentiate bars)
● Bar length represents the total number of samples of that kind
● Stack roots are at the bottom of the Y axis, stack leaves are at the top
● There is a common stack root (imaginary in this case)
● Time to analyze!
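The reading rules above can be checked with a small calculation. The sketch below is not how the FlameGraph scripts are implemented; it only assumes folded input lines of the form "frame;frame N" and uses made-up frame names. It adds an imaginary common root called "all" and sums, for every bar, the samples that pass through it.

```python
from collections import Counter


def bar_widths(folded_lines, root="all"):
    """Map each stack prefix (one bar in the graph) to its sample count."""
    widths = Counter()
    for line in folded_lines:
        stack, _, count = line.rpartition(" ")
        frames = [root] + stack.split(";")
        # Every prefix of the stack is a bar; a sample widens all of them.
        for depth in range(1, len(frames) + 1):
            widths[tuple(frames[:depth])] += int(count)
    return widths


# Hypothetical folded samples: two Python-side stacks and one native thread.
folded = [
    "main;poll;select 2",
    "main;handle;parse 1",
    "start_thread;io_service 3",
]

widths = bar_widths(folded)
total = widths[("all",)]
for path, n in sorted(widths.items()):
    indent = "  " * (len(path) - 1)
    print(f"{indent}{path[-1]}: {n}/{total} samples")
```

A bar is as wide as the share of samples that contain it at that depth, so the X axis shows sample counts, not the passage of time.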
Analyzing like a CIA agent!
● Left to right:
  ● Python interpreter
  ● C++, boost
  ● C++, libtorrent
● But we did not start anything except the Python interpreter!
● clone() → start_thread() → all that C++ stuff
Captain Unobvious to the rescue
● What does our non-C++ part do?
● Not much:
  ● PyRun_FileExFlags
  ● PyEval_EvalCode
  ● PyEval_EvalCodeEx
  ● And so on
● What else did you expect from an interpreter?
One does not simply sample Python w/gdb
● Wait, we were going to sample Python code?
● Okay, what if we sample from inside the interpreter, not from outside of it?
● The basic idea is to use a signal handler or a separate thread to periodically collect stack traces right from the Python program
● A seasoned Pythonista writes this code in 20 to 40 mins, but someone did that already
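For the curious, here is roughly what the "separate thread" variant could look like. It is a minimal sketch, not the code of any existing profiler: the Sampler class, its interval parameter and the busy_work function are invented for illustration. It relies on sys._current_frames(), a CPython-specific call that returns the topmost frame of every thread.

```python
import sys
import threading
import time
from collections import Counter


class Sampler(threading.Thread):
    """Hypothetical in-process sampler: collects folded stacks of other threads."""

    def __init__(self, interval=0.01):
        super().__init__(daemon=True)
        self.interval = interval
        self.counts = Counter()
        self._stop_event = threading.Event()

    def run(self):
        own_id = threading.get_ident()
        # wait() returns False on timeout, so the loop runs once per interval.
        while not self._stop_event.wait(self.interval):
            for thread_id, frame in sys._current_frames().items():
                if thread_id == own_id:
                    continue  # do not profile the profiler
                names = []
                while frame is not None:
                    names.append(frame.f_code.co_name)
                    frame = frame.f_back
                # Frames were walked leaf to root; store them root first.
                self.counts[";".join(reversed(names))] += 1

    def stop(self):
        self._stop_event.set()
        self.join()


def busy_work(deadline):
    """Made-up workload that keeps the main thread busy until the deadline."""
    total = 0
    while time.time() < deadline:
        total += sum(range(1000))
    return total


sampler = Sampler(interval=0.005)
sampler.start()
busy_work(time.time() + 0.2)
sampler.stop()

for stack, n in sampler.counts.most_common(3):
    print(n, stack)
```

A signal handler (for example one driven by an interval timer) is the other option named above; the thread version is shown only because it needs no platform-specific setup.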
Okay, next try
● https://github.com/evanhempel/python-flamegraph
● Works as a module
● Starts a separate thread that collects stack frames
● Wraps the script in question
● Writes the result to a file at the end of the run
All hail the glory of Python!
● What if the script in question is a daemon? (fork() then exit(), you know)
● What if the script being profiled does not intend to terminate?
● I fixed both problems in 20 mins and I’m not even a Pythonista!*
* my slides can contain blatant lies
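The slides do not show the actual fix, so the following is only one possible approach to the "never terminates" problem: write the collected counts to disk periodically instead of once at exit. All names here (dump_counts, start_periodic_dump, the output file name) are hypothetical, and the counts are made-up sample data.

```python
import os
import tempfile
import threading
import time
from collections import Counter


def dump_counts(counts, path):
    """Write folded stacks atomically so a reader never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as out:
        # Copy first: another thread may still be adding samples.
        for stack, n in dict(counts).items():
            out.write(f"{stack} {n}\n")
    os.replace(tmp_path, path)


def start_periodic_dump(counts, path, period, stop_event):
    """Start a background thread that dumps counts every `period` seconds."""

    def loop():
        while not stop_event.wait(period):
            dump_counts(counts, path)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


# Hypothetical data standing in for what a sampler would have collected.
counts = Counter({"main;doPoll": 5})
stop = threading.Event()
out_path = os.path.join(tempfile.mkdtemp(), "profile.folded")

worker = start_periodic_dump(counts, out_path, period=0.05, stop_event=stop)
time.sleep(0.2)
stop.set()
worker.join()

with open(out_path) as result:
    print(result.read())
```

The daemon case is a separate issue: threads do not survive fork(), so a sampler started before the fork is gone in the child. One way to handle that, again as an assumption rather than the author's fix, is to start the sampler in the child process, for example through os.register_at_fork(after_in_child=...) on Python 3.7 or later.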
First 60 secs after start
First 300 secs after start
doPoll(), are you kidding?
● A cat does “meow-meow”
● An imperial trooper with a blaster does “pew-pew”
● Your code does nothing at all
● Deluge is fast; it seems to be just a wrapper around libtorrent
● With ~30% CPU overhead for some reason, but that’s why profilers exist: at least we are aware of it now
That’s all folks!
● Questions, please!
● Starring:
  ● Alex Chistyakov, Principal Engineer, Git in Sky
  ● http://gitinsky.com
  ● alex@gitinsky.com
  ● http://meetup.com/DevOps-40

My talk at Piter Py 2016
