Debugging of (C)Python applications

A talk on debugging of Python applications given at a local KharkivPy event.

A brief introduction into a set of tools that allow Python developers to debug common issues in their applications.

Published in: Software

  1. 1. Debugging of (C)Python applications June the 20th, KharkivPy Roman Podoliaka (@amd4ever) http://bit.ly/1LpjXGL
  2. 2. Why debugging? • open source cloud platform • dozens of (micro-)services • new features are important, but making OpenStack stable, scalable and HA is even more important • every day performance testing on hundreds of bare metal nodes • nightly CI jobs running functional and destructive tests • things break… pretty much all the time!
  3. 3. A little humble OpenStack
  4. 4. Typical environment • CentOS 6 or Ubuntu 14.04 • CPython 2.6 or 2.7 • eventlet-based concurrency model for Python services • MySQL (Galera), memcache [, MongoDB] • RabbitMQ
  5. 5. Credits • “Debugging Python applications in Production” by Vladimir Kirillov (https://www.youtube.com/watch?v=F9FHIghn_Vk) • Brendan Gregg’s Blog (http://www.brendangregg.com/blog/index.html)
  6. 6. printf() debugging
  7. 7. printf() debugging: python-memcache
      def _get_server(self, key):
          if isinstance(key, tuple):
              serverhash, key = key
          else:
              serverhash = serverHashFunction(key)

          if not self.buckets:
              return None, None

          for i in range(Client._SERVER_RETRIES):
              server = self.buckets[serverhash % len(self.buckets)]
              if server.connect():
                  # print("(using server %s)" % server,)
                  return server, key
              serverhash = serverHashFunction(str(serverhash) + str(i))
          return None, None
  8. 8. printf() debugging: just don’t do that! • the most primitive way of introspection at runtime • either always enabled or explicitly commented out in the code • limited to stdout/stderr streams • information is only (barely) usable for developers • pollutes the code when committed to VCS repositories
  9. 9. Logging
  10. 10. Logging: basics
      import logging
      FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
      logging.basicConfig(format=FORMAT)
      d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
      logging.warning("Protocol problem: %s", "connection reset", extra=d)

      2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset
  11. 11. Logging: log levels
      if is_pid_cmdline_correct(pid, conffile.split('/')[-1]):
          try:
              _execute('kill', '-HUP', pid, run_as_root=True)
              _add_dnsmasq_accept_rules(dev)
              return
          except Exception as exc:
              LOG.error(_LE('kill -HUP dnsmasq threw %s'), exc)
      else:
          LOG.debug('Pid %d is stale, relaunching dnsmasq', pid)

      Level      Numeric value
      CRITICAL   50
      ERROR      40
      WARNING    30
      INFO       20
      DEBUG      10
      NOTSET     0
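The numeric values in the table act as a threshold: a logger drops any record whose level is below its own. The following is a minimal sketch of that behaviour using only the standard library; the logger name `demo.levels` and the `ListHandler` class are made up for this illustration and do not come from the talk.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that stores 'LEVEL:message' strings in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append("%s:%s" % (record.levelname, record.getMessage()))


log = logging.getLogger("demo.levels")  # illustrative logger name
log.propagate = False                   # keep the demo isolated from root handlers
handler = ListHandler()
log.addHandler(handler)

log.setLevel(logging.INFO)              # INFO is 20, so DEBUG (10) records are dropped
log.debug("Pid %d is stale", 1234)
log.error("kill -HUP dnsmasq threw %s", "OSError")

log.setLevel(logging.DEBUG)             # lower the threshold to 10
log.debug("Pid %d is stale", 1234)

print(handler.messages)
```

Only the second and third calls reach the handler, because the first `debug()` call happens while the threshold is still `INFO`.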
  12. 12. Logging: log records propagation
      import logging
      LOG = logging.getLogger('sqlalchemy.orm')
      ...
      LOG.debug('Instance changed state from `%(prev_state)s` to `%(new_state)s`',
                {'prev_state': prev_state, 'new_state': new_state})

      sqlalchemy.orm -> sqlalchemy -> (root)
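Propagation along the dotted logger hierarchy can be observed directly. This sketch assumes a made-up `demoapp` / `demoapp.orm` pair in place of the real `sqlalchemy` loggers, and a hypothetical `CollectingHandler` attached only to the parent; the child has no handlers of its own.

```python
import logging


class CollectingHandler(logging.Handler):
    """Hypothetical handler that records (logger name, message) pairs."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def emit(self, record):
        self.seen.append((record.name, record.getMessage()))


parent_handler = CollectingHandler()
parent = logging.getLogger("demoapp")      # illustrative parent logger
parent.setLevel(logging.DEBUG)
parent.addHandler(parent_handler)

child = logging.getLogger("demoapp.orm")   # inherits the DEBUG level from its parent

state = {"prev_state": "pending", "new_state": "persistent"}
# A single mapping argument fills the named %(...)s placeholders.
child.debug("Instance changed state from `%(prev_state)s` to `%(new_state)s`", state)

child.propagate = False                    # cut the chain at demoapp.orm
child.debug("this record never reaches the parent handler")

print(parent_handler.seen)
```

The first record is emitted by the child but handled by the parent's handler; after `propagate` is switched off, records stop at the child.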
  13. 13. Logging: context matters
      cfg.StrOpt('logging_context_format_string',
                 default='%(asctime)s.%(msecs)03d %(process)d %(levelname)s '
                         '%(name)s [%(request_id)s %(user_identity)s] '
                         '%(instance)s%(message)s')

      2015-06-10 12:42:00.765 27516 INFO nova.osapi_compute.wsgi.server [req-58f233ab-f2b6-452f-b4fe-0c781ce8f8d0 None] 192.168.0.1 "GET /v2/fc7f78f1c53d4443976514d2fd16e5cb/images/detail HTTP/1.1" status: 200 len: 905 time: 0.1043971

      2015-06-10 12:41:57.004 2760 AUDIT nova.virt.block_device [req-209db629-0d06-4f81-92ad-b910f1a72b36 None] [instance: a0d1c6ef-1fa8-46f9-a19d-f8fb7d2df6a2] Booting with volume 8bad9533-9d6f-4be8-939d-b7a28a536a1a at /dev/vda
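One way to get a request id into every record with plain standard-library logging is `logging.LoggerAdapter`. This is only a sketch of the idea, not how OpenStack implements its context-aware format string; the logger name `demo.wsgi`, the helper `logger_for_request` and the request id value are assumptions made for the example.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(process)d %(levelname)s %(name)s "
    "[%(request_id)s %(user_identity)s] %(message)s"))

base = logging.getLogger("demo.wsgi")   # illustrative logger name
base.setLevel(logging.INFO)
base.propagate = False                  # keep the demo output in our own stream
base.addHandler(handler)


def logger_for_request(request_id, user_identity=None):
    """Hypothetical helper: bind per-request context to every record."""
    context = {"request_id": request_id, "user_identity": user_identity}
    return logging.LoggerAdapter(base, context)


log = logger_for_request("req-0001")
log.info("GET /v2/images/detail status: %d", 200)

line = stream.getvalue()
print(line, end="")
```

Every message logged through the adapter carries the same context, so records belonging to one request can be grouped later by a log-processing tool.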
  14. 14. Logging: log processing • logs are collected from different sources and parsed (Logstash) • then they are imported into a full-text search system (ElasticSearch) • Web UI is used for providing easy access to results and querying (Kibana)
  15. 15. Logging: log processing title: Kernel Neighbour table overflow query: > filename:kernel.log AND level:warning AND message:neighbour AND message:overflow title: Neutron Skipping router removal query: > filename:neutron-l3-agent.log AND location:neutron.agent.l3_agent AND message:skipping AND message:removal title: Neutron OVS lib errors and warnings query: > filename:neutron-openvswitch-agent.log AND location:neutron.agent.linux.ovs_lib AND level:(error OR warning) title: Neutron race condition at subnet deletion query: > filename:neutron AND level:trace AND message:AttributeError
  16. 16. Logging: summary • useful for both developers and operators • developers define verbosity by the means of logging levels • configurable handlers (file, syslog, network, etc) • advanced tooling for log processing / monitoring
  17. 17. Logging: useful links • General info: https://docs.python.org/3.3/howto/logging.html#logging-howto • Adding contextual information: https://docs.python.org/2/howto/logging-cookbook.html#adding-contextual-information-to-your-logging-output • Logstash/ElasticSearch/Kibana: http://www.logstash.net/docs/1.4.2/tutorials/getting-started-with-logstash
  18. 18. pdb
  19. 19. pdb: basics
      def _binary_search(arr, left, right, key):
          if left == right:
              return -1

          middle = left + (right - left) / 2

          if key == arr[middle]:
              return middle
          elif key > arr[middle]:
              return _binary_search(arr, middle, right, key)
          else:
              return _binary_search(arr, left, middle, key)


      def binary_search(arr, key):
          return _binary_search(arr, 0, len(arr), key)


      l = list(range(10))
      assert binary_search(l, 5) == 5
      assert binary_search(l, 0) == 0
      assert binary_search(l, 9) == 9
      assert binary_search(l, 10) == -1
      assert binary_search(l, -5) == -1
  20. 20. pdb: basics Romans-MacBook-Air:03-pdb malor$ python -m pdb basics.py > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(1)<module>() -> def _binary_search(arr, left, right, key): (Pdb) break binary_search Breakpoint 1 at /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py:15 (Pdb) continue > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(16)binary_search() -> return _binary_search(arr, 0, len(arr), key) (Pdb) args arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] key = 5 (Pdb) step --Call-- > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(1)_binary_search() -> def _binary_search(arr, left, right, key): (Pdb) next > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(2)_binary_search() -> if left == right:
  21. 21. pdb: basics (Pdb) list 1 def _binary_search(arr, left, right, key): 2 -> if left == right: 3 return -1 4 5 middle = left + (right - left) / 2 6 7 if key == arr[middle]: 8 return middle 9 elif key > arr[middle]: 10 return _binary_search(arr, middle, right, key) 11 else: (Pdb) where /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ bdb.py(400)run() -> exec cmd in globals, locals <string>(1)<module>() /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(20)<module>() -> assert binary_search(l, 5) == 5 /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(16)binary_search() -> return _binary_search(arr, 0, len(arr), key) > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(2)_binary_search() -> if left == right:
  22. 22. pdb: post-mortem debugging
      Romans-MacBook-Air:03-pdb malor$ python -m pdb basics.py
      > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/basics.py(1)<module>()
      -> def _binary_search(arr, left, right, key):
      (Pdb) continue
      Traceback (most recent call last):
        File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pdb.py", line 1314, in main
          pdb._runscript(mainpyfile)
        File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pdb.py", line 1233, in _runscript
          self.run(statement)
        File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/bdb.py", line 400, in run
          exec cmd in globals, locals
        File "<string>", line 1, in <module>
        File "basics.py", line 1, in <module>
          def _binary_search(arr, left, right, key):
        File "basics.py", line 16, in binary_search
          return _binary_search(arr, 0, len(arr), key)
        …
      RuntimeError: maximum recursion depth exceeded
      Uncaught exception. Entering post mortem debugging
      Running 'cont' or 'step' will restart the program
      -> return _binary_search(arr, middle, right, key)

      py.test --pdb
      nosetests --pdb -s
      . . .
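The post-mortem session stops in the `key > arr[middle]` branch, where the search interval stops shrinking once `middle == left`. The slides do not show a corrected version, so the following is only one plausible fix: exclude `middle` from the right half. It also uses `//` so that the index stays an integer under Python 3.

```python
def _binary_search(arr, left, right, key):
    # Search the half-open interval [left, right).
    if left == right:
        return -1

    middle = left + (right - left) // 2   # integer division on Python 3 too

    if key == arr[middle]:
        return middle
    elif key > arr[middle]:
        # middle was already checked, so the right half starts at middle + 1;
        # passing middle here is what made the original recurse forever.
        return _binary_search(arr, middle + 1, right, key)
    else:
        return _binary_search(arr, left, middle, key)


def binary_search(arr, key):
    return _binary_search(arr, 0, len(arr), key)


l = list(range(10))
print([binary_search(l, k) for k in (5, 0, 9, 10, -5)])
```

With this change the interval shrinks on every call, so all five assertions from the original example pass, including the `key = 10` case that previously overflowed the stack.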
  23. 23. pdb: commands (Pdb) break Num Type Disp Enb Where 1 breakpoint keep yes at /Users/malor/Dropbox/talks/kharkivpy-debugging/ examples/03-pdb/basics.py:15 breakpoint already hit 2 times (Pdb) commands 1 (com) args (com) where (com) end (Pdb) continue arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] key = 5 /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ bdb.py(400)run() -> exec cmd in globals, locals <string>(1)<module>() /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(20)<module>() -> assert binary_search(l, 5) == 5 > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(16)binary_search() -> return _binary_search(arr, 0, len(arr), key) > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(16)binary_search() -> return _binary_search(arr, 0, len(arr), key)
  24. 24. pdb: conditional break points (Pdb) break binary_search Breakpoint 1 at /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py:15 (Pdb) break Num Type Disp Enb Where 1 breakpoint keep yes at /Users/malor/Dropbox/talks/kharkivpy-debugging/ examples/03-pdb/basics.py:15 (Pdb) condition 1 key == 10 (Pdb) continue > /Users/malor/Dropbox/talks/kharkivpy-debugging/examples/03-pdb/ basics.py(16)binary_search() -> return _binary_search(arr, 0, len(arr), key) (Pdb) args arr = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] key = 10
  25. 25. pdb: summary • bread and butter of Python developers • usually the easiest and the quickest way of debugging scripts/apps • integrated with popular test runners • greenlet-friendly • requires stdin/stdout, thus not usable for debugging daemons or embedded Python code (like Gimp or Blender plugins) • not suitable for debugging of multithreaded/multiprocessing applications • can’t attach to a running process (if not modified in advance)
  26. 26. winpdb
  27. 27. winpdb: attaching to a process rpodolyaka@rpodolyaka-pc:~/sandbox/debugging$ rpdb2 -d search.py A password should be set to secure debugger client-server communication. Please type a password:r00tme Password has been set rpodolyaka@rpodolyaka-pc:~$ rpdb2 RPDB2 - The Remote Python Debugger, version RPDB_2_4_8, Copyright (C) 2005-2009 Nir Aides. > password r00tme Password is set to: "r00tme" > attach Connecting to 'localhost'... Scripts to debug on 'localhost': pid name -------------------------- 3706 /home/rpodolyaka/sandbox/debugging/search.py > attach 3706 > *** Attaching to debuggee...
  28. 28. winpdb: attaching to a process > bp binary_search > bl List of breakpoints: Id State Line Filename-Scope-Condition-Encoding ------------------------------------------------------------------------------ 0 enabled 15 /home/rpodolyaka/sandbox/debugging/search.py binary_search > go > *** Debuggee is waiting at break point for further commands. > stack Stack trace for thread 140416296978176: Frame File Name Line Function ------------------------------------------------------------------------------ > 0 ...ndbox/debugging/search.py 15 <module> 1 ....7/dist-packages/rpdb2.py 14220 StartServer 2 ....7/dist-packages/rpdb2.py 14470 main 3 /usr/bin/rpdb2 31 <module>
  29. 29. winpdb: embedded debugging def add_lease(mac, ip_address): """Set the IP that was assigned by the DHCP server.""" import rpdb2; rpdb2.start_embedded_debugger('r00tme') api = network_rpcapi.NetworkAPI() api.lease_fixed_ip(context.get_admin_context(), ip_address, CONF.host) dnsmasq daemon forks and executes this like: nova-dhcpbridge add AA:BB:CC:DD:EE:FF 10.0.0.2
  30. 30. winpdb: debugging of threads
      def allocate_ips(engine, host):
          while True:
              with engine.begin() as conn:
                  result = conn.execute(
                      ip_addresses.select()
                                  .where(ip_addresses.c.host.is_(None))
                  ).first()
                  if result is None:
                      # no IPs left
                      break

                  id, address = result.id, result.address
                  rows = conn.execute(
                      ip_addresses.update()
                                  .values(host=host)
                                  .where(ip_addresses.c.id == id)
                                  .where(ip_addresses.c.address == address)
                                  .where(ip_addresses.c.host.is_(None))
                  )
                  if not rows:
                      # concurrent update
                      continue
  31. 31. winpdb: debugging of threads t1 = threading.Thread(target=allocate_ips, args=(eng, 'host1')) t1.start() t2 = threading.Thread(target=allocate_ips, args=(eng, 'host2')) t2.start() t1.join() t2.join() > attach $PID … > thread List of active threads known to the debugger: No Tid Name State ----------------------------------------------- 0 140456866166528 MainThread waiting at break point > 1 140456389068544 Thread-1 waiting at break point 2 140456380675840 Thread-2 waiting at break point
  32. 32. winpdb: debugging of threads > thread 2 Focus was set to chosen thread. > stack Stack trace for thread 140456380675840: Frame File Name Line Function ------------------------------------------------------------------------------ > 0 /home/rpodolyaka/sa.py 30 allocate_ips 1 ...ib/python2.7/threading.py 763 run > go > break > *** Debuggee is waiting at break point for further commands. > stack Stack trace for thread 140456380675840: Frame File Name Line Function ------------------------------------------------------------------------------ > 0 ...alchemy/engine/default.py 409 do_commit 1 ...sqlalchemy/engine/base.py 525 _commit_impl 2 ...sqlalchemy/engine/base.py 1364 _do_commit
  33. 33. winpdb: summary • allows debugging of multithreaded Python applications • remote debugging (which effectively means no stdin/stdout limitations as with pdb) • wxWidgets-based GUI • to attach to a running process you need to modify it in advance (embedded debugging) or start it with rpdb2
  34. 34. cProfile
  35. 35. cProfile: basics
      def count_freq(stream):
          res = {}
          for i in iter(lambda: stream.read(1), ''):
              try:
                  res[i] += 1
              except KeyError:
                  res[i] = 1
          return res


      def build_tree(stream):
          queue = [Node(freq=v, symb=k) for k, v in count_freq(stream).items()]

          while len(queue) > 1:
              queue.sort(key=lambda k: k.freq)
              first = queue.pop(0)
              second = queue.pop(0)
              queue.append(
                  Node(freq=(first.freq + second.freq), left=first, right=second)
              )

          return queue[0]
  36. 36. cProfile: Amdahl's law
  37. 37. cProfile: basics
      Romans-MacBook-Air:07-cprofile malor$ python -m cProfile -s cumtime huffman.py ~/Downloads/kharkivpy-debugging.key
               24868775 function calls in 14.059 seconds

         Ordered by: cumulative time

           ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
                1    0.008    0.008   14.059   14.059  huffman.py:1(<module>)
                1    0.001    0.001   14.051   14.051  huffman.py:33(build_tree)
                1    5.029    5.029   14.035   14.035  huffman.py:23(count_freq)
         12417038    3.863    0.000    9.006    0.000  huffman.py:25(<lambda>)
         12417038    5.143    0.000    5.143    0.000  {method 'read' of 'file' objects}
              255    0.009    0.000    0.014    0.000  {method 'sort' of 'list' objects}
            32895    0.005    0.000    0.005    0.000  huffman.py:36(<lambda>)
              511    0.001    0.000    0.001    0.000  huffman.py:7(__init__)
              510    0.000    0.000    0.000    0.000  {method 'pop' of 'list' objects}
                1    0.000    0.000    0.000    0.000  functools.py:53(total_ordering)
                1    0.000    0.000    0.000    0.000  {open}
              256    0.000    0.000    0.000    0.000  {len}
              255    0.000    0.000    0.000    0.000  {method 'append' of 'list' objects}
                1    0.000    0.000    0.000    0.000  {dir}
                1    0.000    0.000    0.000    0.000  {method 'items' of 'dict' objects}
                3    0.000    0.000    0.000    0.000  {setattr}
                3    0.000    0.000    0.000    0.000  {getattr}
                1    0.000    0.000    0.000    0.000  {max}
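In this profile almost all of the time is spent in `count_freq`, which calls `stream.read(1)` more than 12 million times. The slides do not show an optimised version; the sketch below is one possible rewrite that reads in larger chunks and counts with `collections.Counter`. The `chunk_size` parameter is an assumption of this example, and no speed-up figure is claimed here, so it would need to be profiled again to confirm the effect.

```python
import io
from collections import Counter


def count_freq(stream, chunk_size=64 * 1024):
    """Count symbol frequencies, reading the stream chunk by chunk."""
    counts = Counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:              # empty result means end of stream
            break
        counts.update(chunk)       # one call counts every symbol in the chunk
    return dict(counts)


freq = count_freq(io.StringIO("abracadabra"))
print(sorted(freq.items()))
```

The result has the same shape as the original function's return value (a dict mapping each symbol to its count), so `build_tree` could use it unchanged.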
  38. 38. cProfile: visualisation
  39. 39. cProfile: context matters
      import cProfile as profiler
      import gc, pstats, time


      def profile(fn):
          def wrapper(*args, **kw):
              elapsed, stat_loader, result = _profile("out.prof", fn, *args, **kw)
              stats = stat_loader()
              stats.sort_stats('cumulative')
              stats.print_stats()
              return result
          return wrapper


      def _profile(filename, fn, *args, **kw):
          load_stats = lambda: pstats.Stats(filename)
          gc.collect()

          began = time.time()
          profiler.runctx('result = fn(*args, **kw)', globals(), locals(),
                          filename=filename)
          ended = time.time()

          return ended - began, load_stats, locals()['result']
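The same idea can be written without an intermediate file or a code string. This is a sketch for Python 3, not the author's decorator: it uses `cProfile.Profile.runcall` and keeps the text report on a `last_report` attribute, which is a name invented for this example, as is the sample function `busy`.

```python
import cProfile
import functools
import io
import pstats


def profile(fn):
    """Profile each call of fn and keep the latest report as text."""
    @functools.wraps(fn)
    def wrapper(*args, **kw):
        profiler = cProfile.Profile()
        result = profiler.runcall(fn, *args, **kw)   # returns fn's own result

        out = io.StringIO()
        stats = pstats.Stats(profiler, stream=out)
        stats.sort_stats("cumulative").print_stats(10)  # ten most expensive entries
        wrapper.last_report = out.getvalue()
        return result

    wrapper.last_report = ""
    return wrapper


@profile
def busy(n):
    # Hypothetical workload, just to have something to measure.
    return sum(i * i for i in range(n))


total = busy(1000)
print(busy.last_report)
```

Wrapping a single request handler or RPC method this way limits the report to one unit of work, which is the point the slide title makes.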
  40. 40. cProfile: context matters from werkzeug.contrib.profiler import ProfilerMiddleware app = ProfilerMiddleware(app)
  41. 41. cProfile: context matters PATH: '/6e0f43cd74db46f5b95f2142fe0c9431/flavors/detail' 2732 function calls (2602 primitive calls) in 1.294 seconds Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 1.287 1.287 /usr/lib/python2.7/dist-packages/nova/ api/compute_req_id.py:38(__call__) 2/1 0.008 0.004 1.287 1.287 /usr/lib/python2.7/dist-packages/ webob/request.py:1300(send) 2/1 0.000 0.000 1.287 1.287 /usr/lib/python2.7/dist-packages/ webob/request.py:1262(call_application) 1 0.000 0.000 1.287 1.287 /usr/lib/python2.7/dist-packages/nova/ api/openstack/__init__.py:121(__call__) 1 0.000 0.000 1.271 1.271 /usr/lib/python2.7/dist-packages/ keystonemiddleware/auth_token.py:686(__call__) 1 0.000 0.000 1.270 1.270 /usr/lib/python2.7/dist-packages/ keystonemiddleware/auth_token.py:829(_validate_token) 1 0.000 0.000 1.270 1.270 /usr/lib/python2.7/dist-packages/ keystonemiddleware/auth_token.py:1669(get) 1 0.000 0.000 1.270 1.270 /usr/lib/python2.7/dist-packages/ keystonemiddleware/auth_token.py:1726(_cache_get)
  42. 42. cProfile: summary • easy CPU profiling of Python code with low overhead • text/binary representation of profiling results (the latter can be used for merging results and/or visualisation done by external tools) • can’t attach to a running process • can’t profile Python interpreter-level code (PyEval_EvalFrameEx, etc)
  43. 43. objgraph
  44. 44. objgraph: basics In [1]: import objgraph In [2]: objgraph.show_most_common_types() function 4530 dict 2483 tuple 1428 wrapper_descriptor 1260 weakref 981 list 911 builtin_function_or_method 897 method_descriptor 705 getset_descriptor 531 type 473
  45. 45. objgraph: basics In [3]: objgraph.show_growth() function 4530 +4530 dict 2412 +2412 tuple 1353 +1353 wrapper_descriptor 1272 +1272 weakref 985 +985 list 904 +904 builtin_function_or_method 897 +897 method_descriptor 706 +706 getset_descriptor 535 +535 type 473 +473 In [4]: objgraph.show_growth() weakref 986 +1 list 905 +1 tuple 1354 +1
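The growth report boils down to counting live objects per type at two points in time and comparing the counts. The sketch below approximates that with the standard `gc` module only; it is not objgraph's implementation, and `type_counts`, `growth` and the `Leaky` class are names invented for the example.

```python
import gc
from collections import Counter


def type_counts():
    """Count GC-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after):
    """Return only the types whose object count increased."""
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


class Leaky(object):
    """Hypothetical class standing in for objects that accumulate."""


gc.collect()
before = type_counts()
leaked = [Leaky() for _ in range(100)]   # simulate objects that are never freed
after = type_counts()

delta = growth(before, after)
print(delta.get("Leaky"))
```

Note that `gc.get_objects()` only reports objects tracked by the garbage collector, so simple values such as integers and strings do not show up in these counts.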
  46. 46. objgraph: graphs
      >>> x = []
      >>> y = [x, [x], {'x': x}]
      >>> objgraph.show_refs([y], filename='sample-graph.png')
  47. 47. strace
  48. 48. strace: tracing syscalls
      rpodolyaka@rpodolyaka-pc:~$ strace -e network python sa.py
      . . .
      socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 5
      setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      setsockopt(5, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
      connect(5, {sa_family=AF_INET6, sin6_port=htons(5432), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
      getsockopt(5, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
      getsockname(5, {sa_family=AF_INET6, sin6_port=htons(36894), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
      sendto(5, "\0\0\0\10\4\322\26/", 8, MSG_NOSIGNAL, NULL, 0) = 8
      recvfrom(5, "S", 16384, 0, NULL, NULL) = 1
      . . .
  49. 49. strace: tracing syscalls root@node-13:~# strace -p 1508 -s 4096 -tt . . . 16:53:29.532770 epoll_wait(7, {}, 1023, 0) = 0 16:53:29.532832 epoll_wait(7, {}, 1023, 0) = 0 16:53:29.532892 epoll_wait(7, {}, 1023, 0) = 0 16:53:29.532953 epoll_wait(7, {}, 1023, 0) = 0 16:53:29.533022 epoll_wait(7, {{EPOLLIN, {u32=9, u64=39432335262744585}}}, 1023, 915) = 1 16:53:29.596409 epoll_ctl(7, EPOLL_CTL_DEL, 9, {EPOLLRDNORM|EPOLLWRBAND|EPOLLMSG| 0x28c45820, {u32=32644, u64=22396489217113988}}) = 0 16:53:29.596494 accept(9, 0x7ffe1ef32b10, [16]) = -1 EAGAIN (Resource temporarily unavailable) 16:53:29.596638 epoll_ctl(7, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=9, u64=39432335262744585}}) = 0 16:53:29.596747 epoll_wait(7, {{EPOLLIN, {u32=9, u64=39432335262744585}}}, 1023, 851) = 1 16:53:29.611852 epoll_ctl(7, EPOLL_CTL_DEL, 9, {EPOLLRDNORM|EPOLLWRBAND|EPOLLMSG| 0x28c45820, {u32=32644, u64=22396489217113988}}) = 0 16:53:29.611937 accept(9, 0x7ffe1ef32b10, [16]) = -1 EAGAIN (Resource temporarily unavailable) . . .
  50. 50. strace: summary • allows tracing of an application's interactions with the `outside world` • points out possible performance problems (like excessive system calls, polling of events with too small a timeout, etc) • limited to tracing of system calls of one process and its forks • use cautiously in production environments as it greatly affects performance
  51. 51. gdb
  52. 52. gdb: prerequisites • Ubuntu/Debian: • sudo apt-get install gdb python-dbg • CentOS/RHEL/Fedora (separate debuginfo package repository): • sudo yum install gdb python-debuginfo
  53. 53. gdb: basics • python-dbg is a CPython binary built with the ‘--with-pydebug’ configure option and ‘-g’. It’s slow and verbose about memory management • you can debug regular CPython processes in production using the debug symbols shipped separately • gdb has Python bindings to write scripts for it • CPython is shipped with a gdb script that analyses interpreter-level stack frames to produce app-level backtraces
  54. 54. gdb: `hanging` app
      def allocate_ips(eng, host):
          while True:
              with eng.begin() as conn:
                  row = conn.execute(
                      ip_addresses.select()
                                  .where(ip_addresses.c.host.is_(None))
                  ).fetchone()
                  if row is None:
                      break

                  id, address = row.id, row.address
                  updated_rows = conn.execute(
                      ip_addresses.update()
                                  .values(host=host)
                                  .where(ip_addresses.c.id == id)
                                  .where(ip_addresses.c.host.is_(None))
                  )
                  if not updated_rows:
                      continue


      t = threading.Thread(target=allocate_ips, args=(eng, 'host1'))
      t.start()
      t.join()
  55. 55. gdb: `hanging` app rpodolyaka@rpodolyaka-pc:~$ strace -p 20267 Process 20267 attached futex(0x7fea50000c10, FUTEX_WAIT_PRIVATE, 0, NULL rpodolyaka@rpodolyaka-pc:~$ gdb /usr/bin/python3.4 -p 20216 (gdb) t a a frame Thread 2 (Thread 0x7f7702c83700 (LWP 20353)): #0 sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101 101 ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S: No such file or directory. Thread 1 (Thread 0x7f770a03b700 (LWP 20350)): #0 sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85 85 ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S: No such file or directory.
  56. 56. gdb: `hanging` app (gdb) t a 2 py-bt Thread 2 (Thread 0x7f7702c83700 (LWP 20353)): Traceback (most recent call first): File "/usr/lib/python3.4/threading.py", line 294, in wait gotit = waiter.acquire(True, timeout) File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/util/ queue.py", line 157, in get self.not_empty.wait(remaining) File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/pool.py", line 1039, in _do_get return self._pool.get(wait, self._timeout) File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/engine/ base.py", line 2037, in contextual_connect self._wrap_pool_connect(self.pool.connect, None), File "/home/rpodolyaka/venv3/lib/python3.4/site-packages/sqlalchemy/engine/ base.py", line 1906, in begin conn = self.contextual_connect(close_with_result=close_with_result) File "sa.py", line 31, in allocate_ips with eng.begin() as conn: File "/usr/lib/python3.4/threading.py", line 868, in run self._target(*self._args, **self._kwargs) File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner self.run() File "/usr/lib/python3.4/threading.py", line 888, in _bootstrap self._bootstrap_inner()
  57. 57. gdb: virtualenv pitfalls rpodolyaka@rpodolyaka-pc:~$ gdb -p 20656 # WARN: executable not passed! (gdb) py-bt Undefined command: "py-bt". Try "help". (gdb) bt #0 sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85 #1 0x00000000004cdff5 in PyThread_acquire_lock_timed () #2 0x0000000000522039 in ?? () #3 0x00000000004ee01a in PyEval_EvalFrameEx () #4 0x00000000004ec9fc in PyEval_EvalCodeEx () #5 0x00000000004f25a9 in PyEval_EvalFrameEx () #6 0x00000000004ec9fc in PyEval_EvalCodeEx () #7 0x00000000004f25a9 in PyEval_EvalFrameEx () #8 0x00000000004ec9fc in PyEval_EvalCodeEx () #9 0x0000000000581115 in ?? () #10 0x00000000005ab019 in PyRun_FileExFlags () #11 0x00000000005aa194 in PyRun_SimpleFileExFlags () #12 0x00000000004cb4cb in Py_Main () #13 0x00000000004ca8ef in main ()
  58. 58. gdb: summary • allows debugging of multithreaded applications • allows attaching to a running process at any given moment • can be used for analysing core dumps (e.g. if we don’t want to stop a process, or if it died unexpectedly) • can be used for debugging C extensions, CFFI calls, etc • success depends on how CPython was built and whether debug symbols are installed • used by pyringe to provide a pdb-like experience (https://github.com/google/pyringe)
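For comparison, an app-level backtrace of every thread can also be collected from inside the process, without gdb. This is not part of the talk and does not replace attaching to an unmodified process: it is a sketch that assumes the code can be added to the application in advance. It relies on `sys._current_frames()`; `dump_thread_stacks` and the `worker` function are hypothetical names.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return {thread name: formatted Python stack} for all live threads."""
    names = {t.ident: t.name for t in threading.enumerate()}
    report = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, str(ident))
        report[name] = "".join(traceback.format_stack(frame))
    return report


ready = threading.Event()
release = threading.Event()


def worker():
    ready.set()       # signal that the thread is running inside worker()
    release.wait()    # stand-in for a thread stuck on a lock or a pool


t = threading.Thread(target=worker, name="blocked-worker")
t.start()
ready.wait()

stacks = dump_thread_stacks()
print(stacks["blocked-worker"])

release.set()         # let the worker finish so the script can exit
t.join()
```

The output lists the Python frames of the blocked thread, most recent call last, which is the same kind of information `py-bt` prints in the gdb session (there in most-recent-call-first order).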
  59. 59. htop
  60. 60. htop
  61. 61. lsof
  62. 62. lsof: lsof -p $PID nova-api 5910 nova mem REG 252,0 141574 3586 /lib/x86_64-linux- gnu/libpthread-2.19.so nova-api 5910 nova mem REG 252,0 149120 3582 /lib/x86_64-linux- gnu/ld-2.19.so nova-api 5910 nova mem REG 252,0 26258 52555 /usr/lib/x86_64- linux-gnu/gconv/gconv-modules.cache nova-api 5910 nova 0u CHR 1,3 0t0 1029 /dev/null nova-api 5910 nova 1u CHR 136,13 0t0 16 /dev/pts/13 nova-api 5910 nova 2u CHR 136,13 0t0 16 /dev/pts/13 nova-api 5910 nova 3w REG 252,0 34967268 135756 /var/log/nova/ nova-api.log nova-api 5910 nova 4u unix 0xffff880850b92a00 0t0 260406 socket nova-api 5910 nova 5r FIFO 0,8 0t0 260407 pipe nova-api 5910 nova 6w FIFO 0,8 0t0 260407 pipe nova-api 5910 nova 7u IPv4 260408 0t0 TCP node-13.domain.tld:8773 (LISTEN) nova-api 5910 nova 8r CHR 1,9 0t0 1034 /dev/urandom nova-api 5910 nova 9u IPv4 260409 0t0 TCP node-13.domain.tld:8774 (LISTEN) nova-api 5910 nova 10u IPv4 260420 0t0 TCP *:8775 (LISTEN) nova-api 5910 nova 15u 0000 0,9 0 7380 anon_inode
  63. 63. netstat
  64. 64. netstat: netstat -nlap tcp 8 0 192.168.0.16:52819 192.168.0.11:5673 ESTABLISHED 5975/python tcp 0 0 192.168.0.16:36901 192.168.0.11:5673 ESTABLISHED 1513/python tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 3042/sshd tcp 0 0 0.0.0.0:4567 0.0.0.0:* LISTEN 13888/mysqld tcp 0 0 0.0.0.0:25 0.0.0.0:* LISTEN 7433/master tcp 0 0 0.0.0.0:3260 0.0.0.0:* LISTEN 19704/tgtd tcp 0 0 192.168.0.16:35357 0.0.0.0:* LISTEN 5546/python
  65. 65. perf_events
  66. 66. perf_events: perf top
  67. 67. perf_events: perf trace 254.663 ( 0.001 ms): sshd/22802 clock_gettime(which_clock: 7, tp: 0x7ffd0e807970 ) = 0 254.666 ( 0.003 ms): sshd/22802 read(fd: 14</dev/ptmx>, buf: 0x7ffd0e8038b0, count: 16384 ) = 4095 254.672 ( 0.243 ms): chrome/11973 epoll_wait(epfd: 16, events: 0x6a6a1b73480, maxevents: 32, timeout: 4294967295) = 1 254.678 ( 0.003 ms): chrome/11973 read(fd: 24<socket:[147806]>, buf: 0x6a6a2d5b018, count: 4096 ) = 32 254.685 ( 0.003 ms): chrome/11973 write(fd: 11<pipe:[147797]>, buf: 0x7f940dfa55e7, count: 1 ) = 1 254.688 ( 0.001 ms): chrome/11973 read(fd: 24<socket:[147806]>, buf: 0x6a6a2d5b018, count: 4096 ) = -1 EAGAIN Resource temporarily unavailable 254.691 ( 0.001 ms): chrome/11973 epoll_wait(epfd: 16, events: 0x6a6a1b73480, maxevents: 32 ) = 0 254.693 ( 0.001 ms): chrome/11973 epoll_wait(epfd: 16, events: 0x6a6a1b73480, maxevents: 32 ) = 0
  68. 68. perf_events: perf stat Performance counter stats for 'python sa.py': 125.242831 task-clock (msec) # 0.004 CPUs utilized 945 context-switches # 0.008 M/sec 14 cpu-migrations # 0.112 K/sec 6,996 page-faults # 0.056 M/sec 408,133,256 cycles # 3.259 GHz 213,117,410 stalled-cycles-frontend # 52.22% frontend cycles idle <not supported> stalled-cycles-backend 432,245,331 instructions # 1.06 insns per cycle # 0.49 stalled cycles per insn 91,417,607 branches # 729.923 M/sec 3,937,108 branch-misses # 4.31% of all branches 30.130596204 seconds time elapsed
  69. 69. Questions? slides: http://bit.ly/1LpjXGL twitter: @amd4ever
