Monitoring the performance of Python web applications
Graham Dumpleton
PyCon HK - November 2015
PyCon HK 2015 -  Monitoring the performance of python web applications
This talk uses targeted testing to analyse the performance of WSGI servers used to host Python web applications.


  1. Monitoring the performance of Python web applications. Graham Dumpleton. PyCon HK - November 2015
  2. http://newrelic.com
  3. http://newrelic.com
  4. http://blog.newrelic.com/wp-content/uploads/rum_timeline_diagram_aligned_web_res.jpg
  5. http://newrelic.com
  6. http://newrelic.com
  7. http://www.modwsgi.org
  8. Why it matters
     • Reduce transaction response times so users are happier.
     • Reduce costs by making better use of the resources you have available.
  9. Visualising traffic
  10. Concurrent requests [Diagram: processes 1 to 3 against threads 1 to 3]
  11. Capacity utilisation [Diagram: processes 1 to 3 against threads 1 to 3]
  12. CPU burn (request)
  13. I/O Bound - 1 Client
  14. I/O Bound - 4 Clients
  15. CPU Bound - 1 Client
  16. CPU Bound - 2 Clients
  17. CPU Bound - 4 Clients (1)
  18. CPU burn calculation
      CPU burn = CPU usage / request time
      CPU usage = user CPU time + system CPU time
  19. Increasing concurrency [Chart: request time, CPU time (request) and CPU burn (request) against 1 to 10 concurrent requests; axes 0% to 100% and 0 to 12 secs]
  20. CPU burn (process)
  21. CPU Bound - 4 Clients (2)
  22. 100% CPU burn [Chart: request time, CPU time (request), CPU burn (request) and CPU burn (process) against 1 to 10 concurrent requests; axes 0% to 160% and 0 to 12 secs]
  23. 25% CPU burn [Chart: request time, CPU time (request), CPU burn (request) and CPU burn (process) against 1 to 10 concurrent requests; axes 0% to 160% and 0 to 3 secs]
  24. Global interpreter lock
  25. Poor man's threading [Diagram: timelines for Thread 1 and Thread 2; legend: Waiting for I/O (thread is blocked), Running (thread active), Waiting for GIL]
  26. 100% CPU burn, 4 Processes / 1 Thread [Chart: request time, CPU time (request), CPU burn (request) and CPU burn (process) against 1 to 10 concurrent requests; axes 0% to 160% and 0 to 1 secs]
  27. 100% CPU burn + Queue time, 4 Processes / 1 Thread [Chart: request time, CPU time (request), CPU burn (request), CPU burn (process) and Queue time (max) against 1 to 10 concurrent requests; axes 0% to 160% and 0 to 2 secs]
  28. Reaching capacity: 4 Clients ==> 4 Processes / 1 Thread [Diagram: request timeline]
  29. 5 clients ==> 4 Processes / 1 Thread [Diagram: request timeline; labels: Capacity reached, Delayed]
  30. Not all requests are the same
  31. Don’t trust benchmarks
  32. Is there an answer?
  33. I/O vs CPU
      • I/O bound request handlers.
        • Okay to use multiple threads.
      • CPU bound request handlers.
        • Better to use multiple processes.
        • Restrict processes to single threads, or at most two if requests have very short response times.
  34. I/O and CPU
      • Use no more than 3 to 5 threads per process.
      • Use a small number of processes.
      • Watch the CPU utilisation of processes.
      • Be prepared to scale out to more hosts.
  35. Partitioning [Diagram: a proxy routes /cpu-tasks to CPU workers (single threaded, multiple processes) and /io-tasks to an I/O worker (multiple threads)]
  36. Daemon mode
      WSGIDaemonProcess mixed processes=3 threads=5
      WSGIDaemonProcess io processes=1 threads=25
      WSGIDaemonProcess cpu processes=5 threads=1
      WSGIProcessGroup mixed
  37. Request Delegation
      WSGIScriptAlias / /some/path/app.wsgi application-group=%{GLOBAL}
      <Location /io-tasks>
      WSGIProcessGroup io
      </Location>
      <Location /cpu-tasks>
      WSGIProcessGroup cpu
      </Location>
  38. DEMO TIME (If we have enough time)
  39. Contact me
      Graham.Dumpleton@gmail.com
      @GrahamDumpleton
      http://blog.dscpl.com.au
      http://blog.openshift.com
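
The CPU burn calculation from slide 18 can be sketched in a few lines of Python. This is only an illustration, not the instrumentation used in the talk: the names `cpu_burn`, `measure`, `io_bound` and `cpu_bound` are hypothetical, and the sketch assumes a single-threaded process, because `os.times()` reports CPU time for the whole process rather than for one request.

```python
import os
import time


def cpu_burn(cpu_usage, request_time):
    """CPU burn as a fraction: CPU usage divided by request time."""
    if request_time <= 0:
        raise ValueError("request_time must be positive")
    return cpu_usage / request_time


def measure(handler):
    """Run handler once; return (request_time, cpu_usage, cpu_burn).

    Assumption: nothing else runs in this process, so the process-wide
    CPU counters from os.times() can stand in for per-request CPU time.
    """
    cpu_start = os.times()
    wall_start = time.perf_counter()
    handler()
    wall_end = time.perf_counter()
    cpu_end = os.times()
    # CPU usage = user CPU time + system CPU time
    cpu_usage = (cpu_end.user - cpu_start.user) + (cpu_end.system - cpu_start.system)
    request_time = wall_end - wall_start
    return request_time, cpu_usage, cpu_burn(cpu_usage, request_time)


def io_bound():
    """Hypothetical I/O bound handler: blocks without using the CPU."""
    time.sleep(0.2)


def cpu_bound():
    """Hypothetical CPU bound handler: spins until 0.2 secs of CPU are used."""
    deadline = time.process_time() + 0.2
    while time.process_time() < deadline:
        pass


if __name__ == "__main__":
    for handler in (io_bound, cpu_bound):
        request_time, cpu_usage, burn = measure(handler)
        print(f"{handler.__name__}: request={request_time:.3f}s "
              f"cpu={cpu_usage:.3f}s burn={burn:.0%}")
```

On an otherwise idle machine the I/O bound handler should show a CPU burn close to 0% and the CPU bound handler a value approaching 100%, which is the distinction the I/O bound and CPU bound slides draw.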

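
The capacity effect on slides 28 and 29 (4 clients fit into 4 processes / 1 thread, while a fifth client is delayed) can be approximated with a toy model. This is a sketch under strong assumptions: all requests arrive at the same moment, every request takes the same service time, each process/thread slot handles one request at a time, and GIL contention between threads is ignored. The function name `queue_times` and its parameters are hypothetical.

```python
import heapq


def queue_times(num_requests, processes, threads, service_time):
    """Return how long each request waits before a slot picks it up.

    Idealised model: capacity is processes * threads, and all requests
    arrive at time 0 with an identical service time.
    """
    capacity = processes * threads
    # Min-heap of the times at which each slot next becomes free.
    free_at = [0.0] * capacity
    heapq.heapify(free_at)
    waits = []
    for _ in range(num_requests):
        start = heapq.heappop(free_at)
        waits.append(start)  # arrival is at time 0, so wait equals start time
        heapq.heappush(free_at, start + service_time)
    return waits


if __name__ == "__main__":
    # 4 clients against 4 processes / 1 thread: nobody waits.
    print(queue_times(4, processes=4, threads=1, service_time=0.25))
    # 5 clients against the same setup: the fifth request is delayed.
    print(queue_times(5, processes=4, threads=1, service_time=0.25))
```

In this model the maximum queue time stays at zero until the number of concurrent requests exceeds the capacity, then grows in steps of one service time, which is the kind of behaviour the "Queue time (max)" series on slide 27 tracks.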