Swift profiling middleware and tools
Published in: Design, Technology

  1. Swift Profiling Middleware and Tools
  2. Agenda
     - Background
     - Profiling Proposal
     - Profiling Architecture
     - Profiling Data Model
     - Profiling Tools
     - Profiling Analysis
  3. Background
     - Profiling: a form of dynamic program analysis that measures
       - the space (memory) or time complexity of a program
       - the usage of particular instructions
       - the frequency and duration of function calls
     - A tool called a profiler instruments either the source code or the binary executable.
     - What current profiling methods lack is code-level detail that explains:
       - How often is a significant part of the code executed or called?
       - How long does it take to execute these calls?
       - Where is the most time consumed? On I/O operations, waiting for a db lock, or wasting cycles in a loop?
       - Why does the response time of the container PUT operation increase?
       - Where do memory leaks happen? How much memory is consumed by a specific code snippet?
  4. Profiling Proposal
     - The Goal
       - For researchers, developers and admins: provide a method of profiling Swift code so that the current implementation and architecture can be improved based on the generated data and its analysis.
     - Scope
       - A WSGI middleware injected into Swift servers to collect profiling data
         - The middleware can be configured with parameters in the paste file
         - It dumps the profiling data periodically to local disk
       - A multi-dimensional data model
         - For profiling analysis, with dimensions of workload, system, code and time, and metrics of frequency, duration, memory consumed, object counts, call graph, etc.
       - Analysis tools for reporting and visualization
         - Can leverage open source tools
         - Can be integrated into the Horizon admin dashboard
     - Blueprint and POC are submitted for discussion.
  5. Swift Profiling Architecture
  6. Profiling Granularity
     - System Level
       - Region
         - Higher latency off-site locations
       - Zone
         - Availability zone
       - Node
         - e.g. storage node, proxy node
       - Process
         - Daemons such as replicator, auditor, updater
         - WSGI applications such as the proxy server and the a/c/o (account/container/object) servers
  7. Profiling Granularity
     - Code Level (Python Runtime)
       - Package
         - e.g. eventlet, xattr, swift.common
       - Module
         - e.g. db.py, swob.py, wsgi.py, http.py
       - Function
         - e.g. __init__, __call__, HEAD, GET, PUT, POST, DELETE
       - Code Line
         - specific line of code
  8. Profiling Deployment and Data Model
     [Diagram: deployment] Region > Zone > Node. Each node runs WSGI servers and daemons; a CPU profiler, a memory profiler and an IO profiler feed the profile data model.
     [Diagram: Multi-Dimensional Profiling Data Model]
     - Dimensions
       - Time
       - Workload: Read/Write, Object Size, Guests
       - System: Region, Zone, Node, Process
       - Code: Package, Module, Function, LineNo
     - Metrics
       - Time: Frequency, Duration
       - Space: Memory Consumed, Objects Count, Memory Leaks
       - Logic: Call Graph
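The dimensions and metrics above can be pictured as one flat record per profiled function, which is then rolled up along any dimension. The sketch below is only an illustration in Python 3: the `ProfileRecord` class, its field names, the `roll_up` helper and all sample values are hypothetical and are not taken from the Swift blueprint.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileRecord:
    # Dimensions (names follow the slide; the flat layout is an assumption).
    region: str
    zone: str
    node: str
    process: str
    module: str
    function: str
    # Metrics
    calls: int
    duration: float


def roll_up(records, *dims):
    """Sum calls and duration over all records sharing the given dimension values."""
    totals = defaultdict(lambda: [0, 0.0])
    for rec in records:
        key = tuple(getattr(rec, d) for d in dims)
        totals[key][0] += rec.calls
        totals[key][1] += rec.duration
    return {key: tuple(val) for key, val in totals.items()}


# Made-up sample records, for illustration only.
records = [
    ProfileRecord("r1", "z1", "node1", "object-server", "swob.py", "_normalize", 100, 0.25),
    ProfileRecord("r1", "z1", "node2", "object-server", "swob.py", "_normalize", 300, 0.75),
    ProfileRecord("r1", "z2", "node3", "proxy-server", "db.py", "execute", 10, 1.5),
]

by_module = roll_up(records, "module")          # roll-up across all nodes
by_zone_module = roll_up(records, "zone", "module")  # drill-down by zone
```

Passing fewer dimension names rolls up, and passing more drills down, which is the kind of slicing the data model is meant to support.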
  9. Profiling Tools Available or Needed
     Profiling report and visualization open source tools (aggregate/slice/drill-down)
       Granularity: all layers
         - CPU Time/Call Graph: pstat, runsnake, kcachegrind
         - Memory: memstat?
         - Disk/Network I/O: iostat?
     Profiling open source hooks
       Granularity: Process
         - CPU Time/Call Graph: repoze.profile
         - Memory: objgraph
       Granularity: Package, Module, Function, Code Line
         - CPU Time/Call Graph: profile, cProfile, hotshot, eventlet.green.profile
         - Memory: memory_profiler
         - Disk/Network I/O: eventlet_io_profiler?
  10. Profiling Middleware
      [pipeline:main]
      pipeline = profile … proxy-server

      [filter:profile]
      use = egg:swift#profile
      log_filename_prefix = /opt/stack/data/swift/profile/pn1/proxy.profile
      dump_interval = 5
      dump_timestamp = false
      discard_first_request = true
      path = /__profile__
      flush_at_shutdown = false
      unwind = false
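A minimal sketch of how a filter configured like this might behave, written for Python 3 with the standard-library `cProfile`. The class name `ProfileMiddleware`, the PID suffix on the dump file, and the internals are assumptions for illustration; only `log_filename_prefix` and `dump_interval` mirror option names from the slide, and the remaining options are not modelled.

```python
import cProfile
import os
import tempfile
import time


class ProfileMiddleware:
    """Hypothetical WSGI filter: profiles each request, dumps stats periodically."""

    def __init__(self, app, log_filename_prefix, dump_interval=5.0):
        self.app = app
        self.log_filename_prefix = log_filename_prefix
        self.dump_interval = dump_interval
        # One Profile object accumulates statistics across requests.
        self.profiler = cProfile.Profile()
        self.last_dump = time.time()

    def __call__(self, environ, start_response):
        try:
            # Only the call into the wrapped app is profiled; time spent
            # lazily iterating the response body afterwards is not captured.
            return self.profiler.runcall(self.app, environ, start_response)
        finally:
            if time.time() - self.last_dump >= self.dump_interval:
                self.dump()

    def dump(self):
        # The PID suffix is an assumption, so workers do not overwrite each other.
        path = "%s-%d" % (self.log_filename_prefix, os.getpid())
        self.profiler.dump_stats(path)
        self.last_dump = time.time()
        return path


def demo_app(environ, start_response):
    """Trivial WSGI app standing in for a Swift server."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


prefix = os.path.join(tempfile.mkdtemp(), "proxy.profile")
wrapped = ProfileMiddleware(demo_app, prefix, dump_interval=0)
body = wrapped({}, lambda status, headers: None)
```

With `dump_interval=0` every request triggers a dump, which is convenient for a demo; a real deployment would use a longer interval to limit disk writes.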
  11. Performance Overhead for Profiling Middleware
      Node                  Memory
      CosBench Controller   3GB
      CosBench Driver1      3GB
      CosBench Driver2      3GB
      Proxy                 31GB
      Account               35GB
      Container             35GB
      Object1               31GB
      Object2               31GB
      Worker (values as extracted): 120 120 24 24 24 24 24
      Replicas (values as extracted): 3 3 3 3 3
  12. swift/common/profile.py

      from eventlet.green import profile
      from memory_profiler import LineProfiler, show_results
      import linecache
      import inspect
      import time, io, sys, os

      def cpu_profiler(log_file, with_timestamp=False):
          def _outer_fn(func):
              def _inner_fn(*args, **kwargs):
                  ts = time.time()
                  fpath = ''.join([log_file, '-', str(ts)])
                  prof = profile.Profile()
                  pcall = prof.runcall(func, *args, **kwargs)
                  # dump the CPU profile for this call
                  prof.dump_stats(fpath)
                  return pcall
              return _inner_fn
          return _outer_fn

      def mem_profiler(log_file, with_timestamp=False):
          def _outer_fn(func):
              def _inner_fn(*args, **kwargs):
                  ts = time.time()
                  prof = LineProfiler()
                  val = prof(func)(*args, **kwargs)
                  # dump the line-by-line memory report for this call
                  fpath = ''.join([log_file, '-', str(ts)])
                  astream = io.open(fpath, 'w')
                  show_results(prof, astream, precision=3)
                  astream.flush()
                  return val
              return _inner_fn
          return _outer_fn

      Profile Hook for Swift (import the cpu and memory profilers)

      swift/swift/proxy/server.py:

      from swift.common.profile import cpu_profiler, mem_profiler

      @cpu_profiler('/opt/stack/data/swift/profile/proxy.cprofile')
      def __call__(self, env, start_response):

      @mem_profiler('/opt/stack/data/swift/profile/proxy.mprofile')
      def handle_request(self, req):
      …

      swift/swift/container/server.py:

      from swift.common.profile import cpu_profiler, mem_profiler

      @cpu_profiler('/opt/stack/data/swift/profile/container.cprofile')
      def __call__(self, env, start_response):
      ...

      Profile data:

      openstack@openstackvm:/opt/stack/data/swift/profile$ ll
      total 188
      drwxrwxr-x 2 openstack openstack   4096 Jul 18 16:35 ./
      drwxr-xr-x 7 openstack openstack   4096 Jul 18 15:17 ../
      -rw-r--r-- 1 openstack openstack 105502 Jul 18 16:35 proxy.cprofile
      -rw-r--r-- 1 openstack openstack   1391 Jul 18 16:35 proxy.mprofile
      -rw-r--r-- 1 openstack openstack   7195 Jul 18 16:35 container.cprofile
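The decorators on this slide accept a `with_timestamp` parameter but always append a timestamp to the file name. The following is a sketch, under stated assumptions, of one way the flag could be honoured. It targets Python 3 and swaps in the standard-library `cProfile` for `eventlet.green.profile` so that it runs without eventlet; the `handle` function and `PROFILE_PATH` are hypothetical demo names.

```python
import cProfile
import functools
import os
import pstats
import tempfile
import time


def cpu_profiler(log_file, with_timestamp=False):
    """Decorator factory: profile each call of the wrapped function."""
    def _outer_fn(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def _inner_fn(*args, **kwargs):
            fpath = log_file
            if with_timestamp:
                # One file per call; without the flag the file is overwritten.
                fpath = "%s-%s" % (log_file, time.time())
            prof = cProfile.Profile()
            try:
                return prof.runcall(func, *args, **kwargs)
            finally:
                # Dump even if func raises, so failing calls are recorded too.
                prof.dump_stats(fpath)
        return _inner_fn
    return _outer_fn


PROFILE_PATH = os.path.join(tempfile.mkdtemp(), "demo.cprofile")


@cpu_profiler(PROFILE_PATH)
def handle(n):
    """Stand-in for a request handler."""
    return sum(range(n))


result = handle(10)
stats = pstats.Stats(PROFILE_PATH)  # the dump can be loaded with stdlib pstats
```

Creating a fresh `Profile` per call keeps each dump self-contained, at the cost of more files when `with_timestamp=True`.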
  13. eventlet-aware profiling

      import sys
      import eventlet
      from eventlet.green import urllib2
      import time
      sys.path.append('./')
      from decorators import profile_eventlet

      def some_long_calculation(id):
          x = 0
          for i in xrange(1,100000000):
              x = i+x/i
          print x

      def some_work(id):
          print('start')
          eventlet.sleep(0)
          print('end')

      @profile_eventlet('./ep1.profile')
      def main():
          pile = eventlet.GreenPool(1000)
          pile.spawn(some_work, 1)
          #pile.spawn(some_long_calculation, 2)
          pile.waitall()

      if __name__ == '__main__':
          main()

      Output of standard profile (✖):

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
           1    0.000    0.000    7.071    7.071 test_regular_profile2.py:10(some_work)
           1    7.070    7.070    7.070    7.070 test_regular_profile2.py:5(some_long_calculation)

      Output of eventlet-aware profile (✔):

      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
           1    0.000    0.000    0.000    0.000 test_eventlet_builtin_profile2.py:14(some_work)
           1    7.380    7.380    7.380    7.380 test_eventlet_builtin_profile2.py:9(some_long_calculation)

      Some prior art:
      - https://github.com/colinhowe/eventlet_profiler
      - https://lists.secondlife.com/pipermail/eventletdev/2012-September/001094.html
  14. Profiling Analysis
      - Top-K statistics analysis through drill-down, roll-up and slicing, to identify hot code snippets or potential bottlenecks to be optimized
        - e.g. function call frequency and duration per node (sortable, filterable, aggregation)
        - e.g. module call frequency and duration per node (sortable, filterable, aggregation)
      - Linear or non-linear algorithm analysis to identify scalability problems
        - e.g. object read/write throughput at different workloads
      - Evolution analysis
        - e.g. capture profile data by time interval and compare
      - Code association analysis
        - e.g. call graph
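The roll-up from function level to module level can be sketched with the standard-library `pstats` module. This is an illustration in Python 3, not the tooling from the talk: `rollup_by_module` and `workload` are hypothetical names, and the code reads the `stats` dictionary of `pstats.Stats`, which is long-standing but only lightly documented.

```python
import cProfile
import collections
import os
import pstats


def rollup_by_module(stats, k=5):
    """Aggregate call counts and tottime per source file; return top k by tottime."""
    totals = collections.defaultdict(lambda: [0, 0.0])
    # Keys are (filename, lineno, funcname); values are
    # (primitive calls, total calls, tottime, cumtime, callers).
    for (filename, _lineno, _func), (_cc, ncalls, tottime, _ct, _callers) in stats.stats.items():
        entry = totals[os.path.basename(filename)]
        entry[0] += ncalls
        entry[1] += tottime
    ranked = sorted(totals.items(), key=lambda item: item[1][1], reverse=True)
    return [(name, calls, secs) for name, (calls, secs) in ranked[:k]]


def workload():
    """Small stand-in workload so that the profile has something to show."""
    return sorted(str(i) for i in range(1000))


profiler = cProfile.Profile()
profiler.runcall(workload)
top = rollup_by_module(pstats.Stats(profiler), k=3)
for name, calls, secs in top:
    print("%-30s calls=%d tottime=%.6f" % (name, calls, secs))
```

Built-in functions are reported by `cProfile` under the pseudo file name `~`, so they appear as one aggregated row in this roll-up.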
  15. Profiling Report Tool - pstat2

      #python pstats2.py '../data/hybrid/object.*'
      %?
      Documented commands (type help <topic>):
      ========================================
      EOF  callees  dump  kcachegrind  quit     read     runsnake  stats  tojson
      add  callers  help  list         rawdata  reverse  sort      strip
      % sort calls
      % stats swift 5
      3909969520 function calls (3495132609 primitive calls) in 77381.834 seconds

      Ordered by: call count
      List reduced from 526 to 110 due to restriction <'swift'>
      List reduced from 110 to 5 due to restriction <5>

        ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      54546321  130.314    0.000  220.887    0.000 /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:211(_normalize)
      44597503   80.804    0.000  258.501    0.000 /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:219(__getitem__)
      17635615   25.768    0.000   34.190    0.000 /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:659(getter)
      16130776   61.326    0.000   85.730    0.000 /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:267(__setitem__)
       9948818   19.429    0.000   62.618    0.000 /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/swob.py:230(__contains__)
      % kcachegrind
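A session like the one above (load every dump matching a glob, sort by call count, restrict by name, limit the rows) can be approximated with the standard-library `pstats` module alone. The sketch below is written for Python 3 and is not the `pstats2.py` tool itself; the `report` helper, the demo `_normalize` function and the temporary dump files are hypothetical stand-ins.

```python
import cProfile
import glob
import io
import os
import pstats
import tempfile


def report(pattern, restriction, limit):
    """Merge all dump files matching pattern; list the most-called functions."""
    files = sorted(glob.glob(pattern))
    out = io.StringIO()
    # Passing several file names makes pstats merge them into one table.
    stats = pstats.Stats(*files, stream=out)
    # The restriction is a regex on "filename:lineno(function)", then a row limit.
    stats.sort_stats("calls").print_stats(restriction, limit)
    return out.getvalue()


def _normalize(key):
    """Demo function standing in for a hot Swift helper."""
    return key.title()


def demo_request():
    return [_normalize("x-object-meta") for _ in range(100)]


# Demo data: two small dumps standing in for per-node profile files.
workdir = tempfile.mkdtemp()
for i in range(2):
    prof = cProfile.Profile()
    prof.runcall(demo_request)
    prof.dump_stats(os.path.join(workdir, "object.%d" % i))

text = report(os.path.join(workdir, "object.*"), "_normalize", 5)
print(text)
```

Because both dumps record the same function at the same file and line, `pstats` sums their call counts into a single row, which is the per-node aggregation the report tool relies on.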
  16. Profiling Visualization Tool - kcachegrind
  17. Profiling Visualization Tool - kcachegrind
      [Figure: call graph of the PUT function for the object server]
  18. Example 1 - Profiling Analysis of File System Call
      - posix call time consumption on object server (1MB, R80/W20)
      [Pie chart: Time of POSIX CALL of Object Server (1M)]
      Legend: {posix.stat}, {posix.unlink}, {posix.open}, {posix.close}, {posix.read}, {posix.listdir}, {posix.getpid}, {posix.write}, {posix.urandom}
      Slice values (time, share), in the order extracted from the chart: 0.114 (0%), 0.001 (0%), 13.44 (8%), 11.547 (7%), 6.898 (4%), 71.866 (44%), 18.216 (11%), 18.46 (12%), 21.977 (14%)
  19. Example 2 - Profiling Analysis of sqlite db call
      [Pie chart: Time of DB CALL of A/C Server]
      - db.py:103(<lambda>): 711.755 (64%)
      - db.py:887(put_object): 157.067 (14%)
      - db.py:119(chexor): 73.197 (7%)
      - db.py:92(_timeout): 57.488 (5%)
      - db.py:1162(merge_items): 40.968 (4%)
      - db.py:107(<lambda>): 27.244 (2%)
      - db.py:173(__init__): 22.603 (2%)
      - db.py:102(execute): 14.711 (1%)
      - db.py:809(_commit_puts): 10.252 (1%)
      - db.py:751(get_db_version): 3.932 (0%)
      - db.py:86(__init__): 2.353 (0%)
