Swift profiling middleware and tools


  • 1. Swift Profiling Middleware and Tools
  • 2. Agenda
    - Background
    - Profiling Proposal
    - Profiling Architecture
    - Profiling Data Model
    - Profiling Tools
    - Profiling Analysis
  • 3. Background
    - Profiling: a form of dynamic program analysis that measures
      - the space (memory) or time complexity of a program
      - the usage of particular instructions
      - the frequency and duration of function calls
    - Either the source code or the binary executable is instrumented, using a tool called a profiler.
    - The missing part of current profiling methods is code-level detail that explains:
      - How often is the significant part of the code executed or called?
      - How long does it take to execute these calls?
      - Where is most of the time consumed? On I/O operations, waiting for a db lock, or wasting cycles in a loop?
      - Why does the response time of the container PUT operation increase?
      - Where does the memory leak happen? How much memory is consumed by a specific code snippet?
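The "frequency and duration of function calls" that a profiler collects can be illustrated with a tiny hand-rolled decorator. This is a sketch only; the `instrument` helper and `lookup` function are hypothetical, not Swift code, and real profilers such as cProfile hook the interpreter instead of wrapping functions by hand:

```python
import functools
import time

def instrument(stats):
    # Decorator factory: record call frequency and cumulative duration
    # per function name into the supplied stats dict.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = stats.setdefault(func.__name__, [0, 0.0])
                entry[0] += 1                            # call frequency
                entry[1] += time.perf_counter() - start  # total duration
        return wrapper
    return decorator

stats = {}

@instrument(stats)
def lookup(x):
    return x * x

for i in range(3):
    lookup(i)

print(stats['lookup'][0])  # → 3
```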
  • 4. Profiling Proposal
    - The Goal
      - Targeted at researchers, developers, and admins: provide a method of profiling Swift code so that the current implementation and architecture can be improved based on the generated data and its analysis.
    - Scope
      - A WSGI middleware injected into Swift servers to collect profiling data
        - The middleware can be configured with parameters in the paste file
        - Dumps the profiling data periodically to local disk
      - A multi-dimensional data model for profiling analysis, including the dimensions of workload, system, code, and time, and the metrics of frequency, duration, memory consumed, object counts, call graph, etc.
      - Analysis tools for reporting and visualization
        - Can leverage open source tools
        - Can be integrated into the admin dashboard of Horizon
    - Blueprint and POC are submitted for discussion.
  • 5. Swift Profiling Architecture
  • 6. Profiling Granularity
    - System Level
      - Region
        - Higher-latency off-site locations
      - Zone
        - Availability zone
      - Node
        - e.g. storage node, proxy node
      - Process
        - Daemons such as replicator, auditor, updater
        - WSGI applications such as proxy server, a/c/o server
  • 7. Profiling Granularity
    - Code Level (Python Runtime)
      - Package
        - e.g. eventlet, xattr, swift.common
      - Module
        - e.g. …
      - Function
        - e.g. __init__, __call__, HEAD, GET, PUT, POST, DELETE
      - Code Line
        - a specific line of code
  • 8. Profiling Deployment and Data Model
    [Diagram: within each region/zone, every node runs WSGI servers and daemons hooked with CPU, memory, and I/O profilers that all feed the profile data model.]
    - Multi-Dimensional Profiling Data Model
      - Dimensions
        - Time
        - Workload: read/write, object size, guests
        - System: region, zone, node, process
        - Code: package, module, function, line no
      - Metrics
        - Frequency
        - Duration
        - Memory consumed / memory leaks
        - Objects count
        - Logic call graph
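One possible way to flatten this multi-dimensional model into records is sketched below; the `ProfileRecord` class and its field values are hypothetical, chosen only to mirror the dimension and metric names on the slide:

```python
from dataclasses import dataclass

@dataclass
class ProfileRecord:
    # System dimension
    region: str
    zone: str
    node: str
    process: str
    # Code dimension
    package: str
    module: str
    function: str
    lineno: int
    # Time dimension
    timestamp: float
    # Metrics
    frequency: int
    duration: float
    memory_consumed: int
    objects_count: int

# Example record with made-up values.
rec = ProfileRecord(region='r1', zone='z1', node='proxy1',
                    process='proxy-server', package='swift.common',
                    module='utils', function='split_path', lineno=42,
                    timestamp=0.0, frequency=1000, duration=0.13,
                    memory_consumed=4096, objects_count=10)
print(rec.function, rec.frequency)
```

Records of this shape can then be sliced or rolled up along any dimension (per node, per module, per time interval) for the analyses on the later slides.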
  • 9. Profiling Tools Available or Needed
    - Profiling report and visualization open source tools (granularity: all layers)
      - CPU time / call graph: pstat, runsnake, kcachegrind
      - Memory: memstat?
      - Disk/network I/O: iostat?
      - Aggregate / slice / drill-down: needed
    - Profiling open source hooks
      - CPU time / call graph: repoze.profile (process level); profile, cProfile, hotshot (package/module/function/code line)
      - Memory: objgraph (process level); memory_profiler (package/module/function/code line)
      - Disk/network I/O: eventlet_io_profiler? (needed)
  • 10. Profiling Middleware

        [pipeline:main]
        pipeline = profile … proxy-server

        [filter:profile]
        use = egg:swift#profile
        log_filename_prefix = /opt/stack/data/swift/profile/pn1/proxy.profile
        dump_interval = 5
        dump_timestamp = false
        discard_first_request = true
        path = /__profile__
        flush_at_shutdown = false
        unwind = false
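A paste `[filter:profile]` section like the one above would typically be parsed by a filter factory. The sketch below is hypothetical, not the real swift#profile factory; `config_true_value` mimics Swift's helper of the same name, and the factory returns the parsed settings just to keep the sketch self-contained:

```python
def config_true_value(value):
    # Mirrors the spirit of Swift's config_true_value helper (assumption).
    return str(value).lower() in ('true', '1', 'yes', 'on', 't', 'y')

def filter_factory(global_conf, **local_conf):
    conf = dict(global_conf, **local_conf)
    parsed = {
        'log_filename_prefix': conf.get('log_filename_prefix',
                                        '/tmp/profile'),
        'dump_interval': float(conf.get('dump_interval', 5.0)),
        'dump_timestamp': config_true_value(conf.get('dump_timestamp',
                                                     False)),
        'discard_first_request': config_true_value(
            conf.get('discard_first_request', True)),
        'path': conf.get('path', '/__profile__'),
        'flush_at_shutdown': config_true_value(conf.get('flush_at_shutdown',
                                                        False)),
        'unwind': config_true_value(conf.get('unwind', False)),
    }
    # A real factory would return something like
    # `lambda app: ProfileMiddleware(app, **parsed)`.
    return parsed

settings = filter_factory({}, dump_interval='5', dump_timestamp='false')
print(settings['dump_interval'], settings['path'])
```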
  • 11. Performance Overhead for Profiling Middleware

        Node                 Memory   Workers   Replicas
        CosBench Controller  3GB      -         -
        CosBench Driver1     3GB      120       -
        CosBench Driver2     3GB      120       -
        Proxy                31GB     24        3
        Account              35GB     24        3
        Container            35GB     24        3
        Object1              31GB     24        3
        Object2              31GB     24        3
  • 12. Profile Hook for Swift

    swift/common/

        from eventlet.green import profile   # eventlet-aware profiler (see slide 13)
        from memory_profiler import LineProfiler, show_results
        import linecache
        import inspect
        import time, io, sys, os

        def cpu_profiler(log_file, with_timestamp=False):
            def _outer_fn(func):
                def _inner_fn(*args, **kwargs):
                    ts = time.time()
                    fpath = ''.join([log_file, '-', str(ts)])
                    prof = profile.Profile()
                    pcall = prof.runcall(func, *args, **kwargs)
                    prof.dump_stats(fpath)   # dump profile data
                    return pcall
                return _inner_fn
            return _outer_fn

        def mem_profiler(log_file, with_timestamp=False):
            def _outer_fn(func):
                def _inner_fn(*args, **kwargs):
                    ts = time.time()
                    prof = LineProfiler()
                    val = prof(func)(*args, **kwargs)
                    fpath = ''.join([log_file, '-', str(ts)])
                    astream =, 'w')
                    show_results(prof, astream, precision=3)   # dump profile data
                    astream.flush()
                    return val
                return _inner_fn
            return _outer_fn

    swift/swift/proxy/ — import the CPU and memory profilers:

        from swift.common.profile import cpu_profiler, mem_profiler

        @cpu_profiler('/opt/stack/data/swift/profile/proxy.cprofile')
        def __call__(self, env, start_response):
            ...

        @mem_profiler('/opt/stack/data/swift/profile/proxy.mprofile')
        def handle_request(self, req):
            ...

    swift/swift/container/

        from swift.common.profile import cpu_profiler, mem_profiler

        @cpu_profiler('/opt/stack/data/swift/profile/container.cprofile')
        def __call__(self, env, start_response):
            ...

    Profile data on disk:

        openstack@openstackvm:/opt/stack/data/swift/profile$ ll
        total 188
        drwxrwxr-x 2 openstack openstack   4096 Jul 18 16:35 ./
        drwxr-xr-x 7 openstack openstack   4096 Jul 18 15:17 ../
        -rw-r--r-- 1 openstack openstack 105502 Jul 18 16:35 proxy.cprofile
        -rw-r--r-- 1 openstack openstack   1391 Jul 18 16:35 proxy.mprofile
        -rw-r--r-- 1 openstack openstack   7195 Jul 18 16:35 container.cprofile
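A stdlib-only variant of the `cpu_profiler` decorator (using `cProfile` instead of the eventlet-aware profiler, so it runs anywhere) shows the dump-then-analyze round trip; the demo `handle_request` function and the `last_dump` attribute are illustrative additions, not part of the slide's hook:

```python
import cProfile
import os
import pstats
import tempfile
import time

def cpu_profiler(log_file):
    # Decorator factory: profile each call of func and dump the stats to
    # log_file-<timestamp>, mirroring the hook shown on the slide.
    def _outer_fn(func):
        def _inner_fn(*args, **kwargs):
            fpath = '%s-%s' % (log_file, time.time())
            prof = cProfile.Profile()
            result = prof.runcall(func, *args, **kwargs)
            prof.dump_stats(fpath)
            _inner_fn.last_dump = fpath  # remembered only for the demo below
            return result
        return _inner_fn
    return _outer_fn

tmpdir = tempfile.mkdtemp()

@cpu_profiler(os.path.join(tmpdir, 'demo.cprofile'))
def handle_request(n):
    # Stand-in for a server method worth profiling.
    return sum(i * i for i in range(n))

out = handle_request(1000)
stats = pstats.Stats(handle_request.last_dump)  # read the dump back
print(out)
```

Dumping one file per call keeps requests independent; pstats can later merge many dumps with `Stats.add()` for aggregate reports.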
  • 13. eventlet-aware profiling

        import sys
        import time

        import eventlet
        from eventlet.green import urllib2

        sys.path.append('./')
        from decorators import profile_eventlet

        def some_long_calculation(id):
            x = 0
            for i in xrange(1, 100000000):
                x = i + x / i
            print x

        def some_work(id):
            print('start')
            eventlet.sleep(0)   # yield to the hub
            print('end')

        @profile_eventlet('./ep1.profile')
        def main():
            pile = eventlet.GreenPool(1000)
            pile.spawn(some_work, 1)
            #pile.spawn(some_long_calculation, 2)
            pile.waitall()

        if __name__ == '__main__':
            main()

    Output of the standard profile: ✖ (time spent in other green threads is wrongly charged to some_work)

        ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
             1    0.000    0.000    7.071    7.071
             1    7.070    7.070    7.070    7.070

    Output of the eventlet-aware profile: ✔

        ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
             1    0.000    0.000    0.000    0.000
             1    7.380    7.380    7.380    7.380

    Some prior art:
      •
      • 001094.html
  • 14. Profiling Analysis
    - Top-K statistics analysis through drill-down, roll-up, and slicing to identify hot code snippets or potential bottlenecks to be optimized
      - e.g. function call frequency and duration per node (sortable, filterable, with aggregation)
      - e.g. module call frequency and duration per node (sortable, filterable, with aggregation)
    - Linear or non-linear algorithm analysis to identify scalability problems
      - e.g. object read/write throughput at different workloads
    - Evolution analysis
      - e.g. capture profile data by time interval and compare
    - Code association analysis
      - e.g. call graph
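The Top-K drill-down idea can be sketched with plain aggregation over per-call records of (node, function, duration); the record values below are synthetic, made up purely for illustration:

```python
from collections import defaultdict

# Synthetic (node, function, duration-in-seconds) call records.
records = [
    ('proxy1', 'swift.common.utils.split_path', 0.002),
    ('proxy1', 'swift.common.utils.split_path', 0.003),
    ('proxy1', 'swift.proxy.server.__call__', 0.050),
    ('object1', 'swift.obj.server.PUT', 0.120),
    ('object1', 'swift.obj.server.PUT', 0.110),
    ('object1', 'swift.obj.server.GET', 0.040),
]

def top_k(records, k=2):
    # Roll records up to (node, function) -> [frequency, total duration],
    # then keep the k most frequently called functions per node.
    agg = defaultdict(lambda: defaultdict(lambda: [0, 0.0]))
    for node, func, duration in records:
        entry = agg[node][func]
        entry[0] += 1
        entry[1] += duration
    result = {}
    for node, funcs in agg.items():
        ranked = sorted(funcs.items(), key=lambda kv: kv[1][0], reverse=True)
        result[node] = [(f, freq, dur) for f, (freq, dur) in ranked[:k]]
    return result

report = top_k(records)
print(report['proxy1'][0][0])  # → swift.common.utils.split_path
```

Sorting by total duration instead of frequency gives the "where is the time consumed" view; the same roll-up generalizes to any dimension in the data model.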
  • 15. Profiling Report Tool – pstat2

        # python '../data/hybrid/object.*'
        % ?
        Documented commands (type help <topic>):
        ========================================
        EOF  add  callees  callers  dump  help  kcachegrind  list  quit
        rawdata  read  reverse  runsnake  sort  stats  strip  tojson
        % sort calls
        % stats swift 5
        3909969520 function calls (3495132609 primitive calls) in 77381.834 seconds
           Ordered by: call count
           List reduced from 526 to 110 due to restriction <'swift'>
           List reduced from 110 to 5 due to restriction <5>
           ncalls    tottime  percall  cumtime  percall  filename:lineno(function)
           54546321  130.314  0.000    220.887  0.000    /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/
           44597503   80.804  0.000    258.501  0.000    /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/
           17635615   25.768  0.000     34.190  0.000    /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/
           16130776   61.326  0.000     85.730  0.000    /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/
            9948818   19.429  0.000     62.618  0.000    /usr/local/lib/python2.7/dist-packages/swift-1.9.1-py2.7.egg/swift/common/
        % kcachegrind
  • 16. Profiling Visualization Tool – kcachegrind
  • 17. Profiling Visualization Tool – kcachegrind
    Call graph of the PUT function for the object server
  • 18. Example 1 – Profiling Analysis of File System Calls
    - POSIX call time consumption on the object server (1MB objects, 80% read / 20% write)
    [Pie chart: time of POSIX calls on the object server, covering posix.stat, posix.unlink,, posix.close,, posix.listdir, posix.getpid, posix.write, and posix.urandom; the largest slice accounts for 71.866 s (44%) of POSIX call time.]
  • 19. Example 2 – Profiling Analysis of SQLite DB Calls
    [Pie chart: time of DB calls on the account/container server, across (execute, _commit_puts, get_db_version, __init__, merge_items, chexor, and several <lambda> wrappers); the largest slice accounts for 711.755 s (64%) of DB call time.]