• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
Python profiling
 

Python profiling

on

  • 643 views

Python profiling article

Python profiling article
http://huangx.in/2012/9/python_profiling.html

Statistics

Views

Total Views
643
Views on SlideShare
643
Embed Views
0

Actions

Likes
3
Downloads
14
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Python profiling Python profiling Presentation Transcript

    • Python ProfilingDreampuf Sep. 2012http://huangx.in
    • Introducton 性能分析(Performance analysis 或 profiling)是收集 程序运行时信息为手段研究程序行为的分析方法。 -- Wikipedia
    • Introductondef qks(ls):    if not ls : return ls   p, ls = ls[0], ls[1:]    return qks([i for i in ls if i < p]) + [p] + qks([i for i in ls if i >= p])    if __name__ == "__main__":    import random   ls = [random.randint(0, 100) for i in xrange(10)]
    • Agendatime utility, time module, timeit, sys.getsizeofModule: profile, cProfile, hotshot, pystatsLine Profiler, Memory Profiler, ObjgraphRunSnakeRunIn ActionConclusion
    • 绿曼巴蛇,最快的蛇,每小时11公里。
    • time utilitydreampuf@HX:~/hub/fortest$ time python test_timeit.py0.143877983093real     0m0.184s  #实际总耗时user     0m0.166s  #应用耗时sys     0m0.015s  #应用期间执行系统指令耗时
    • time utility    dreampuf@HX:~/hub/fortest$ time sleep 5        real     0m5.036s    user     0m0.001s    sys     0m0.003s
    • time utility 系统级别耗时(⼀一般而言 real > user + sys) 应用耗时和系统调用耗时比率
    • time module    dreampuf@HX:~/hub/fortest$ python test_timeit.py    2.31266021729e-05    import time    def time_module():        start = time.time()        qks(ls)        return time.time() - start        COUNT = 100000    print sum([time_module() for i in xrange(COUNT)])/COUNT
    • time module平台精度灵活
    • time moduleclass TimerIt(object): def __enter__(self): self.start = time.time() def __exit__(self, *args): print time.time() - self.start with t: qks(ls*10) with t: qks(ls*100)
    • timeit    @Timer    def caller():        qks(ls)        print caller.timeit(10000)   #0.14427113533
    • timeitIn [1]: from timeit import timeitIn [2]: timeit(s.appendleft(3),   ...:        import collections; s= collections.deque(), number=1000000)Out[2]: 0.1150519847869873In [3]: timeit(s.insert(0, 3), s=[], number=1000000)Out[3]: 392.4638919830322
    • timeit 不同平台使用的time不同(Xinux 使用 `time.time()`, Windows 使用 `time.clock()`) python -m timeit -s "import collections; s = collections.deque()" -n 1000000 "s.appendleft(3)"
    • sys.getsizeof    import sys    def sys_getsizeof():        a = range(1000000)        b = xrange(1000000)        c = [i for i in xrange(1000000)]        print sys.getsizeof(a)            print sys.getsizeof(b)            print sys.getsizeof(c)    
    • sys.getsizeof    import sys    def sys_getsizeof():        a = range(1000000)        b = xrange(1000000)        c = [i for i in xrange(1000000)]        print sys.getsizeof(a)     #8000072        print sys.getsizeof(b)     #40        print sys.getsizeof(c)     #8697472
    • sys.getsizeof class Entity(object):     def __init__(self, name, age):         self.name = name         self.age = age es = [Entity("dreampuf", 100) for i in xrange(1000000)] print sys.getsizeof(es)     #8697472 print sum([sys.getsizeof(i) for i in es])   #64000000
    • sys.getsizeof 和实际有所出入(Python对象缓存池) 容器还是对象
    • 眼镜蛇王,聪明
    • Module: profile    from cProfile import Profile as profile    def profile_module():        p = profile()        p.enable()        qks(ls*100)        p.disable()        print p.getstats()
    • Module: profiledreampuf@HX:~/hub/fortest$ python test_timeit.py[_lsprof.profiler_entry(code="<method disable of_lsprof.Profiler objects>", callcount=1....
    • Module: profile    from cProfile import Profile as profile    from pstats import Stats    def profile_module():        p = profile()        p.snapshot_stats()        p.enable()        dirs(".")        p.disable()        p.print_stats(2)
    • Module: profile  dreampuf@HX:~/hub/fortest$ python test_timeit.py           135259 function calls in 0.071 seconds        Ordered by: cumulative time        ncalls  tottime  percall  cumtime  percall filename:lineno(function)          1    0.002    0.002    0.071    0.071 test_timeit.py:101(dirs)         22    0.005    0.000    0.068    0.003 test_timeit.py:74(main)      10519    0.008    0.000    0.047    0.000 ast.py:203(walk)      10498    0.004    0.000    0.035    0.000 {method extend of c...      20975    0.015    0.000    0.031    0.000 ast.py:173(iter_child_nodes)         21    0.000    0.000    0.014    0.001 ast.py:32(parse)         21    0.014    0.001    0.014    0.001 {compile}      26234    0.009    0.000    0.011    0.000 ast.py:161(iter_fields)      39368    0.008    0.000    0.008    0.000 {isinstance} .....
    • Module: profile行级别调用信息(时间,次数)可以保存执行结果友好的输出信息
    • profile vs. cProfile vs. hotshot 实现 开销 维护 平台 Python profile Python 高 持续 Like Python C cProfile 中 持续 CPython API hotshot C module 低 停止 CPython
    • Line Profiler    from line_profiler import LineProfiler    def line_profiler():        p = LineProfiler()        dirs_lineprofiler = p(dirs)        dirs_lineprofiler(".")        p.print_stats()
    • Line Profiler  dreampuf@HX:~/hub/fortest$ python test_timeit.py  Timer unit: 1e-06 s  File: test_timeit.py  Function: dirs at line 101  Total time: 0.108716 s   Line #      Hits         Time  Per Hit   % Time  Line Contents  ==============================================================     101                                           def dirs(dirpath):     102         1         1129   1129.0      1.0      for current, ds, fs in os.walk(dirpath):     103        49           56      1.1      0.1          for f in fs:     104        48           50      1.0      0.0              if f[-3:] not in (tml, .py):     105        26           16      0.6      0.0                  continue     106        22       107461   4884.6     98.8              main(os.path.join(current, f))     107         1            4      4.0      0.0          break
    • Line Profiler 直观 准确(Python C API 实现的trace)
    • Memory Profiler    dreampuf@HX:~/hub/fortest$ python test_timeit.py    Filename: test_timeit.py        Line #    Mem usage    Increment   Line Contents    ================================================       101      8.68 MB      0.00 MB   def dirs(dirpath):       102      8.68 MB      0.01 MB       for current, ds, fs in os.walk(dirpath):       103     11.59 MB      2.91 MB           for f in fs:       104     11.59 MB      0.00 MB               if f[-3:] not in (tml, .py):       105     11.59 MB      0.00 MB                   continue       106     11.59 MB      0.00 MB               main(os.path.join(current, f))       107     11.59 MB      0.00 MB           break
    • Objgraph    import objgraph    def objgraph_profile():        c = (i for i in xrange(1000000))        objgraph.show_refs([c],filename=simple-c.png)
    • Objgraph a = range(1000000)
    • Objgraph b = xrange(1000000)
    • Objgraph c = (i for i in xrange(1000000))
    • Objgraph c = (i for i in xrange(1000000))
    • Objgraph es = [Entity("dreampuf", 100) for i in xrange(1000)]
    • Objgraph es = [Entity("dreampuf", 100) for i in xrange(1000)]
    • Run Snake Run
    • In Action    File: /home/dreampuf/hub/guokr/dev_guokr/nutshell/tag/views_tag.py    Function: tag_index at line 50    Total time: 0.144165 s         Line #      Hits         Time  Per Hit   % Time  Line Contents    ==============================================================        50                                           @profile.profile        51                                           def tag_index(req, tag_name):                  73                                               #标签页问答动态        74         1         4536   4536.0      3.1      question_list, page_next = qa_question_list(tag_name,TAG_ACTIVITY, page_len*(page_num-1), page_len)        75         2         4237   2118.5      2.9      activity_info_list = [qa_item(question_id) forquestion_id in question_list]        79                                               #取出100条做合并,不⼀一定能合并出5个        80         1          544    544.0      0.4      tag_hot_answer_id = cacheutil.mccallcache(600)(pgrpc.backend.get_tag_hot_qa)(tag_name, count=20)        81         1        23148  23148.0     16.1      tag_hot_answer = contents.get_list(tag_hot_answer_id)        82                                               #tag_hot_qa_id = list(set([i.related_content_id for iin tag_hot_answer]))        93                                               #标签页侧边栏        94         1        63515  63515.0     44.1      side = _tag_side(tag_name)
    • In Action    Filename: /home/dreampuf/hub/guokr/dev_guokr/nutshell/tag/views_tag.py       Line #    Mem usage    Increment   Line Contents    ================================================        50                             @profile.mem_profile        51                             def tag_index(req, tag_name):        55     26.09 MB      1.68 MB       if req.method != GET:        56                                     raise Http404        73                                 #标签页问答动态        74     25.26 MB     -0.82 MB       question_list, page_next = qa_question_list(tag_name, TAG_ACTIVITY,page_len*(page_num-1), page_len)        75     26.09 MB      0.82 MB       activity_info_list = [qa_item(question_id) for question_id inquestion_list]               83                                 #有序且不重复        84     24.84 MB     -1.25 MB       tag_hot_qa_id = []        85     26.09 MB      1.25 MB       for i in tag_hot_answer:
    • In Action    File: /home/dreampuf/hub/guokr/dev_guokr/nutshell/api/views_tag.py    Function: tag_logo at line 29    Total time: 0.32325 s       Line #      Hits         Time  Per Hit   % Time  Line Contents    ==============================================================        31                                           @profile.profile        32                                           def tag_logo(req, tag_name):        33         1           13     13.0      0.0      ukey = req.session.get(ukey)        36         1        32441  32441.0     10.0      tag_info = services.minerva.get_tag(tag=tag_name)               46         1        17087  17087.0      5.3      form = TagLogoForm(req.POST, req.FILES)        47         1          274    274.0      0.1      if not form.is_valid():        48                                                   return sessauth.api_error(2,u参数格式错误)               51         1         8438   8438.0      2.6      data,request_id, filename = get_upload_file(req,FILE_UPLOAD_USER_MAXSIZE)               55         1            5      5.0      0.0      try:        56         1       104795 104795.0     32.4          gkimage = imageutil.GKImage(data).resize((48,48))        57         1          395    395.0      0.1          new_data = gkimage.str_value()        58                                               except Exception, ex:        59                                                   print ex
    • In Action    Filename: /home/dreampuf/hub/guokr/dev_guokr/nutshell/api/views_tag.py    Line #    Mem usage    Increment   Line Contents    ================================================        31                             @profile.mem_profile        32     25.62 MB      0.00 MB   def tag_logo(req, tag_name):        33     25.62 MB      0.01 MB       ukey = req.session.get(ukey)        46     26.31 MB      0.68 MB       form = TagLogoForm(req.POST, req.FILES)        47     26.31 MB      0.00 MB       if not form.is_valid():        48                                     return sessauth.api_error(2,u参数格式错误)                           50     26.31 MB      0.00 MB       try:        51     26.95 MB      0.64 MB           data,request_id, filename = get_upload_file(req,FILE_UPLOAD_USER_MAXSIZE)        52                                 except UploadError,ex:        53                                     return sessauth.api_error(ex.errnum,ex.errmsg)        55     26.95 MB      0.00 MB       try:        56     27.11 MB      0.16 MB           gkimage = imageutil.GKImage(data).resize((48, 48))        57     27.11 MB      0.00 MB           new_data = gkimage.str_value()        58                                 except Exception, ex:        59                                     print ex        79     27.29 MB      0.18 MB       pgrpc.cache.kill(tag, tag_name)        87     26.66 MB     -0.64 MB       context: tag_info[logo],        89     26.68 MB      0.02 MB       services.juno.audit(**data)
    • How to 定位问题,缓存   优化算法,使用trick   理清逻辑,缓存,分时计算  
    • Conclusion理解你的程序不要过度抽象使用合适的Entity使用`__slots__`避免重复引用不常见情况使用异常,避免对常见情况使用异常推崇函数式编程和迭代,使用装饰器和元类
    • Reference- [究竟是什么使Linux滴答地响](http://article.yeeyan.org/view/20180/5371)- [The Python Profilers](http://docs.python.org/library/profile.html)- [Profiling threads](http://code.activestate.com/recipes/465831-profiling-threads/)- [High Performance Python tutorial](http://ianozsvald.com/HighPerformancePythonfromTrainingatEuroPython2011_v0.2.pdf)- [RunSnakeRun](http://www.vrplumber.com/programming/runsnakerun/) 图形性能分析工具- [用profile协助程序性能优化](http://blog.csdn.net/lanphaday/article/details/1483728)- [profile,cProfile, and stats - Performance analysis of Python programs](http://www.doughellmann.com/PyMOTW/profile/)- [Line Profiler](http://packages.python.org/line_profiler/)- [Memory Profiler](https://github.com/fabianp/memory_profiler)- [Objgraph](http://mg.pov.lt/objgraph/)- [A guide to analyzing Python performance](http://www.huyng.com/posts/python-performance-analysis/)- [性能分析](http://zh.wikipedia.org/wiki/性能分析)