Writing Fast Code - PyCon HK 2015

Do profile, please.

Published in: Engineering

  1. 1. Writing Fast Code PyCon HK 2015 iam@younggun.kim
  2. 2. Younggun, Kim http://younggun.kim @scari_net scari
  3. 3. Badass Alien @ District 9, SMARTSTUDY http://pengpenghu.com PyCon Korea Organizer http://pycon.kr @pyconkr PyCon APAC 2016 Host (2016/Aug/13-15)
  4. 4. How I Think My Code Runs (Movie: The Good, the Bad, the Weird, 2008)
  5. 5. How My Code Really Runs (The Killers: All These Things That I’ve Done, M/V)
 https://youtu.be/sZTpLvsYYHw
  6. 6. Objective
  7. 7. 1. Understanding How a Computer Works
  8. 8. 2. How to Use a Profiler
  9. 9. But why?
  10. 10. Say thousands of people use your code every day. If you shave 1 second off each run, you save the human race over 4 days of wasted time per year.
  11. 11. See how a computer works, and how fast the computer and its peripherals are
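The arithmetic behind that claim can be checked quickly. This is only a sketch: the slide says "thousands", so the figure of 1,000 users who each run the code once a day is an illustrative assumption, and `days_saved_per_year` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_saved_per_year(users, runs_per_day, seconds_saved_per_run):
    """Total time saved across all users in one year, expressed in days."""
    total_seconds = users * runs_per_day * 365 * seconds_saved_per_run
    return total_seconds / SECONDS_PER_DAY


# Assumed workload: 1,000 users, one run per user per day, 1 second saved per run.
saved = days_saved_per_year(users=1000, runs_per_day=1, seconds_saved_per_run=1)
print(f"{saved:.2f} days saved per year")
```

With more users or more runs per day the total grows proportionally.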
  12. 12. I/O >> 4D Wall >> Memory
  13. 13. Morse Code ≈ 21 bps; Modem (2400) ≈ 2400 bps; CDMA (2G) ≈ 153 kbit/s; HSPA (3G, DL) ≈ 13.98 Mbit/s; LTE* ≈ 100 Mbit/s; USB 2.0 ≈ 480 Mbit/s; 802.11n ≈ 600 Mbit/s; USB 3.0 ≈ 3 Gbit/s; SATA 3.0 ≈ 6 Gbit/s; Thunderbolt 2 ≈ 20 Gbit/s; DDR2 1066MHz ≈ 64 Gbit/s; DDR3 1600MHz ≈ 102.4 Gbit/s https://en.wikipedia.org/wiki/List_of_device_bit_rates
  14. 14. Yes! Memory is blazing fast! (Really?)
  15. 15. DDR3 1600MHz ≈ 12.8 GB/s; FSB 400 (old Xeon) ≈ 12.8 GB/s; PCI Express 3.0 (x16) ≈ 16 GB/s; QuickPath Interconnect ≈ 38.4 GB/s; HyperTransport 3.1 ≈ 51.2 GB/s; L3 Cache (i7-4790X) ≈ 170 GB/s; L2 Cache (i7-4790X) ≈ 308 GB/s. Nope!
  16. 16. A Computer Knows Only 0 and 1
  17. 17. 00100000001000100000000101011110 Like This
  18. 18. 001000 00001 00010 0000000101011110 (opcode | addr 1 | addr 2 | value): MIPS32 Add Immediate instruction (ADDI). addi $r1, $r2, 350 means $r1 = $r2 + 350
  19. 19. A Computer Executes These Instructions on a Per-Clock Basis
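To make the field split concrete, the 32-bit word can be decoded with shifts and masks. This is a minimal sketch assuming the standard MIPS I-type layout (6-bit opcode, two 5-bit register fields, 16-bit immediate). The names `decode_i_type`, `rs`, and `rt` are illustrative and do not come from the slides.

```python
WORD = "00100000001000100000000101011110"


def decode_i_type(bits):
    """Split a 32-bit MIPS I-type instruction into its four fields."""
    word = int(bits, 2)
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "rs": (word >> 21) & 0x1F,      # first 5-bit register field ("addr 1")
        "rt": (word >> 16) & 0x1F,      # second 5-bit register field ("addr 2")
        "imm": word & 0xFFFF,           # low 16 bits ("value")
    }


fields = decode_i_type(WORD)
print(fields)
```

In the standard encoding the first register field is the source (`rs`) and the second is the destination (`rt`), so the slide's register naming may be a simplification.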
  20. 20. Clock (Hz)
  21. 21. 1Hz
  22. 22. 1Hz: L1 Cache Access: 3s; L2 Cache Access: 9s; L3 Cache Access: 43s; RAM Access: 6m; SSD I/O: 2-6 days; HDD I/O: 1-12 months; Internet (Tokyo to SF): 12 years; Run IPython (0.6s): 63 years; Reboot (5m): 32,000 years!!
  23. 23. dis module: disassemble Python code into CPython bytecode to support analysis https://docs.python.org/3/library/dis.html
 https://github.com/python/cpython/blob/master/Include/opcode.h
  24. 24. dis output columns: line # of source | op addr / instruction | param | annotations
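A small example shows what those columns look like in practice. This sketch disassembles a hypothetical function, `add_350`, chosen only for illustration. The exact opcodes printed depend on the CPython version.

```python
import dis


def add_350(x):
    return x + 350


# Human-readable listing: source line, bytecode offset, opcode name,
# numeric argument, and (in parentheses) the interpreted argument.
dis.dis(add_350)

# The same information as structured data.
instructions = [
    (ins.offset, ins.opname, ins.argrepr)
    for ins in dis.get_instructions(add_350)
]
for offset, opname, argrepr in instructions:
    print(offset, opname, argrepr)
```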
  25. 25. An Empty List Creation [] vs list()
  26. 26. Dictionary {} vs dict()
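The two comparisons above can be reproduced with `dis` and `timeit`. This is a sketch: `opnames` is a hypothetical helper, the iteration count is arbitrary, and timings vary by machine and CPython version. The literal forms compile to a single build instruction, while `list()` and `dict()` require a name lookup followed by a call.

```python
import dis
import timeit


def opnames(expr):
    """Return the bytecode instruction names CPython generates for an expression."""
    code = compile(expr, "<expr>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


for expr in ("[]", "list()", "{}", "dict()"):
    seconds = timeit.timeit(expr, number=200_000)
    print(f"{expr:8} {seconds:.4f}s  {opnames(expr)}")
```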
  27. 27. Find an element in a list using for-loop vs in
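The same kind of measurement works for the membership test. This sketch uses hypothetical functions `contains_loop` and `contains_in`. The list size and repeat count are arbitrary assumptions, and the absolute timings depend on the machine.

```python
import timeit


def contains_loop(items, target):
    """Linear search written as an explicit Python-level loop."""
    for item in items:
        if item == target:
            return True
    return False


def contains_in(items, target):
    """The same linear search, but executed inside the interpreter by `in`."""
    return target in items


data = list(range(10_000))
target = 9_999  # worst case: the last element

for func in (contains_loop, contains_in):
    seconds = timeit.timeit(lambda: func(data, target), number=200)
    print(f"{func.__name__:14} {seconds:.4f}s")
```

Both versions are O(n); any difference comes from how much work happens in interpreted bytecode.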
  28. 28. Profilers: tools for dynamic program analysis that measure the space or time complexity of a program.
  29. 29. • cProfile (profile) • hotshot • line_profiler • memory_profiler • yappi • profiling • pyinstrument • plop • pprofile
  30. 30. cProfile • built-in profiling tool • hooks into the CPython VM • introduces a little overhead https://docs.python.org/3.5/library/profile.html
  31. 31. cProfile python -m cProfile python_code.py
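Besides the command line, cProfile can be driven from code, which helps when only one call is of interest. This is a minimal sketch, and `slow_sum` is a hypothetical workload invented for the example.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so that it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 200_000)  # profile just this call

# Format the collected statistics, sorted by cumulative time, top 5 entries.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```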
  32. 32. line_profiler • profiles on a line-by-line basis • uses a decorator to mark the chosen function (@profile) • introduces greater overhead https://github.com/rkern/line_profiler
  33. 33. profiling • interactive Python profiler inspired by the Unity3D Profiler • keeps the call stack • live profiling • supports Linux only https://github.com/what-studio/profiling
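A script prepared for line_profiler might look like the sketch below. When run through `kernprof -l`, the `profile` decorator is injected for you. The fallback definition is a common convenience so the same file also runs under plain Python. `count_evens` is a hypothetical function used only for illustration.

```python
try:
    profile  # provided by kernprof when the script is run with -l
except NameError:
    def profile(func):
        """No-op stand-in so the script also runs without kernprof."""
        return func


@profile
def count_evens(limit):
    count = 0
    for n in range(limit):
        if n % 2 == 0:
            count += 1
    return count


print(count_evens(1_000))
```

Running this file with `kernprof -l -v` would then report time per line inside `count_evens`.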
  34. 34. https://github.com/sublee/pyconkr2015-profiling-resources/blob/master/continuous.gif
  35. 35. fibona Use a profiler with real code
  36. 36. fibona Korean fried chicken is served as a whole chicken (not in pieces), and it’s quite complex to determine how many chickens would be enough for N people.
  37. 37. fibona The problem can be solved easily using Fibonacci numbers: 1 1 2 3 5 8 13 21 34 … For the Nth Fibonacci number of people, the (N-1)th Fibonacci number of chickens would be perfect.
  38. 38. fibona Awesome idea! But how do you get enough chicken if the number of people is not a Fibonacci number?
  39. 39. fibona Apply Zeckendorf’s theorem, which is about representing integers as sums of Fibonacci numbers
  40. 40. https://en.wikipedia.org/wiki/Zeckendorf's_theorem
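The chicken rule combined with Zeckendorf's theorem can be sketched as follows. This is not the code from the talk: the function names are hypothetical, the greedy decomposition is one straightforward way to obtain the Zeckendorf representation, and the sketch assumes that a single person gets one chicken.

```python
def fib_pairs(limit):
    """Yield (previous, current) Fibonacci pairs with current <= limit.

    Starts at (1, 1), so the `current` values are 1, 2, 3, 5, 8, ...
    """
    prev, cur = 1, 1
    while cur <= limit:
        yield prev, cur
        prev, cur = cur, prev + cur


def zeckendorf(n):
    """Greedily split n into non-consecutive Fibonacci numbers.

    Returns a list of (previous, current) pairs, largest first.
    """
    parts = []
    for prev, cur in reversed(list(fib_pairs(n))):
        if cur <= n:
            parts.append((prev, cur))
            n -= cur
    return parts


def chickens_for(people):
    """For each Fibonacci-sized group, order the previous Fibonacci number."""
    return sum(prev for prev, _ in zeckendorf(people))


for people in (4, 10, 13, 100):
    print(people, [cur for _, cur in zeckendorf(people)], chickens_for(people))
```

For example, 10 people split into groups of 8 and 2, which need 5 + 1 = 6 chickens.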
  41. 41. KEEP CALM AND USE THE PROFILER
  42. 42. cProfile python -m cProfile fibonachicken.py
  43. 43. cProfile
  44. 44. line_profiler
  45. 45. line_profiler kernprof -l -v fibonachicken.py
  46. 46. line_profiler
  47. 47. line_profiler
  48. 48. line_profiler
  49. 49. line_profiler
  50. 50. Both fib() and is_fibonacci() are bottlenecks. They should be replaced with better implementations.
  51. 51. Hypothesis #1: Improving fib() could result in better performance
  52. 52. Binet's Formula https://en.wikipedia.org/wiki/Jacques_Philippe_Marie_Binet
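Binet's formula gives the nth Fibonacci number in closed form, F(n) = (φ^n − ψ^n) / √5 with φ = (1 + √5) / 2 and ψ = (1 − √5) / 2. Because |ψ| < 1, rounding φ^n / √5 is enough. The sketch below is an assumption about how such a `fib()` could look, not the talk's actual code, and it is compared against a simple iterative version. With floating-point arithmetic the result stops being exact for large n.

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2  # golden ratio


def fib_binet(n):
    """nth Fibonacci number via the rounded closed form (floating point)."""
    return round(PHI ** n / SQRT5)


def fib_iterative(n):
    """Reference implementation using exact integer arithmetic."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


for n in (1, 2, 10, 30):
    print(n, fib_binet(n), fib_iterative(n))
```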
  53. 53. cProfile
  54. 54. Hypothesis #2: Can we improve is_fibonacci() so that it does not use fib() at all?
  55. 55. n is a Fibonacci number if and only if 5n*n+4 or 5n*n-4 is a perfect square (Gessel's Formula) http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fibFormula.html
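That test translates almost directly into code. This is a sketch rather than the talk's implementation. It assumes Python 3.8 or later for `math.isqrt`, which keeps the square check exact for large integers, and `is_perfect_square` is a hypothetical helper.

```python
import math


def is_perfect_square(x):
    """True if x is a non-negative integer with an exact integer square root."""
    if x < 0:
        return False
    root = math.isqrt(x)
    return root * root == x


def is_fibonacci(n):
    """Gessel's test: n is Fibonacci iff 5n^2 + 4 or 5n^2 - 4 is a perfect square."""
    return is_perfect_square(5 * n * n + 4) or is_perfect_square(5 * n * n - 4)


print([n for n in range(1, 60) if is_fibonacci(n)])
```

No Fibonacci numbers are generated at all, so the cost no longer depends on `fib()`.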
  56. 56. cProfile
  57. 57. Summary: Consider the efficiency of your code, along with peripherals and the circumstances around you. Form a hypothesis and confirm it (using good profilers).
  58. 58. Q&A
  59. 59. Thanks!