
Concurrency in Python

This talk is mainly about multithreading and multiprocessing in Python, *in Python's flavor*.

It was also given as a talk at Taipei.py [1].

[1] http://www.meetup.com/Taipei-py/events/220452029/

Published in: Software

  1. 1. CONCURRENCY IN PYTHON MOSKY
  2. 2. MULTITHREADING & MULTIPROCESSING IN PYTHON MOSKY
  3. 3. MOSKY PYTHON CHARMER @ PINKOI MOSKY.TW
  8. 8. OUTLINE • Introduction • Producer-Consumer Pattern • Python’s Flavor • Misc. Techniques
  9. 9. INTRODUCTION
  13. 13. MULTITHREADING • GIL (global interpreter lock) • Only one thread runs Python bytecode at any given time. • It can still improve IO-bound problems.
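
The last bullet can be sketched as follows. This is a minimal illustration that is not from the slides: `fake_io` is a hypothetical stand-in that uses `time.sleep` to simulate a blocking IO call, relying on the fact that a sleeping or IO-waiting thread does not hold the GIL.

```python
import threading
import time


def fake_io(delay=0.2):
    """Hypothetical stand-in for a blocking IO call (network, disk, ...)."""
    time.sleep(delay)


def run_sequential(n):
    """Run n fake IO calls one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fake_io()
    return time.perf_counter() - start


def run_threaded(n):
    """Run n fake IO calls in n threads and return the elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

The threaded run should take roughly one delay instead of `n` delays, because the waits overlap. A CPU-bound function would not be expected to gain in the same way under the GIL.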
  18. 18. MULTIPROCESSING • It uses fork. • Processes can run at the same time. • It uses more memory. • Note the initial (startup) cost.
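
A small sketch of what "it uses fork" implies, assuming a Unix-like system where the `fork` start method is available. The names `data`, `child` and `run` are made up for this illustration. The child starts with a copy of the parent's memory, so a change made in the child is not visible in the parent.

```python
import multiprocessing as mp

data = {"value": 0}  # module-level state, copied into the child by fork


def child(conn):
    data["value"] = 42          # changes only the child's copy
    conn.send(data["value"])    # report what the child sees
    conn.close()


def run():
    # Assumption: "fork" is available (Linux and other Unix-like systems).
    ctx = mp.get_context("fork")
    parent_end, child_end = ctx.Pipe()
    proc = ctx.Process(target=child, args=(child_end,))
    proc.start()
    seen_in_child = parent_end.recv()
    proc.join()
    return seen_in_child, data["value"]
```

Here `run()` returns the value seen in the child together with the unchanged value in the parent. That separation is why processes avoid many shared-state problems, and also why they cost more memory.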
  23. 23. IS IT HARD? • Avoid shared resources. • e.g., variables or shared memory, files, connections, … • Understand Python’s flavor. • Then it will be easy.
  28. 28. SHARED RESOURCE • Race condition:
 T1: RW
 T2: RW
 T1+T2: RRWW
 • Use a lock → thread-safe:
 T1+T2: (RW) (RW)
 • But locks hurt performance and can cause deadlock.
 • That is the hard part.
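
The "(RW) (RW)" ordering above can be sketched with `threading.Lock`. The `Counter` class and the `hammer` helper are hypothetical names for this illustration. The lock makes each read-modify-write step run as one unit.

```python
import threading


class Counter:
    """A shared counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read and write happen together while the lock is held: (RW)
        with self._lock:
            self.value += 1


def hammer(counter, threads=4, per_thread=10_000):
    """Increment the counter from several threads and return the final value."""

    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value
```

With the lock in place, `hammer(Counter())` always yields `threads * per_thread`. Without it, interleaved reads and writes (RRWW) may lose updates.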
  31. 31. DIAGNOSE PROBLEM • Where is the bottleneck? • Divide your problem.
  32. 32. PRODUCER-CONSUMER PATTERN
  37. 37. PRODUCER-CONSUMER PATTERN • A queue • Producers → A queue • A queue → Consumers • Python has a built-in Queue module for it (named queue in Python 3).
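
A minimal sketch of the pattern with the Python 3 `queue` module. The `STOP` sentinel and the function names are assumptions of this example, not something the slides prescribe.

```python
import queue
import threading

STOP = object()  # sentinel that tells a consumer to exit (assumption of this sketch)


def producer(tasks, items):
    """Producers → a queue."""
    for item in items:
        tasks.put(item)


def consumer(tasks, results):
    """A queue → consumers: square each item until the sentinel arrives."""
    while True:
        item = tasks.get()
        if item is STOP:
            break
        results.put(item * item)


def run(items, n_consumers=3):
    items = list(items)
    tasks, results = queue.Queue(), queue.Queue()
    consumers = [
        threading.Thread(target=consumer, args=(tasks, results))
        for _ in range(n_consumers)
    ]
    for c in consumers:
        c.start()
    producer(tasks, items)
    for _ in consumers:          # one sentinel per consumer
        tasks.put(STOP)
    for c in consumers:
        c.join()
    return sorted(results.get() for _ in items)
```

The queue is the only object the threads share, and `queue.Queue` does its own locking, so the worker code needs no explicit lock.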
  38. 38. EXAMPLES • https://docs.python.org/2/library/queue.html#queue-objects • https://github.com/moskytw/mrbus/blob/master/mrbus/base/pool.py
  42. 42. WHY .TASK_DONE? • It’s for .join. • When the counter of unfinished tasks reaches zero, it notifies the threads that are waiting in .join. • It’s implemented with threading.Condition.
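
The counter mechanism described above can be modelled in a few lines. `TaskCounter` is a toy written for this illustration, not the real `Queue` code. It only mirrors the idea of a counter guarded by a `threading.Condition`.

```python
import threading


class TaskCounter:
    """Toy model of the unfinished-task counter behind .task_done and .join."""

    def __init__(self):
        self._cond = threading.Condition()
        self._unfinished = 0

    def add(self):
        # What a .put does to the counter.
        with self._cond:
            self._unfinished += 1

    def task_done(self):
        with self._cond:
            if self._unfinished <= 0:
                raise ValueError("task_done() called too many times")
            self._unfinished -= 1
            if self._unfinished == 0:
                self._cond.notify_all()   # wake every thread blocked in join()

    def join(self):
        with self._cond:
            while self._unfinished:
                self._cond.wait()


def demo(n_tasks=3):
    """Return how many tasks had finished by the time join() returned."""
    counter = TaskCounter()
    finished = []
    for _ in range(n_tasks):
        counter.add()

    def work():
        for i in range(n_tasks):
            finished.append(i)
            counter.task_done()

    t = threading.Thread(target=work)
    t.start()
    counter.join()            # returns only after the last task_done()
    count = len(finished)
    t.join()
    return count
```

Forgetting to call `task_done` once per item leaves the counter above zero, and `join` then waits forever.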
  46. 46. THE THREADING MODULE • Lock: primitive lock with .acquire / .release • RLock: the owning thread can re-enter • Semaphore: blocks when the counter reaches zero
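
Two of these primitives in a short sketch. The helper names `reenter` and `peak_concurrency` are hypothetical. The first shows that an `RLock` can be acquired again by the thread that already holds it. The second uses a `Semaphore` to cap how many threads run a section at once.

```python
import threading
import time


def reenter():
    """The owner of an RLock may acquire it again; a plain Lock would block here."""
    rlock = threading.RLock()
    with rlock:
        with rlock:
            return True


def peak_concurrency(n_tasks=6, limit=2):
    """Run n_tasks threads and return the highest number inside the section at once."""
    slots = threading.Semaphore(limit)
    guard = threading.Lock()
    state = {"active": 0, "peak": 0}

    def task():
        with slots:  # blocks while `limit` threads already hold a slot
            with guard:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)
            with guard:
                state["active"] -= 1

    threads = [threading.Thread(target=task) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]
```

The peak never exceeds `limit`, which is the typical use of a semaphore: bounding access to a scarce resource such as a connection pool.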
  50. 50. • Condition: .wait for .notify / .notify_all • Event: .wait for .set; a simplified Condition • with lock: …
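
A brief sketch of both waiting primitives, with made-up function names. `Event` is enough for a one-off "go" signal. `Condition` is used when the waiter must re-check some shared state under a lock.

```python
import threading


def run_with_event():
    """The waiter blocks in .wait until the main thread calls .set."""
    ready = threading.Event()
    log = []

    def waiter():
        ready.wait()
        log.append("waiter ran")

    t = threading.Thread(target=waiter)
    t.start()
    log.append("main sets the event")
    ready.set()
    t.join()
    return log


def run_with_condition():
    """The consumer waits on a Condition until the box has an item."""
    cond = threading.Condition()
    box, out = [], []

    def consumer():
        with cond:                 # `with` acquires and releases the underlying lock
            while not box:         # re-check the state after every wake-up
                cond.wait()
            out.append(box.pop())

    t = threading.Thread(target=consumer)
    t.start()
    with cond:
        box.append(42)
        cond.notify()
    t.join()
    return out
```

The `while not box` loop matters: the consumer may arrive after the notification was sent, or be woken for another reason, so it checks the state rather than trusting the wake-up.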
  55. 55. THE MULTIPROCESSING MODULE • .Process • .JoinableQueue • .Pool • …
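
A hedged sketch that combines `.Process` and `.JoinableQueue`, assuming a Unix-like system with the `fork` start method. The worker function, the `None` sentinel and `square_all` are illustrative choices, not part of the talk.

```python
import multiprocessing as mp


def worker(tasks, results):
    """Square numbers from the task queue until the None sentinel arrives."""
    while True:
        n = tasks.get()
        try:
            if n is None:            # sentinel: no more work
                break
            results.put(n * n)
        finally:
            tasks.task_done()        # runs for every item, including the sentinel


def square_all(numbers, n_workers=2):
    numbers = list(numbers)
    ctx = mp.get_context("fork")     # assumption: "fork" is available
    tasks, results = ctx.JoinableQueue(), ctx.Queue()
    procs = [
        ctx.Process(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for p in procs:
        p.start()
    for n in numbers:
        tasks.put(n)
    for _ in procs:
        tasks.put(None)
    tasks.join()                     # every put now has a matching task_done
    out = sorted(results.get(timeout=10) for _ in numbers)
    for p in procs:
        p.join()
    return out
```

The results are read before the processes are joined, so a child is never left blocked while flushing its output to the parent.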
  56. 56. PYTHON’S FLAVOR
  61. 61. DAEMONIC THREAD • It’s not that kind of “daemon”. • It will just be killed when Python shuts down. • Immediately. • Other (non-daemonic) threads keep running until they return.
  64. 64. SO, HOW TO STOP? • Set it as a daemon and let Python clean it up. • Let it return.
  66. 66. BUT, THE THREAD IS BLOCKING • Set a timeout.
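
One way to read "set a timeout", sketched under the assumption that the thread blocks on a `queue.Queue`. The worker wakes up regularly, checks a stop flag, and returns on its own. The names `worker`, `stop` and `demo` are hypothetical.

```python
import queue
import threading


def worker(tasks, stop, handled):
    """Process tasks until `stop` is set; never block longer than the timeout."""
    while not stop.is_set():
        try:
            item = tasks.get(timeout=0.05)   # wake up regularly instead of blocking forever
        except queue.Empty:
            continue
        handled.append(item)
        tasks.task_done()


def demo():
    tasks, stop, handled = queue.Queue(), threading.Event(), []
    t = threading.Thread(target=worker, args=(tasks, stop, handled))
    t.start()
    tasks.put("job")
    tasks.join()          # wait until the job has been handled
    stop.set()            # ask the worker to return
    t.join(timeout=2)
    return handled, t.is_alive()
```

The timeout trades a little latency on shutdown for a thread that can always be asked to return.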
  69. 69. HOW ABOUT CTRL+C? • Only the main thread can receive it. • BSD-style.
  75. 75. BROADCAST SIGNAL TO SUB-THREAD • Set a global flag when the signal arrives. • Let each thread read it before each task. • No, you can’t kill a non-daemonic thread. • You just can’t. • It’s Python.
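
The first two bullets can be sketched like this. The flag name `stop_requested` and the helpers are assumptions of this example. The handler must be registered from the main thread, since `signal.signal` raises `ValueError` elsewhere.

```python
import signal
import time

stop_requested = False   # the global flag


def handle_sigint(signum, frame):
    """Signal handler: only record that a stop was requested."""
    global stop_requested
    stop_requested = True


def install_handler():
    # Call this from the main thread only.
    signal.signal(signal.SIGINT, handle_sigint)


def worker(tasks, done):
    """Handle tasks one by one, reading the flag before each task."""
    for task in tasks:
        if stop_requested:
            break
        done.append(task)
        time.sleep(0.01)   # stand-in for real work
```

A thread running `worker` finishes its current task and then returns, which is the cooperative alternative to killing it.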
  78. 78. BROADCAST SIGNAL TO SUB-PROCESS • Just broadcast the signal to sub-processes. • Start by registering a signal handler:
 signal(SIGINT, _handle_to_term_signal)
  82. 82. • Determine the process context if needed:
 pid = getpid()
 pgid = getpgid(0)
 proc_is_parent = (pid == pgid)
 • Turn off the handler:
 signal(signum, SIG_IGN)
 • Broadcast:
 killpg(pgid, signum)
  83. 83. MISC. TECHNIQUES
  89. 89. JUST THREAD IT OUT • Or process it out. • Let the main thread exit earlier. (Looks faster!) • Let the main thread keep dispatching tasks. • “Async” • And fix some stupid behavior. (I meant atexit with multiprocessing.Pool.)
  93. 93. COLLECT RESULT SMARTER • Put results into a thread-safe queue. • Use a thread per instance. • Learn to “let it go”.
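
A possible shape for "thread it out, then collect": each call gets its own thread and hands its result back through a `queue.Queue`. The helpers `thread_it_out` and `collect` are hypothetical names. The linked mrbus code may be organized differently.

```python
import queue
import threading


def thread_it_out(func, *args):
    """Run func(*args) in a new thread; return a queue that will receive the outcome."""
    result = queue.Queue(maxsize=1)

    def runner():
        try:
            result.put((True, func(*args)))
        except Exception as exc:        # hand the error back instead of losing it
            result.put((False, exc))

    threading.Thread(target=runner, daemon=True).start()
    return result


def collect(result, timeout=None):
    """Wait for the outcome; return the value or re-raise the worker's exception."""
    ok, value = result.get(timeout=timeout)
    if not ok:
        raise value
    return value
```

The caller can keep dispatching work and call `collect` only when a result is actually needed, or never call it at all for fire-and-forget tasks.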
  94. 94. EXAMPLES • https://github.com/moskytw/mrbus/blob/master/mrbus/base/pool.py#L45 • https://github.com/moskytw/mrbus/blob/master/mrbus/model/core.py#L30
  98. 98. MONITOR THEM • No one is a master at first. • Don’t guess. • Just use a function to print logs.
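
Such a function can be very small. This sketch (the name `log` and the line format are arbitrary choices) tags each message with the time, the process ID and the thread name, so interleaved output from threads and processes can be told apart.

```python
import os
import sys
import threading
import time


def log(message, stream=sys.stderr):
    """Print one line tagged with time, process ID and thread name; return the line."""
    line = "{:.3f} pid={} thread={} {}".format(
        time.time(), os.getpid(), threading.current_thread().name, message
    )
    print(line, file=stream, flush=True)
    return line
```

Calling `log("task started")` at the start and end of each task is often enough to see which worker is stuck. The standard `logging` module can produce similar fields once the code outgrows a single function.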
  102. 102. BENCHMARK THEM • No one is a master at first. • Don’t guess. • Just prove it.
  108. 108. CONCLUSION • Avoid shared resources, or just use the producer-consumer pattern. • Signals only go to the main thread. • Just thread it out. • Collect your results smarter. • Monitor and benchmark your code.
