Do more than one thing at the same time, the Python way

Published in: Technology
  • The first question is why complicate the code by trying to do more than one thing at the same time. The answer is to make better use of the resources, which means we need spare resources to start with. Three kinds of applications typically need these techniques: dividing a big task among several computers or cores, doing the same thing for different actors, and doing different things for the same actor.
  • The typical example is number crunching: rendering a movie, data mining, etc. It can usually be treated like the next case (doing the same thing more than once) with an algorithm that works on parts of the data.
  • Doing the same thing for different actors: a web server, or the different tabs of a web browser.
  • Doing different things for the same actor (some coordination is usually needed; more on that later): a game, with the AI of each enemy, image rendering and sound.
  • Debugging can be very tricky; you have been warned. If coordination is needed, that can be tricky too. If only one thing is done (several times), the complexity is greatly reduced thanks to help from the operating system.
  • There are several ways of executing code concurrently. We’ll discuss threads, processes and asynchronous programming.
  • The thread module is low level; in general it is better to avoid it. The threading module is higher level and offers more functionality.
  • A long, long time ago, this was the way of dealing with multiprocessing. It is only available on Unix machines. You can fork using os.fork() in Python! Fork is not in common use anymore due to problems (fork bombs, etc.).
  • Operating systems are very good at multitasking (and have been for some time). Communication can be achieved through ports, files, pipes, external queues, etc. Be radical: use more than one program.
  • Threads share all their memory, so they can access anything. That is good (no communication overhead), but it can lead to abuse and instability.
  • It is the new cool technology, and it has seen some recent uses, like Node.js.
  • In a thread model, a supervisor (the OS) stops one thread and executes another, transparently for the program.
  • In asynchronous programming, every task waits for the others to finish their dance before getting on the dance floor.
  • Or for them to voluntarily release control (sleep, yield, etc.). The typical way of calling the next block is to add callbacks to the code (when this result is available, use it as input for this code).
  • Callbacks make the code hard to follow and debug.
  • No threads! All the fetch functions start, and there is no need for callbacks: the urlopen call will yield and resume at the same point.
  • The typical example is a web app, which normally spends most of its time getting data from a DB and only a small amount composing the response. You can keep a huge number of connections open, as in chat servers.
  • Number crunching, 3D calculations, etc. In a threaded model, the rest of the threads would still be executed from time to time.
  • Use of Twisted and Tornado, and some Node.js.
  • This only happens on multicore processors. There is some overhead from the blocking. It also makes UNIX signals behave strangely (the CTRL + C example).
  • The OS could (and probably does, depending on the load) set a thread to run on each core.
  • The number of iterations is constant, with different numbers of threads. Increasing the number of threads magnifies the problem, as it adds more overhead: adding threads adds MORE time instead of reducing it.
  • It is very much a “first world problem”. It only causes problems in a subset of programs; in general the effects are limited.
  • You only need to worry if a CPU-bound, multithreaded program is running on a multicore machine. Even a single CPU-bound thread can have an effect due to the overhead, and it could be more efficient to use sequential programming.
  • Time vs. iterations for 10 threads/processes.
  • He has a couple more articles on his web page, www.dabeaz.com.
  • In general, for concurrent programming.
  • Probably true for all software development. Keep the tasks separated and reduce communication between elements.
  • That also includes C extensions. You don’t need a lock for reads, only for writes that are not a single Python operation.
  • Python should be enough for most situations. Think carefully about whether you need a lock, mutex or semaphore.
  • There are different kinds of queues: LIFO, FIFO, priority, etc. The multiprocessing module also has queues. External queues can also be useful (RabbitMQ, AMQP, etc.). A pipe is a queue with one input and one output.
  • Limit the number of workers. Unlimited workers are unsafe and can block the system. A lot of threads can be worse than fewer threads. Do some tests to find the sweet spot.
  • Thread coordination is very difficult. Try to create and destroy short-lived threads instead.
  • This structure is typical of games (and real-time systems). Each interval (1 second), you evaluate the needs and launch the needed threads (or execute sequentially). On the next interval, the main periodic thread can cancel threads that haven’t finished their task. Kill unused threads.
  • So consider using other languages...
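The GIL notes above describe a fixed number of iterations taking longer as threads are added. A minimal Python 3 sketch of that kind of measurement follows. The function names, the even split of the work between threads and the iteration count are assumptions for illustration, not the code behind the talk's charts. On a standard CPython build with the GIL, the threaded timings are not expected to beat the sequential one for this CPU-bound loop.

```python
import threading
import time


def count(iterations):
    """CPU-bound busy loop, in the spirit of the talk's thread example."""
    for _ in range(iterations):
        pass


def time_sequential(iterations):
    """Return the seconds taken to run the loop in the main thread."""
    start = time.perf_counter()
    count(iterations)
    return time.perf_counter() - start


def time_threaded(iterations, num_threads):
    """Return the seconds taken when the same work is split across threads.

    Assumption: the work is divided evenly, so the total stays constant.
    """
    per_thread = iterations // num_threads
    threads = [
        threading.Thread(target=count, args=(per_thread,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    ITERATIONS = 2_000_000  # illustrative value, smaller than in the slides
    print("sequential: %.3f s" % time_sequential(ITERATIONS))
    for n in (1, 10, 100):
        print("%d threads: %.3f s" % (n, time_threaded(ITERATIONS, n)))
```

The absolute numbers depend on the machine and the interpreter, so the interesting part is how the threaded timings compare with the sequential one on the same run.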
Do more than one thing at the same time, the Python way

    1. DO MORE THAN ONE THING AT THE SAME TIME the Python way! Jaime Buelta
    2. WHY?
    3. SLICE A PROBLEM TO SOLVE IT USING MORE RESOURCES
    4. SAME THING FOR DIFFERENT ACTORS
    5. DIFFERENT THINGS FOR THE SAME ACTOR
    6. DOING MORE THAN ONE THING IS TOUGH
    7. CHOOSE WISELY
    8. THREADS
    9. THREADS IN PYTHON: module threading, module thread
    10. THREAD EXAMPLE

        import threading

        ITERATIONS = 1000000

        class Example1(threading.Thread):
            def __init__(self, num):
                self.num = num
                super(Example1, self).__init__()

            def run(self):
                # Busy loop: pure CPU work, no IO
                for i in range(ITERATIONS):
                    pass

        def main():
            # Start 10 threads running the same loop
            for j in range(10):
                t = Example1(j)
                t.start()

        if __name__ == '__main__':
            main()

    11. TIMERS

        from threading import Timer

        DELAYED_TIME = 10.5  # seconds

        def delayed():
            print('This call is delayed')

        t = Timer(DELAYED_TIME, delayed)
        t.start()
        t.cancel()  # Cancels the execution
    12. PROCESSES
    13. YE OLDE FORK
    14. MULTIPROCESS MODULE

        import multiprocessing

        ITERATIONS = 1000000

        class Example1(multiprocessing.Process):
            def __init__(self, num):
                self.num = num
                super(Example1, self).__init__()

            def run(self):
                # Same busy loop as the thread example, in its own process
                for i in range(ITERATIONS):
                    pass

        def main():
            # Start 10 processes running the same loop
            for j in range(10):
                t = Example1(j)
                t.start()

        if __name__ == '__main__':
            main()
    15. OS ARE GREAT AT MULTITASKING
    16. PROCESS COMMUNICATION NEEDS TO BE STRUCTURED but that is not necessarily a bad thing
    17. ASYNCHRONOUS PROGRAMMING
    18. Thread 1 Thread 2
    19. [Diagram: Task 1, Task 2 and Task 3 take turns, each one marked as waiting, ready or done, with a callback] The tasks will release control once they are blocked waiting for an input from IO
    20. death by callback
    21. EVENTLET

        NUM_URLS = 1000
        URL = 'http://www.some_address.com/'
        urls = [URL] * NUM_URLS

        import eventlet
        from eventlet.green import urllib2

        def fetch(url):
            # The green urlopen yields to other tasks while waiting for IO
            return urllib2.urlopen(url).read()

        pool = eventlet.GreenPool()
        for body in pool.imap(fetch, urls):
            do_something_with_result(body)
    22. Asynchronous programming is great when the tasks are IO-bound. So the CPU is basically waiting...
    23. Asynchronous programming is not good when tasks are CPU-bound. If one task enters an infinite loop, the whole system is blocked
    24. YESTERDAY THERE WAS A TALK ABOUT ASYNC PYTHON PROGRAMMING. Hope you attended, I did. If you didn’t, you can watch it online later
    25. THE INFAMOUS GIL
    26. It doesn’t allow two threads to run at the same time, even if the OS would do it. Only one thread runs. The rest will be blocked.
    27. 2 core machine: Thread A, Thread B
    28. [Chart: time (0 s to 40 s) for 100,000,000 iterations, with 1, 10, 100, 1000 and 10000 on the horizontal axis; 4 core machine]
    29. IS IT REALLY A PROBLEM?
    30. I WANT TO WATCH THIS YOUTUBE BUT I’M ALREADY LISTENING TO MUSIC AT THE SAME TIME. EEH, NOT AS BIG AS IT LOOKS
    31. GIL MAKES CONCURRENT PROGRAMMING MUCH EASIER. And the problems are quite limited in practice
    32. BUT MAYBE YOUR PROGRAM IS ONE OF THE FEW. Avoid problems by using processes instead of threads
    33. [Chart: time (1 ms to 100,000 ms) against 1000 to 10000000 iterations, for threading, multiprocess and sequential]
    34. Great, detailed talk “Understanding GIL” by David Beazley: http://www.dabeaz.com/python/UnderstandingGIL.pdf
    35. AVOID PROBLEMS
    36. SIMPLE ARCHITECTURE
    37. ALL PYTHON OPERATIONS ARE ATOMIC. Hey, that’s what the GIL is for
    38. LOCKING TO BE USED WITH EXTREME CAUTION. If you need to set exclusive sections, you are probably doing it wrong
    39. BUT WHEN I DO, I USE WITH

        from threading import Lock

        my_lock = Lock()

        def some_function(args):
            with my_lock:
                protected_section()
    40. USE QUEUES (AND PIPES)
    41. PROCESS THE TASK WITH WORKERS
    42. LIMIT THE NUMBERS!!!
    43. THREAD COORDINATION IS HELL
    44. [Diagram: the main periodic thread, repeated at each interval, launching task A, task B, task C and task D]
    45. but that’s probably not the best use of Python
    46. QUESTIONS? THANKS FOR YOUR ATTENTION @jaimebuelta wrongsideofmemphis.wordpress.com
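The advice on queues and a limited number of workers (slides 40 to 42) could look like the following Python 3 sketch. The names `NUM_WORKERS`, `worker` and `process_with_workers`, and the sentinel used to stop the workers, are hypothetical choices for illustration rather than anything from the talk.

```python
import queue
import threading

NUM_WORKERS = 4   # assumed limit; test to find the sweet spot for your load
_STOP = object()  # sentinel telling a worker there is no more work


def worker(tasks, results, handler):
    """Take items from the task queue until the stop sentinel arrives."""
    while True:
        item = tasks.get()
        if item is _STOP:
            break
        results.put(handler(item))


def process_with_workers(items, handler, num_workers=NUM_WORKERS):
    """Apply handler to every item using a fixed number of worker threads.

    The queues are the only shared state, so no explicit lock is needed.
    Results come back in completion order, not input order.
    """
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, handler))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

For example, `process_with_workers(range(10), lambda x: x * x)` returns the ten squares in whatever order the workers finished them. A handler that raises would end its worker early, so a fuller version would need some error handling.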
