Multithreading 101

An introduction to multithreading and why it is harder than you think.


1. Multithreading 101
   • Tim Penhey
2. Contents
   • Terms and definitions
   • When to use threads, or not
   • Threading primitives
   • Threading concepts
     - Memory barriers
     - Race conditions
     - Priority inversion
   • Where to from here?
3. Multitasking
   • Allows multiple programs to appear to be running at the same time
   • Context switch to move from one task to another
   • Two types of multitasking operating systems:
     - Co-operative: the application relinquishes control
     - Preemptive: the OS kernel interrupts the task
4. Processes and Threads
   • Processes
     - Own memory (OS protected – GPF)
     - Own resources (file handles, sockets, ...)
   • Threads
     - Short for "thread of execution"
     - Every process has a "main thread"
     - Additional threads in a process share its resources
5. Threading
   • Don't Do It
6. I Mean It
   • Just say NO!
7. Why Avoid Threads?
   • Threads add complexity
   • Subtle bugs that are hard to find
   • Debugging threads is hard
   • Often using a debugger can hide a race condition
   • Harder to test
8. What Are The Alternatives?
   • Multiple processes
   • Just using one thread
   • Event-based callbacks
9. When To Use Threads
   • GUI responsiveness
   • Complete processor utilisation
   • Network connectivity
10. GUI Responsiveness
   • User interfaces are event driven
   • If a long-running function blocks the event loop, the GUI can become unresponsive
   • Repaint is a classic example
11. Complete Processor Utilisation
   • When the following criteria are met:
     - The problem can be broken into independent parts
     - Results are needed as soon as possible
     - The problem is purely computationally bound
   • One thread per (effective) CPU – often called worker threads
   • Need an abstraction to separate work generation from the workers
     - Queues are popular
12. Worker Threads
   • More threads than processors means the extra threads are not being used effectively
   • The overhead of context switching reduces computation time (a small sizing sketch follows)
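As a rough illustration of "one thread per effective CPU" (not from the slides; assumes C++11's std::thread is available), the hardware concurrency hint can be used to size a worker pool:

```cpp
#include <iostream>
#include <thread>

int main() {
    // Number of hardware threads the implementation can run concurrently.
    // May return 0 if the value is not computable; fall back to a default.
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 2;  // assumption: a conservative fallback
    std::cout << "Using " << n << " worker threads\n";
    return 0;
}
```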
13. Queues
   • The work-generator thread does not hand work to a particular worker
   • An abstraction is needed between producer and consumers
   • Often a queue – push on the back, pop from the front
   • When a worker is available, it tries to pop an item from the queue
14. Network Connectivity
   • Communicating over a network adds a level of non-determinism
   • Client-side threads
     - Useful when the client needs to do more than just wait for the response – such as handle GUI events
   • Server-side threads
     - The server waits for connections
     - The socket descriptor representing a connection can be used independently of waiting for more connections
15. Threading Primitives
   • Threads
   • Atomic functions
   • Mutual exclusion blocks
   • Events / conditions
   • Semaphores
16. Common Thread Functionality
   • Create a thread
     - Given a function to execute
     - When that function is finished, so is the thread
   • Wait for a thread to finish
     - Either wait forever or wait with a time limit
   • Kill a thread
     - Not always available – Java, for example
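A minimal sketch of create-and-join (not part of the slides; assumes C++11 std::thread):

```cpp
#include <iostream>
#include <thread>

void work() {
    // The thread runs this function; when it returns, the thread is finished.
    std::cout << "hello from the worker\n";
}

int main() {
    std::thread t(work);  // create a thread running work()
    t.join();             // wait (forever) for it to finish
    return 0;
}
```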
17. Atomic Functions
   • Provided by the OS kernel
   • Guaranteed to complete before the task is switched out
   • Examples:
     - increment and decrement
     - decrement and test: subtracts one and returns true if the value is now zero
     - compare and swap: three parameters – the variable, the expected value, the new value – updates only if the current value is the expected one
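For illustration only (assuming C++11 <atomic>), compare-and-swap maps onto compare_exchange_strong:

```cpp
#include <atomic>
#include <cassert>

int main() {
    std::atomic<int> value{5};

    int expected = 5;
    // Stores 7 only if value still holds the expected 5; otherwise
    // 'expected' is updated with the value actually found.
    bool swapped = value.compare_exchange_strong(expected, 7);
    assert(swapped && value.load() == 7);

    // A failed CAS: value is now 7, not 5, so nothing is stored.
    expected = 5;
    swapped = value.compare_exchange_strong(expected, 9);
    assert(!swapped && expected == 7);
    return 0;
}
```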
18. Mutual Exclusion Blocks
   • The primitive that provides mutual exclusion blocks is often referred to as a mutex
   • A mutex is owned by at most one thread
   • Attempting to acquire an already-owned mutex blocks the thread
   • Releasing an owned mutex makes it available for other threads to acquire
19. Common Mutex Functionality
   • create mutex
   • destroy mutex
   • acquire mutex
   • try to acquire mutex
   • release mutex
20. Why Are Mutexes Needed?
   • Any non-trivial data structure uses multiple variables to define its state
   • A thread may be switched out after modifying only part of that state
   • Another thread accessing the inconsistent state of the data structure is now in the world of undefined behaviour
21. Mutex Example
   • There is a container of Foo objects called foo
   • A mutex called m protects access to foo
   • Thread 1:
     - acquire mutex m
     - bar = reference to a value in foo
     - release mutex m
     - use bar...
   • Thread 2:
     - acquire mutex m
     - modify foo
     - release mutex m
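A hedged C++ rendering of the slide's pseudocode (assuming C++11 std::mutex; Foo and the map key are hypothetical). The slide keeps a reference to a value in foo; since the next slide discusses why that alone is not enough, this sketch copies the value out under the lock instead:

```cpp
#include <map>
#include <mutex>
#include <string>

struct Foo { int value = 0; };       // hypothetical element type
std::map<std::string, Foo> foo;      // shared container from the slide
std::mutex m;                        // protects foo

// Thread 1: copy the element out while holding m, so "use bar"
// is safe after the mutex has been released.
Foo read_one(const std::string& key) {
    std::lock_guard<std::mutex> lock(m);  // acquire m; released at scope exit
    return foo[key];
}

// Thread 2: modify foo under the same mutex.
void write_one(const std::string& key, int v) {
    std::lock_guard<std::mutex> lock(m);
    foo[key].value = v;
}
```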
22. Mutex Example Discussion
   • Mutual exclusion by itself does not ensure correct behaviour
   • More than just access to an object needs protecting – in the example, bar may refer to data that Thread 2 later modifies or removes
   • If the granularity of mutex use is too fine, it can have speed implications
   • Over-zealous use of mutexes can lead to serialisation of access – sometimes to the point of rendering the threads useless
23. Event / Condition
   • Win32 and POSIX implementations differ
     - event on Win32
     - condition on POSIX
   • A thread can wait on an event – this will cause the waiting thread to block (most of the time)
   • When the event is signalled, one or more threads (implementation dependent) will wake and continue executing
24. Event Example
   • A client/server system where the client can cancel long-running requests
   • Client thread passes a work request to a worker thread
   • Worker thread informs the client when the work is complete
   • Client thread waits on an event
   • Worker thread signals the event
25. Event Example II
   • Thread 1:
     - gets client request
     - passes it on to the worker thread
     - waits on the event
   • Worker Thread:
     - gets work for the client
     - processes for a long time
     - signals the event when complete
   • Thread 1:
     - wakes on the event signal
     - returns the result to the client
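A sketch of this wait/signal pattern, assuming C++11 std::condition_variable (Win32 events and POSIX conditions differ, as the earlier slide notes); the flag and result variables are illustrative stand-ins:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable done_cv;
bool done = false;   // the "event" state, protected by m
int result = 0;      // hypothetical result of the work

void worker() {
    int r = 42;                               // stand-in for a long computation
    {
        std::lock_guard<std::mutex> lock(m);
        result = r;
        done = true;                          // record completion
    }
    done_cv.notify_one();                     // signal the event
}

int main() {
    std::thread t(worker);
    std::unique_lock<std::mutex> lock(m);
    done_cv.wait(lock, [] { return done; });  // wait on the event (handles spurious wakeups)
    // result is now safe to hand back to the client
    t.join();
    return result == 42 ? 0 : 1;
}
```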
26. Event Example III
   • Thread 1: as on the previous slide
   • Worker Thread:
     - gets work for the client
     - processes for a long time
   • Thread 2:
     - client cancels the original request
     - sets the cancelled flag
     - signals the event
   • Thread 1:
     - wakes on the event signal
     - returns the cancelled request
27. Event Example IV
   • What to do with the worker thread when cancelled?
     - run to completion and discard the result
     - if possible, interrupt the worker prior to completion
   • General timeouts would also be possible using a "watchdog" thread
28. Semaphore
   • A non-negative counter that can be waited on
   • Attempting acquisition of a semaphore will do one of two things:
     - decrement the value and continue
     - wait on the semaphore if it is zero
   • Releasing a semaphore increments the value
     - A thread waiting on the semaphore will be woken and will attempt to acquire it again
   • Used where there are multiple copies of a resource
29. Database Connection Pool
   • Initiating database connections can be time-consuming, so construct them "ahead of time"
   • A semaphore is initialised with the number of connections
   • Requesting a connection is managed by acquiring the semaphore
     - If a connection is available it is returned
     - If none are available, the thread blocks until one is
30. Database Connection Pool II
   • A collection for the connections
   • A semaphore to wait on
   • A mutex to protect the state of the collection
   • What about a lock-free implementation?
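A rough sketch of the pool described above – collection, semaphore, and mutex – assuming C++20 std::counting_semaphore and a hypothetical Connection type; pre-C++20 code would use a platform semaphore or a condition variable instead:

```cpp
#include <mutex>
#include <semaphore>
#include <vector>

struct Connection { /* hypothetical pre-built database connection */ };

class ConnectionPool {
public:
    explicit ConnectionPool(std::size_t n)
        : free_(static_cast<std::ptrdiff_t>(n)), conns_(n) {}

    Connection acquire() {
        free_.acquire();                       // block until a connection is available
        std::lock_guard<std::mutex> lock(m_);  // protect the collection itself
        Connection c = conns_.back();
        conns_.pop_back();
        return c;
    }

    void release(Connection c) {
        {
            std::lock_guard<std::mutex> lock(m_);
            conns_.push_back(c);
        }
        free_.release();                       // wake one waiting thread, if any
    }

private:
    std::counting_semaphore<> free_;  // counts available connections
    std::mutex m_;                    // protects conns_
    std::vector<Connection> conns_;
};
```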
31. Thread Local Storage
   • A memory block associated with a particular thread
   • Since thread local storage is not shared, protection from other threads is not needed
   • Access to thread local storage is through a platform-specific API
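Platform APIs such as pthread_getspecific on POSIX or TlsGetValue on Win32 provide this; C++11 also exposes it directly via the thread_local keyword. A minimal sketch (not from the slides):

```cpp
#include <iostream>
#include <thread>

// Each thread gets its own copy of this counter; no mutex is needed
// because the storage is never shared between threads.
thread_local int calls = 0;

void bump(const char* who) {
    ++calls;
    std::cout << who << " has made " << calls << " call(s)\n";
}

int main() {
    std::thread t([] { bump("worker"); bump("worker"); });  // counts 1, then 2
    bump("main");                                           // counts 1 – a separate copy
    t.join();
    return 0;
}
```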
32. Lock-free Data Structures and Wait-free Algorithms
   • Lock-free data structures are written without mutexes
     - Designed to allow multiple threads to share data without corruption
     - Built using atomic functions
     - Well-documented examples are available for simple containers
   • Wait-free algorithms are guaranteed to finish in a finite number of steps regardless of other threads' behaviour
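A fragment (not a full container) showing the compare-and-swap loop such structures are built from – a Treiber-style stack push, assuming C++11 atomics. Pop is deliberately omitted: safe removal has to deal with the ABA problem and memory reclamation, which is where lock-free code gets genuinely hard.

```cpp
#include <atomic>

struct Node {
    int value;
    Node* next;
};

std::atomic<Node*> head{nullptr};

// Lock-free push: keep retrying the CAS until no other thread has
// changed head between our read and our write.
void push(int v) {
    Node* n = new Node{v, head.load(std::memory_order_relaxed)};
    while (!head.compare_exchange_weak(n->next, n,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
        // On failure, n->next is refreshed with the current head; try again.
    }
}
```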
33. Queues Revisited
   • We have a work producer and four workers
   • Work is going to be stored on a queue
   • How can this be done?
   • What should a worker do if the queue is empty?
34. Protecting a Queue
   • A mutex to protect the queue internals
   • An event so the workers can wait when the queue is empty
   • The producer can signal the event when adding work
   • or
   • Hand-craft a singly linked list from atomic functions (aka lock-free)
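One way to realise the mutex-plus-event option above, assuming C++11 (a condition variable plays the role of the event; WorkItem is a hypothetical payload):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

struct WorkItem { int id; };  // hypothetical unit of work

class WorkQueue {
public:
    // Producer: push on the back and signal one waiting worker.
    void push(WorkItem item) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(item);
        }
        not_empty_.notify_one();
    }

    // Worker: block while the queue is empty, then pop from the front.
    WorkItem pop() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !q_.empty(); });
        WorkItem item = q_.front();
        q_.pop();
        return item;
    }

private:
    std::mutex m_;                      // protects q_
    std::condition_variable not_empty_; // the "event" workers wait on
    std::queue<WorkItem> q_;
};
```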
35. Threading Idioms
   • Producers and consumers
     - work crew
   • Pipeline
   • Monitors
36. Simple Question
   • x = 0, y = 0
   • Thread 1:
     - while x == 0 loop
     - print y
   • Thread 2:
     - y = 0xDEADBEEF
     - x = 1
   • What is printed?
37. Smart Processors
   • Advances in processor efficiency have led to many tricks and optimisations
   • Out-of-order execution
     - Grouping reads and writes together
     - Doing block reads and writes
   • The grouping is done so that it is transparent to a single thread of execution
38. Memory Barriers
   • A processor instruction that inhibits "optimal" reordering of memory accesses
   • Makes sure that all loads and stores before the barrier happen before loads and stores after it
   • Many synchronisation primitives have implicit memory barriers
     - It is usual for a mutex to have a memory barrier on both acquisition and release
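Returning to the earlier "simple question": without a barrier, Thread 1 may see x == 1 yet still read a stale y, so the print is not guaranteed to show 0xDEADBEEF. A hedged sketch of a fix using C++11 atomics, whose acquire/release ordering inserts the needed barriers:

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<int> x{0};
unsigned y = 0;

void thread2() {
    y = 0xDEADBEEF;                         // ordinary write...
    x.store(1, std::memory_order_release);  // ...made visible before the flag
}

void thread1() {
    while (x.load(std::memory_order_acquire) == 0) {
        // spin until thread2 publishes the flag
    }
    std::printf("%x\n", y);  // now guaranteed to print deadbeef
}

int main() {
    std::thread t1(thread1), t2(thread2);
    t1.join();
    t2.join();
    return 0;
}
```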
39. Race Conditions
   • Anomalous behaviour due to an unexpected critical dependence on the relative timing of events
     - Many can be solved through the use of synchronisation primitives
     - Some are problems at the design level
   • It is the unexpected nature of these problems that makes writing multithreaded code hard
40. Deadlock
   • Best known result of a race condition
   • Four necessary conditions for deadlock:
     - Tasks claim exclusive control of the resources they require
     - Tasks hold resources already allocated while waiting for additional resources
     - Resources cannot be forcibly removed from the tasks holding them
     - A circular chain of tasks exists such that each task holds one or more resources that are being requested by the next task in the chain
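A classic two-mutex deadlock and one way to break the circular-wait condition, sketched with C++11 std::lock (which acquires several mutexes with a deadlock-avoidance algorithm); not part of the slides:

```cpp
#include <mutex>
#include <thread>

std::mutex a, b;

// Deadlock-prone (shown, not run): one thread holds a and waits for b
// while the other holds b and waits for a – a circular chain.
void risky1() { std::lock_guard<std::mutex> la(a); std::lock_guard<std::mutex> lb(b); }
void risky2() { std::lock_guard<std::mutex> lb(b); std::lock_guard<std::mutex> la(a); }

// Safe: std::lock acquires both mutexes together, so the circular-wait
// condition can never arise.
void safe() {
    std::unique_lock<std::mutex> la(a, std::defer_lock);
    std::unique_lock<std::mutex> lb(b, std::defer_lock);
    std::lock(la, lb);
    // ... use the shared state guarded by a and b ...
}

int main() {
    std::thread t1(safe), t2(safe);
    t1.join();
    t2.join();
    return 0;
}
```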
41. Livelock
   • Two or more threads execute continuously but make no progress
     - Two threads both doing calculations while some value is true, but in such a way that each invalidates the other's test, causing both to continue forever
   • Real-world example
     - Two people walking towards each other in a corridor. Both decide to be polite and step to the side, but each just ends up mirroring the other's movements
42. Priority Inversion
   • More associated with real-time systems
   • Occurs when a low-priority task owns a resource that a high-priority task needs
   • In this case the low-priority task effectively takes precedence over the high-priority task
   • Made more complex if a medium-priority task is also running, since it takes precedence over the low-priority task and delays it further
43. The Near Future
   • We already have desktop machines with multiple processors, and now multiple cores
   • To get full use from these machines, multithreading will be used more and more
   • Languages that have not defined a thread-aware memory model (like C++) will need to do so to have uniformly defined behaviour
44. Conclusion
   • Before diving into threads, stop and think:
     - Are threads really necessary?
     - Is the added complexity of threads going to reduce the complexity of the problem?
     - Would separate processes be better?
   • Proceed with care:
     - Protect shared data
     - Avoid circular dependencies between mutexes
     - Use the correct synchronisation method for the problem
45. References
   • http://foldoc.org
   • http://www.cs.umass.edu/~mcorner/courses/691J/papers/TS/coffman_deadlocks/coffman_deadlocks.pdf
   • http://research.microsoft.com/~mbj/Mars_Pathfinder/Authoritative_Account.html
