Multithreading 101

An introduction to multithreading and why it is harder than you think.

Upload Details

Uploaded as OpenOffice

Usage Rights

© All Rights Reserved

Multithreading 101 Presentation Transcript

  • Multithreading 101
      • Tim Penhey
  • Contents
    • Terms and definitions
    • When to use threads, or not
    • Threading primitives
    • Threading concepts
      • Memory barriers
      • Race conditions
      • Priority inversion
    • Where to from here?
  • Multitasking
    • Allows multiple programs to appear to be running at the same time
    • Context switch to move from one task to another
    • Two types of Multitasking Operating Systems
      • Co-operative
        • Application relinquishes control
      • Preemptive
        • OS kernel interrupts task
  • Processes and Threads
    • Processes
      • Own memory
        • OS protected – GPF (general protection fault)
      • Own resources
        • file handles, sockets, ...
    • Thread
      • Short for “thread of execution”
      • Every process has “main thread”
      • Additional threads for a process share resources
  • Threading
    • Don't Do It
  • I Mean It
    • Just say NO!
  • Why Avoid Threads?
    • Threads add complexity
    • Subtle bugs that are hard to find
    • Debugging threads is hard
    • Often using a debugger can hide a race condition
    • Harder to test
  • What Are The Alternatives?
    • Multiple processes
    • Just using one thread
    • Event based callbacks
  • When To Use Threads
    • GUI responsiveness
    • Complete processor utilisation
    • Network connectivity
  • GUI Responsiveness
    • User interfaces are event driven
    • If a long running function blocks events the GUI can become unresponsive
    • Repaint is a classic example
  • Complete Processor Utilisation
    • When the following criteria are met
      • Problem can be broken into independent parts
      • Results needed as soon as possible
      • Problem is purely computationally bound (CPU-bound)
    • One thread per (effective) CPU
      • Often called worker threads
    • Need an abstraction to separate work generation from the workers
      • Queues are popular
  • Worker Threads
    • More threads than processors means that the extra threads are not being used effectively
    • Overhead of context switching reduces computation time
  • Queues
    • Work generator thread is not handing work to a particular worker
    • An abstraction needed between producer and consumer
    • Often a queue – push on the back, pop from the front
    • When a worker is available, it tries to pop an item from the queue
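The producer/worker arrangement described above can be sketched in Python, whose standard `queue.Queue` already provides the thread-safe push-on-the-back, pop-from-the-front abstraction between producer and consumers. The worker count, the squaring "work", and the `None` sentinel used to shut the workers down are illustrative choices, not part of the slides:

```python
import queue
import threading

def worker(q: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Pop work items until a None sentinel arrives."""
    while True:
        item = q.get()
        if item is None:          # sentinel: no more work is coming
            break
        with lock:                # the results list is shared, so guard it
            results.append(item * item)

q: queue.Queue = queue.Queue()
results: list = []
lock = threading.Lock()

# One thread per (pretend) CPU; queue.Queue handles its own locking.
workers = [threading.Thread(target=worker, args=(q, results, lock))
           for _ in range(4)]
for t in workers:
    t.start()

for n in range(10):               # the producer pushes work on the back
    q.put(n)
for _ in workers:                 # one sentinel per worker
    q.put(None)
for t in workers:
    t.join()
```

Because the sentinels are queued after all the real work, FIFO ordering guarantees every item is processed before any worker exits.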
  • Network Connectivity
    • Communicating over a network adds a level of non-determinism
    • Client side threads
      • Useful when the client needs to do more than just wait for the response – such as GUI events
    • Server side threads
      • Server waits for connections
      • The socket descriptor that represents a connection can be used independently of waiting for more connections
  • Threading Primitives
    • Threads
    • Atomic functions
    • Mutual exclusion blocks
    • Events / Conditions
    • Semaphores
  • Common Thread Functionality
    • Create a thread
      • Given a function to execute
      • When that function is finished, so is the thread
    • Wait for a thread to finish
      • Either wait forever or wait for a time limit
    • Kill a thread
      • Not always available – Java
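These three operations map directly onto Python's `threading` module, except that Python, like Java, offers no way to kill a thread. A minimal sketch, where the sleep is just a stand-in for real work:

```python
import threading
import time

def task() -> None:
    time.sleep(0.3)                 # stand-in for real work

t = threading.Thread(target=task)   # create: given a function to execute
t.start()

t.join(timeout=0.01)                # wait with a time limit; may return early
still_running = t.is_alive()        # True: the work is not done yet

t.join()                            # wait forever; returns once task() returns
finished = not t.is_alive()         # when the function is finished, so is the thread
```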
  • Atomic Functions
    • Provided by the OS kernel
    • Guaranteed to complete before the task is switched out
    • Examples:
      • increment and decrement
      • decrement and test
        • subtracts one and returns true if the value is now zero
      • compare and swap
        • three params, the variable, the expected value, the new value – only updates if current value is the expected one
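Python does not expose the CPU's atomic instructions directly, so the following sketch only models their *semantics* with an internal lock; real atomic functions are single uninterruptible instructions, not lock-based. The `AtomicInt` class and its method names are inventions for illustration:

```python
import threading

class AtomicInt:
    """Models the semantics of the atomic primitives listed above.
    A real implementation is a single CPU instruction, not a lock."""

    def __init__(self, value: int = 0) -> None:
        self._value = value
        self._lock = threading.Lock()

    def increment(self) -> int:
        with self._lock:
            self._value += 1
            return self._value

    def decrement_and_test(self) -> bool:
        """Subtract one; return True if the value is now zero."""
        with self._lock:
            self._value -= 1
            return self._value == 0

    def compare_and_swap(self, expected: int, new: int) -> bool:
        """Update to `new` only if the current value equals `expected`."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

counter = AtomicInt(1)
assert counter.decrement_and_test()        # 1 -> 0, so True
assert counter.compare_and_swap(0, 5)      # succeeds: current value is 0
assert not counter.compare_and_swap(0, 9)  # fails: value is now 5
```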
  • Mutual Exclusion Blocks
    • The primitive that provides mutual exclusion blocks is often referred to as a mutex
    • A mutex is owned by at most one thread
    • Attempts to acquire ownership of an already owned mutex will block the thread
    • Releasing an owned mutex makes it available for other threads to acquire
  • Common Mutex Functionality
    • create mutex
    • destroy mutex
    • acquire mutex
    • try to acquire mutex
    • release mutex
  • Why Are Mutexes Needed?
    • Any non-trivial data structure uses multiple variables to define state
    • Thread may be switched out after only modifying part of the state
    • Another thread accessing the inconsistent state of the data structure is now in the world of undefined behaviour
  • Mutex Example
    • There is a container of Foo objects called foo
    • A mutex to protect foo access called m
    • Thread 1
        • acquire mutex m
        • bar = reference to a value in foo
        • release mutex m
        • use bar...
    • Thread 2
        • acquire mutex m
        • modify foo
        • release mutex m
  • Mutex Example Discussion
    • Mutual exclusion by itself does not ensure correct behaviour
    • It is not just access to an object that needs protecting, but the consistency of the operations performed on it
    • If the granularity of mutex use is too fine, it can have speed implications
    • Overzealous use of mutexes can lead to serialisation of access – sometimes to the level of rendering the threads useless
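The slide's two threads can be sketched with Python's `threading.Lock`. Note the point the discussion makes: taking a reference under the lock and using it afterwards is only safe here because the sketch copies the value out; holding a live reference into `foo` past the release would be exactly the kind of bug mutual exclusion alone does not prevent. The container contents are made up for the example:

```python
import threading

foo = {"a": 1, "b": 2}        # a container of Foo objects called foo
m = threading.Lock()          # a mutex to protect foo access called m

def thread_1(results: list) -> None:
    with m:                   # acquire mutex m
        bar = foo["a"]        # copy the value out, so that using bar
                              # outside the lock is safe
    results.append(bar)       # use bar... (no lock held)

def thread_2() -> None:
    with m:                   # acquire mutex m
        foo["a"] = 99         # modify foo
                              # mutex released on block exit

results: list = []
t1 = threading.Thread(target=thread_1, args=(results,))
t2 = threading.Thread(target=thread_2)
t1.start(); t2.start()
t1.join(); t2.join()
```

Which value thread 1 observes depends on scheduling: either 1 or 99 is a correct outcome, and the mutex only guarantees it never sees a half-modified `foo`.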
  • Event / Condition
    • Win32 and Posix implementations differ
      • event on Win32
      • condition on Posix
    • A thread can wait on an event – this will cause the waiting thread to block (most of the time)
    • When the event is signalled, one or more threads (implementation dependent) will wake and continue executing
  • Event Example
    • A client / server system where the client can cancel long running requests
    • Client thread passes work request to a worker thread
    • Worker thread informs client when the work is complete
    • Client thread waits on an event
    • Worker thread signals the event
  • Event Example II
    • Thread 1:
        • gets client request
        • passes on to worker thread
        • wait on event
    • Worker Thread:
        • gets work for client
        • processes for a long time
        • signals event when complete
    • Thread 1:
        • wakes on event signal
        • returns result to client
  • Event Example III
    • Thread 1: as previous slide
    • Worker Thread:
        • gets work for client
        • processes for a long time
    • Thread 2:
        • client cancels original request
        • sets cancelled flag
        • signals event
    • Thread 1:
        • wakes on event signal
        • returns cancelled request
  • Event Example IV
    • What to do with the worker thread when cancelled?
      • run to completion and discard result
      • if possible, interrupt worker prior to completion
    • General timeouts would also be possible using a “watchdog” thread
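The cancellation scenario from Examples III and IV can be sketched with `threading.Event` (Python's counterpart to the Win32 event). The timings, the `cancelled` flag, and the lock protecting it are choices made for this sketch; here the canceller wins, so the worker runs to completion and discards its result:

```python
import threading
import time

done = threading.Event()       # signalled when a result (or a cancel) is ready
cancelled = False              # flag set by the cancelling thread
result = None
state_lock = threading.Lock()  # protects cancelled/result

def worker() -> None:
    global result
    time.sleep(0.05)           # "processes for a long time"
    with state_lock:
        if not cancelled:      # run to completion, discard result if cancelled
            result = 42
    done.set()                 # signals event when complete

def canceller() -> None:
    global cancelled
    with state_lock:
        cancelled = True       # sets cancelled flag
    done.set()                 # signals event

w = threading.Thread(target=worker)
c = threading.Thread(target=canceller)
w.start()
c.start()

done.wait()                    # client thread waits on the event
with state_lock:
    outcome = "cancelled" if cancelled else "result %s" % result
w.join(); c.join()
```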
  • Semaphore
    • A non-negative counter that can be waited on
    • Attempting acquisition of a semaphore will do one of two things:
      • Decrement the value and continue
      • Wait on the semaphore if it is zero
    • Releasing a semaphore increments the value
      • A thread waiting on the semaphore will be woken and will attempt to acquire the semaphore again
    • Used where there are multiple copies of a resource
  • Database Connection Pool
    • Initiating database connections can be time consuming, so construct them “ahead of time”
    • A semaphore is initialised with the number of connections
    • Requesting a connection is managed by acquiring the semaphore
      • If a connection is available it is returned
      • If none are available, thread blocks until one is
  • Database Connection Pool II
    • A collection for the connections
    • A semaphore to wait on
    • A mutex to protect the state of the collection
    • What about a lock free implementation?
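Those three parts – a collection, a semaphore, and a mutex – fit together as below. This sketch uses strings as stand-in connections; a real pool would hold database handles constructed ahead of time, and the class and method names are invented for the example:

```python
import threading
from collections import deque

class ConnectionPool:
    def __init__(self, connections) -> None:
        self._available = deque(connections)          # the collection
        self._sem = threading.Semaphore(len(self._available))  # counts free connections
        self._lock = threading.Lock()                 # protects the collection's state

    def acquire(self):
        self._sem.acquire()              # blocks until a connection is available
        with self._lock:
            return self._available.popleft()

    def release(self, conn) -> None:
        with self._lock:
            self._available.append(conn)
        self._sem.release()              # increments the count, waking one waiter

pool = ConnectionPool(["conn-1", "conn-2"])
a = pool.acquire()
b = pool.acquire()                       # pool now empty; a third acquire would block
pool.release(a)
c = pool.acquire()                       # reuses the released connection
```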
  • Thread Local Storage
    • A memory block associated with a particular thread
    • Since thread local storage is not shared, protection from other threads is not needed
    • Access to thread local storage is through platform specific API
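In Python the platform-specific API is wrapped by `threading.local`: attributes set on it are visible only to the thread that set them, so no protection is needed. A small sketch (names invented):

```python
import threading

local = threading.local()        # each thread sees its own attributes

local.name = "main"              # the main thread's private copy

def run(results: dict) -> None:
    local.name = "worker"        # does not affect the main thread's value
    results["worker"] = local.name

results: dict = {}
t = threading.Thread(target=run, args=(results,))
t.start()
t.join()
# local.name is still "main" here: the worker's assignment was thread-local
```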
  • Lock-free Data Structures and Wait-free Algorithms
    • Lock free data structures are written without mutexes
      • Designed to allow multiple threads to share data without corruption
      • Built using atomic functions
      • Well documented examples available for simple containers
    • Wait free algorithms are where the algorithm is guaranteed to finish in a finite number of steps regardless of other thread behaviour
  • Queues Revisited
    • We have a work producer and four workers.
    • Work is going to be stored on a queue.
    • How can this be done?
    • What should a worker do if the queue is empty?
  • Protecting a Queue
    • A mutex to protect the queue internals
    • An event so the workers can wait on empty
    • The producer can signal the event when adding work
    • or
    • Hand craft single linked list from atomic functions (aka Lock-Free)
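The mutex-plus-event option can be sketched with Python's `threading.Condition`, which bundles the mutex and the signalling together (the Posix condition variable model). The class name and the single-job demonstration are illustrative:

```python
import threading
from collections import deque

class WorkQueue:
    """A queue protected by a mutex, with a condition so workers can
    wait while it is empty."""

    def __init__(self) -> None:
        self._items = deque()
        self._cond = threading.Condition()   # owns the internal mutex

    def push(self, item) -> None:
        with self._cond:                     # producer acquires the mutex
            self._items.append(item)         # push on the back
            self._cond.notify()              # signal a waiting worker

    def pop(self):
        with self._cond:
            while not self._items:           # worker waits while empty
                self._cond.wait()            # releases the mutex while blocked
            return self._items.popleft()     # pop from the front

q = WorkQueue()
got: list = []
t = threading.Thread(target=lambda: got.append(q.pop()))
t.start()               # the worker blocks: the queue is empty
q.push("job")           # the producer signals; the worker wakes and pops
t.join()
```

The `while` loop around `wait()` matters: a woken worker must re-check the queue, since another worker may have taken the item first.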
  • Threading Idioms
    • Producers and Consumers
      • work crew
    • Pipeline
    • Monitors
  • Simple Question
    • x = 0, y = 0
    • Thread 1:
        • while x == 0 loop
        • print y
    • Thread 2:
        • y = 0xDEADBEEF
        • x = 1
    • What is printed?
  • Smart Processors
    • Advances in processor efficiency have led to many tricks and optimisations
    • Execution out of order
      • Grouping reads and writes together
      • Doing block reads and writes
    • Reordering is done so that it is transparent to a single thread of execution
  • Memory Barriers
    • Processor instruction that inhibits “optimal” reordering of memory access
    • Makes sure that all loads and stores before the barrier happen before loads and stores after it
    • Many synchronisation primitives have implicit memory barriers
      • It is usual for a mutex to have a memory barrier on both acquisition and release
  • Race Conditions
    • Anomalous behaviour due to unexpected critical dependence on the relative timing of events
      • Many can be solved through the use of synchronisation primitives
      • Some are problems at the design level
    • It is the unexpected nature of the problems that make writing multithreaded code hard
  • Deadlock
    • Best-known result of a race condition
    • Four necessary conditions for deadlock
        • Tasks claim exclusive control of the resources they require
        • Tasks hold resources already allocated while waiting for additional resources
        • Resources cannot be forcibly removed from the tasks holding them
        • A circular chain of tasks exists such that each task holds one or more resources that are being requested by the next task in the chain
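Since all four conditions are necessary, removing any one prevents deadlock. A common tactic is to break the circular chain by having every thread acquire locks in the same global order; the sketch below (lock names and tasks invented) shows two threads that would deadlock if one took `lock_a` then `lock_b` while the other took them in the reverse order, but cannot deadlock with a fixed ordering:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
log: list = []

def task(name: str) -> None:
    # Both threads acquire in the same global order (a, then b),
    # so no circular chain of waiters can form.
    with lock_a:
        with lock_b:
            log.append(name)

threads = [threading.Thread(target=task, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```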
  • Livelock
    • Two or more threads execute continuously but make no progress
      • Two threads both doing calculations while some value is true, but in such a way that each invalidates the other's test, causing both threads to continue forever
    • Real world example
      • Two people walking towards each other in a corridor. Both decide to be polite and step to the side, but just end up mirroring each other's movements
  • Priority Inversion
    • More associated with real time systems
    • Occurs when a low priority task owns a resource that a high priority task needs
    • In this case the low priority task takes precedence over the high priority task
    • Made more complex if there is a medium priority task that is also running taking precedence over the low priority task
  • The Near Future
    • Already we have desktop machines with multiple processors, and now multi-core
    • In order to get full use from machines, multithreading will be used more and more
    • Languages that have not defined a thread aware memory model (like C++) will need to do so to have uniform defined behaviour
  • Conclusion
    • Before diving into threads stop and think
      • Are threads really necessary?
      • Is the added complexity of threads going to reduce the complexity of the problem?
      • Would separate processes be better?
    • Proceed with care
      • Protect shared data
      • Avoid circular dependencies on mutexes
      • Use the correct synchronisation method for the problem
  • References
    • http://foldoc.org
    • http://www.cs.umass.edu/~mcorner/courses/691J/papers/TS/coffman_deadlocks/coffman_deadlocks.pdf
    • http://research.microsoft.com/~mbj/Mars_Pathfinder/Authoritative_Account.html