Essentials of Multithreaded System Programming in C++

Essentials of Multithreaded System Programming in C++: Presentation Transcript

  • Essentials of Multithreaded System Programming in C++
    Shuo Chen
  • Contents
    Challenges in multithreaded system programming
    Thread safety of C and C++ libraries
    RAII and fork()
    fork() and signal handling in multithreaded programs
    Shuo Chen (
  • Audience: C++ programmers
    Familiar with Pthreads and Sockets API
    Knows thread safety, deadlock, race condition, etc.
    In a word: read through APUE2e and UNP3e (vol. 1) by W. Richard Stevens et al.
    All discussions are based on Linux 2.6.x, x >= 28
    There are new syscalls, e.g. signalfd, eventfd, and timerfd
    x86 and x64 platforms
  • Multi-threaded system programming
    Multithreading is inevitable in this multi-core era
    The difficulty is not learning synchronization primitives (mutexes, condition variables)
    ~10 functions are sufficient to do it right
    But understanding the interactions between existing system calls and library functions
    And understanding how threads affect system design
    Use threads wisely and effectively
    Avoid common pitfalls and fallacies
  • 11 essential Pthreads functions
    11 out of 110+ pthreads functions
    2 -> create and join threads
    4 -> init/destroy, lock/unlock mutexes
    5 -> init/destroy, wait/signal/broadcast condvars
    Think twice if you need more
    Some are okay, e.g. once and key, maybe rwlock
    Some are bad, e.g. cancel and kill, semaphores
    Check muduo/base for encapsulation in C++ click thread
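The short list above is enough to build most higher-level tools. As a minimal sketch, a count-down latch uses ten of the eleven functions (only pthread_cond_signal is left out). The `Latch` class and the `runWorkers` helper are hypothetical names chosen for illustration; they are not taken from muduo.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical count-down latch built from one mutex and one condvar.
class Latch {
 public:
  explicit Latch(int count) : count_(count) {
    pthread_mutex_init(&mutex_, NULL);
    pthread_cond_init(&cond_, NULL);
  }
  ~Latch() {
    pthread_cond_destroy(&cond_);
    pthread_mutex_destroy(&mutex_);
  }
  void wait() {
    pthread_mutex_lock(&mutex_);
    while (count_ > 0)  // loop guards against spurious wakeups
      pthread_cond_wait(&cond_, &mutex_);
    pthread_mutex_unlock(&mutex_);
  }
  void countDown() {
    pthread_mutex_lock(&mutex_);
    if (--count_ == 0)
      pthread_cond_broadcast(&cond_);  // wake every waiter
    pthread_mutex_unlock(&mutex_);
  }

 private:
  Latch(const Latch&);             // noncopyable
  Latch& operator=(const Latch&);  // noncopyable
  pthread_mutex_t mutex_;
  pthread_cond_t cond_;
  int count_;  // guarded by mutex_
};

static void* latchWorker(void* arg) {
  static_cast<Latch*>(arg)->countDown();
  return NULL;  // natural death: just return from the thread function
}

// Starts n threads (n <= 16), waits until all have counted down,
// joins them, and returns how many were joined successfully.
int runWorkers(int n) {
  assert(n >= 0 && n <= 16);
  Latch latch(n);
  pthread_t threads[16];
  for (int i = 0; i < n; ++i) {
    int rc = pthread_create(&threads[i], NULL, latchWorker, &latch);
    assert(rc == 0);
    (void)rc;
  }
  latch.wait();
  int joined = 0;
  for (int i = 0; i < n; ++i) {
    if (pthread_join(threads[i], NULL) == 0) ++joined;
  }
  return joined;
}
```

Calling `runWorkers(4)` starts four threads and returns once all of them have been joined.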
  • An asynchronous world
    Never assume the sequence of events without proper synchronization.
    Know the happens-before relation, memory visibility, etc.
    The effect of an interaction between two [thread]s must be independent of the speed at which it is carried out. --- Brinch Hansen 1973
  • Standards and practices
    The latest official standards of the C and C++ languages (C99 and C++03) do not say a word about processes or threads
    Yet we write multi-process and/or multi-threaded C/C++ programs in real life, as a real-world need
    We can't wait for it to be standardized, as standards usually fall behind practice for years
    By the way, if there were no real-life multi-threaded programs, how would people know what to standardize, or how?
    We adhere to some de facto standards
    A lot simpler if we focus on one hardware and one OS
  • Thread identifier on Linux
    Use pid_t as thread id, instead of pthread_t, on Linux
    pthread_t thid = pthread_self(); thid is opaque (uintptr_t)
    pid_t tid = ::gettid(); tid is the task id, usually a small integer
    /proc/tid/, /proc/pid/task/tid/, ps, top all work fine
    How to implement gettid() efficiently? Thread local?
    gettid(2) is a syscall, but the output should never change
    getpid(2) caches the result, should gettid() do the same?
    What about fork()? Will the child process see the stale cached value?
    How about pthread_atfork() to clear it up?
    Check muduo/base/ for details
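One possible answer to the questions above is sketched here: cache the tid in a thread-local variable and reset it in the child with pthread_atfork(). This is an illustration under stated assumptions, not muduo's code; `cachedGettid`, `t_cachedTid`, and the other names are hypothetical. The raw syscall(2) interface is used, which works whether or not the C library provides a gettid() wrapper.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

namespace {

__thread pid_t t_cachedTid = 0;  // 0 means "not cached yet"

// Runs in the child after fork(): the only surviving thread has a new tid.
void clearCachedTidInChild() { t_cachedTid = 0; }

pthread_once_t g_atforkOnce = PTHREAD_ONCE_INIT;

void registerAtfork() {
  pthread_atfork(NULL, NULL, clearCachedTidInChild);
}

}  // namespace

// Returns the kernel task id of the calling thread, making the
// syscall only on the first call in each thread.
pid_t cachedGettid() {
  if (t_cachedTid == 0) {
    pthread_once(&g_atforkOnce, registerAtfork);
    t_cachedTid = static_cast<pid_t>(::syscall(SYS_gettid));
    assert(t_cachedTid > 0);
  }
  return t_cachedTid;
}
```

After the first call, `cachedGettid()` costs one thread-local read. In a forked child the cache is cleared, so the next call asks the kernel again.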
  • Creation of threads
    A library should not create its own 'background' thread without prior informed consent
    It makes the program non-forkable
    Never create a thread before main()
    Avoid creating a thread in the ctor of a static or global object
    It breaks construction of static objects, e.g. protobuf registration
    The number of threads created should be independent of system load, e.g. # of connections, # of requests
    otherwise the program is not scalable
    Reuse threads by assigning multiple roles to each
    Do IO and timers with the muduo EventLoop class
    For simple tasks, do the work within IO callbacks in IO threads
  • Three ways of termination
    Natural death – return from thread function, good
    Suicide – call pthread_exit()
    Murdered – killed by pthread_cancel()
    Rule: let it die naturally, never let a thread commit suicide or be murdered
    Why? Both are inherently deadlock-prone: the thread gets no chance to unlock
    Design your program so that a thread can be woken up and exit safely
    For reference
    Java Thread.{stop, suspend and destroy} are deprecated
    Boost Threads doesn’t provide thread::cancel()
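A thread that "can be woken up and exit safely" can be sketched with a quit flag guarded by the same mutex and condition variable the thread already waits on. The `Worker` struct and `runAndStop` function below are hypothetical names for illustration. To keep the sketch short, the "work" is just a counter and is done while holding the lock, which real code would avoid.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical worker state; every field except the primitives
// themselves is guarded by `mutex`.
struct Worker {
  pthread_mutex_t mutex;
  pthread_cond_t cond;
  bool quit;      // set by the owner to request a natural exit
  int pending;    // number of queued work items
  int processed;  // number of items handled so far
};

static void* workerMain(void* arg) {
  Worker* w = static_cast<Worker*>(arg);
  pthread_mutex_lock(&w->mutex);
  for (;;) {
    while (!w->quit && w->pending == 0)
      pthread_cond_wait(&w->cond, &w->mutex);
    if (w->pending > 0) {  // drain remaining work first
      --w->pending;
      ++w->processed;
      continue;
    }
    break;  // quit requested and nothing left to do
  }
  pthread_mutex_unlock(&w->mutex);
  return NULL;  // natural death
}

// Queues `items` units of work, asks the worker to quit, joins it,
// and returns how many items it processed.
int runAndStop(int items) {
  Worker w;
  pthread_mutex_init(&w.mutex, NULL);
  pthread_cond_init(&w.cond, NULL);
  w.quit = false;
  w.pending = 0;
  w.processed = 0;

  pthread_t tid;
  int rc = pthread_create(&tid, NULL, workerMain, &w);
  assert(rc == 0);
  (void)rc;

  pthread_mutex_lock(&w.mutex);
  w.pending = items;
  w.quit = true;  // no pthread_cancel(), just a flag plus a wakeup
  pthread_cond_signal(&w.cond);
  pthread_mutex_unlock(&w.mutex);

  pthread_join(tid, NULL);
  int done = w.processed;
  pthread_cond_destroy(&w.cond);
  pthread_mutex_destroy(&w.mutex);
  return done;
}
```

Because the worker checks the flag only at a point where it holds no other resource, it never exits with a lock still held, which is the failure mode of pthread_exit() and pthread_cancel().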
  • pthread_cancel() and C++
    In C, we have concept of ‘cancellation point’
    In C++, pthread_cancel() throws an exception in that thread, helps unwinding objects on stack
    The exception must reach the outermost function, otherwise core dump:
    FATAL: exception not rethrown
    Aborted (core dumped)
    Always rethrow in a catch(...) clause
    Ulrich Drepper “Cancellation and C++ Exceptions”
    Better: never cancel or kill a thread
  • exit() is not thread safe in C++
    exit() destructs static or global objects, (_exit() doesn’t)
    The destructor may try to hold a lock
    The caller function may have held the same lock already
    The program ends up in a deadlock
    Check following code for an example of deadlock
    How to quit a multi-threaded program safely?
    An irregular but simple solution: make a process killable, eg.
    It's not the fault of exit(), but of static or global objects
    Try to avoid static or global objects in C++, except for PODs
  • Thread local __thread in g++
    Thread safe by nature, unless a pointer to it escapes to another thread
    More efficient than pthread_key_t
    See “ELF Handling For Thread-Local Storage”
    In C++, must be initialized with constant-expression
    No __thread string t_obj("Chen Shuo");
    No __thread string* t_obj = new string;
    Only __thread string* t_obj = NULL;
    More rules:
    Use pthread_key_t if you want auto destruction
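The two mechanisms can be combined, as in the following sketch: a `__thread` pointer gives fast access, and a pthread_key_t is used only to get the destructor called when a thread exits. All names here (`t_name`, `threadName`, `namesAreIndependent`) are hypothetical and exist only for this illustration.

```cpp
#include <pthread.h>
#include <cassert>
#include <string>

// A pointer with a constant initializer is allowed for __thread.
__thread std::string* t_name = NULL;

pthread_key_t g_nameKey;
pthread_once_t g_nameKeyOnce = PTHREAD_ONCE_INIT;

// Called by the Pthreads library when a thread that set the key exits.
void destroyName(void* p) { delete static_cast<std::string*>(p); }

void makeNameKey() { pthread_key_create(&g_nameKey, destroyName); }

// Returns this thread's own string, creating it on first use.
std::string& threadName() {
  if (t_name == NULL) {
    pthread_once(&g_nameKeyOnce, makeNameKey);
    t_name = new std::string("unnamed");
    pthread_setspecific(g_nameKey, t_name);  // registers it for destruction
  }
  return *t_name;
}

static void* renameInThread(void*) {
  threadName() = "worker";  // touches only the new thread's copy
  return NULL;
}

// Returns true if a write in another thread leaves ours untouched.
bool namesAreIndependent() {
  threadName() = "main";
  pthread_t t;
  int rc = pthread_create(&t, NULL, renameInThread, NULL);
  assert(rc == 0);
  (void)rc;
  pthread_join(t, NULL);
  return threadName() == "main";
}
```

The string created in the worker thread is deleted by `destroyName` when that thread returns; the main thread's copy simply lives until the process ends.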
  • Use non-recursive mutex only
    A basic assumption of holding a mutex
    Once I lock it, I can modify the guarded object safely
    Which is not true for recursive mutex, eg.
    Recursive mutexes by David Butenhof
    Recursive locks - a blessing or a curse?
  • Impacts of introducing threads
    Threading is a late patch to OS kernel
    Unix kernel and API formed in early 1970s
    First implementation of threads emerged in early 1990s
    Breaks lots of assumptions made during the 20 years
    Library functions with side effects must be revisited
    malloc/free, fread/fseek can be made thread-safe with locks
    Functions that return or use statically allocated space are not thread safe but may have thread-safe variants
    asctime_r, ctime_r, gmtime_r, rand_r, strerror_r, strtok_r
    errno is not an ‘extern int’, but a per-thread value
    extern int *__errno_location(void);
    #define errno (*__errno_location())
  • Thread safety of C library
    Individual system calls must be thread safe
    Be cautious about multiple threads interfering with each other on the same file descriptor
    Most glibc library functions are thread safe nowadays
    Counterintuitively, the POSIX standard lists the functions that are not required to be thread safe; it's a blacklist.
    2.9.1 Thread-Safety :
    All functions defined by this volume of POSIX.1-2008 shall be thread-safe, except that the following functions need not be thread-safe.
    Notably, getenv/putenv/setenv/system() are not safe
  • FILE* functions are thread safe
    Read ‘man flockfile’, but they are not composable, eg.
    fseek(), followed by fread()
    The file position may change during the course by a different thread
    Wrap with flockfile(FILE*) and funlockfile(FILE*)
    Same applies to lseek(2) and read(2), but how to lock?
    Use pread(2) instead, which doesn’t change the file offset
    In general, a function that calls two thread-safe functions is not guaranteed to be thread-safe
    Just like exception-safety, thread-safety is not composable
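Both remedies from this slide can be sketched in a few lines. `readAt` makes the fseek()/fread() pair atomic with respect to other threads that use the same FILE*, and `readAtFd` avoids the shared file offset entirely with pread(2). Both function names are hypothetical helpers written for this illustration.

```cpp
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

// Seeks and reads as one step: no other thread can move the file
// position of `fp` between the two calls. stdio locks are recursive,
// so fseek() and fread() may take the lock again internally.
size_t readAt(FILE* fp, long offset, void* buf, size_t len) {
  assert(fp != NULL);
  flockfile(fp);
  size_t n = 0;
  if (fseek(fp, offset, SEEK_SET) == 0)
    n = fread(buf, 1, len, fp);
  funlockfile(fp);
  return n;
}

// File-descriptor version: pread() takes the offset as an argument
// and does not change the descriptor's file offset, so no lock is needed.
ssize_t readAtFd(int fd, off_t offset, void* buf, size_t len) {
  return ::pread(fd, buf, len, offset);
}
```

Note that `readAt` only helps if every thread that touches the stream follows the same discipline; one unguarded fseek() elsewhere still moves the position between two guarded sections.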
  • Thread safety is not composable
    A solution that works in a single-threaded program may not apply to a multi-threaded program.
    A solution that calls two or more thread-safe functions is not necessarily correct in a multi-threaded program
    What’s the time in London now? Program runs in New York
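    const char* tz = getenv("TZ"); // may be NULL if TZ is unset
    string oldTz = tz ? tz : ""; // save TZ
    setenv("TZ", "Europe/London", 1); tzset(); // set TZ to London
    time_t now = time(NULL);
    struct tm localTimeInLN = *localtime(&now);
    if (tz) setenv("TZ", oldTz.c_str(), 1); else unsetenv("TZ"); // restore old TZ
    tzset();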
    This code impacts localtime() in other threads
    Thread safe functions are not composable unless you carefully design the interface and interactions
  • Thread safety of C++ std library
    Although not required by the standard, the de facto rule says
    Unshared objects are independent: Two threads can freely use different objects without any special action on the caller's part. We call it "same level as built-in types."
    This applies to STL containers like map, vector, string
    Pure functions are safe, e.g. most STL algorithms.
    The global cin/cout objects are shared by threads, and are not thread safe. Moreover, they can't be made safe
    cout << a << b; is equivalent to cout.operator<<(a).operator<<(b);
    The two function calls can be interleaved with output from another thread
    Use printf(3) instead, it's thread safe and atomic.
    Allocators must be thread safe, as they are shared
  • Thread-Safe vs. Thread-Efficient
    printf(3) and malloc(3) are thread safe, but not necessarily efficient enough, esp. on multi-cores
    printf(3) locks FILE* stdout, synchronizes threads
    not good for multi-threaded logging, we need a better lib
    your default malloc(3) may not be optimized for multiple threads and multiple cores
    it may lock global heap for each allocation
    try tcmalloc, Google's thread-cache malloc
    see Intel. Is your memory management multi-core ready?
  • Operate one fd in one thread
    System calls on file descriptors are thread safe, but
    What if one thread closes an fd while another thread is blocked reading it?
    What happens if one thread adds an fd to an epoll watch list while another thread is in epoll_wait() on it?
    What happens if two threads poll the same fd, and find it readable simultaneously?
    What if two threads read the same TCP socket but each gets partial data? How do you tell which part comes first?
    Rule: all operations on one file descriptor should happen in one thread; it makes your life a lot easier
  • File descriptors in threads
    File descriptors are small integers, unlike HANDLE
    When creating a new fd, the kernel picks the lowest unused one
    Higher possibility of cross-talk, if careless, e.g.
    An fd is shared by two threads
    The first thread has just close()d it
    The second is about to read() it
    But a third thread happens to create a new fd with the same number (the lowest available int is reused) in between
    What does the second thread read from? Any other impact?
    Solution: manage the resource with the RAII idiom
    And use the usual techniques to manage object life cycles
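The RAII solution can be sketched as a small wrapper that owns the descriptor and closes it exactly once, in its destructor. `ScopedFd` is a hypothetical class written for this illustration; it is not a muduo or standard library type.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>

// Hypothetical owner of a file descriptor. The fd is closed when the
// owner dies, and never earlier, so its number cannot be recycled by
// the kernel while the object is still alive.
class ScopedFd {
 public:
  explicit ScopedFd(int fd) : fd_(fd) {}
  ~ScopedFd() {
    if (fd_ >= 0) ::close(fd_);
  }
  int get() const { return fd_; }
  bool valid() const { return fd_ >= 0; }

 private:
  ScopedFd(const ScopedFd&);             // noncopyable: one owner only
  ScopedFd& operator=(const ScopedFd&);  // noncopyable
  int fd_;
};

// Opens /dev/null, reads from it while the owner is alive, and
// returns the raw fd number so a caller can check it was closed.
int openUseAndClose() {
  ScopedFd fd(::open("/dev/null", O_RDONLY | O_CLOEXEC));
  assert(fd.valid());
  char c;
  ssize_t n = ::read(fd.get(), &c, 1);  // /dev/null reads as end-of-file
  assert(n == 0);
  (void)n;
  return fd.get();  // closed by ~ScopedFd on return
}
```

When two threads need the same descriptor, one option is to share a single `ScopedFd` through a reference-counted pointer, so the close() happens only after the last user is done. That ties the fd's lifetime to an object lifetime, which is a problem C++ already has tools for.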
  • C++ and fork()
    An object can be constructed once but destructed twice
    int main() {
      Foo foo; // calls 'Foo::Foo' once
      fork(); // forks into two processes
      // 'Foo::~Foo' is called in the parent *and* the child process
    }
    This can be a problem if Foo owns some resource that is not inherited by the child process
    Again, avoid static or global objects in C++
    In the child process, the object may not be properly initialized
    A global muduo::Timestamp startTime(now()) is wrong
  • RAII and fork()
    fork() doesn't copy all state
    Open file descriptors are inherited by child process
    And they share the file offset with the parent, as both refer to the same open file description
    The child does not inherit
    its parent's memory locks (mlock(2), mlockall(2))
    record locks from its parent (fcntl(2))
    timers from its parent (setitimer(2), alarm(2), timer_create(2)), and others
    So the RAII idiom may not work well in fork()ed process
    An RAII class that wraps timer_create/timer_delete in ctor/dtor may fail in the child process after fork()
    Use pthread_atfork() as the last resort
  • C++ and threads
    Use scoped lock guards only, check muduo/base/Mutex.h
    Don't allow exceptions to propagate across module boundaries
    Don't let exceptions propagate out of the thread main function; catch all exceptions in the outermost function
    But rethrow the one thrown by pthread_cancel(), as we said before
    Don't allow exceptions to propagate out of your callbacks, esp. callbacks from the C library, e.g. the init_routine registered to pthread_once()
    Better: don't use exceptions in C++
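A scoped lock guard can be sketched as follows. The class names echo the ones in muduo/base/Mutex.h, but the code is written from scratch for this illustration and the real header may differ; `Counter` and `countWithThreads` are hypothetical.

```cpp
#include <pthread.h>
#include <cassert>

class MutexLock {
 public:
  MutexLock() {
    int rc = pthread_mutex_init(&mutex_, NULL);
    assert(rc == 0);
    (void)rc;
  }
  ~MutexLock() { pthread_mutex_destroy(&mutex_); }
  void lock() { pthread_mutex_lock(&mutex_); }
  void unlock() { pthread_mutex_unlock(&mutex_); }

 private:
  MutexLock(const MutexLock&);             // noncopyable
  MutexLock& operator=(const MutexLock&);  // noncopyable
  pthread_mutex_t mutex_;
};

// Locks in the constructor, unlocks in the destructor, so every exit
// path from the scope (return, break, exception) releases the mutex.
class MutexLockGuard {
 public:
  explicit MutexLockGuard(MutexLock& mutex) : mutex_(mutex) { mutex_.lock(); }
  ~MutexLockGuard() { mutex_.unlock(); }

 private:
  MutexLockGuard(const MutexLockGuard&);
  MutexLockGuard& operator=(const MutexLockGuard&);
  MutexLock& mutex_;
};

class Counter {
 public:
  Counter() : value_(0) {}
  int increment() {
    MutexLockGuard guard(mutex_);
    return ++value_;
  }
  int value() const {
    MutexLockGuard guard(mutex_);
    return value_;
  }

 private:
  mutable MutexLock mutex_;
  int value_;  // guarded by mutex_
};

static void* bumpThousandTimes(void* arg) {
  Counter* c = static_cast<Counter*>(arg);
  for (int i = 0; i < 1000; ++i) c->increment();
  return NULL;
}

// Runs n threads (n <= 8), each incrementing 1000 times; returns the total.
int countWithThreads(int n) {
  assert(n >= 0 && n <= 8);
  Counter counter;
  pthread_t threads[8];
  for (int i = 0; i < n; ++i)
    pthread_create(&threads[i], NULL, bumpThousandTimes, &counter);
  for (int i = 0; i < n; ++i)
    pthread_join(threads[i], NULL);
  return counter.value();
}
```

There is no public way to forget the unlock: user code never calls `unlock()` directly, it only creates guards on the stack.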
  • Threads and fork()
    The fork() model doesn't fit well with threads
    A fundamental flaw of POSIX OSes: as other threads disappear in the child, the state is not consistent in the child process
    After fork() in a multi-threaded program you may only call async-signal-safe functions in the child, as if in a signal handler
    malloc() is not safe: another thread may hold its lock at the moment of fork(), with no chance to unlock in the new process
    The same goes for printf(), pthread_* and others.
    The only safe way to use fork() in a multi-threaded program is to call exec() immediately in the child process
    And make sure to set the close-on-exec flag on every file descriptor in the parent process, for security reasons.
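The "exec() immediately" rule can be sketched like this. Between fork() and execv() the child calls nothing else, and if the exec fails it leaves with _exit(), which skips static destructors and atexit handlers. `spawnAndWait` is a hypothetical helper, and the exit code 127 for a failed exec is a shell convention adopted here as an assumption.

```cpp
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Runs the program at `path` with `argv` and waits for it.
// Returns its exit status, 127 if exec failed, or -1 on other errors.
int spawnAndWait(const char* path, char* const argv[]) {
  assert(path != NULL && argv != NULL);
  pid_t pid = ::fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    // Child: only async-signal-safe calls from here on.
    ::execv(path, argv);
    ::_exit(127);  // exec failed; do not run destructors or flush stdio
  }
  // Parent: reap the child, retrying if interrupted by a signal.
  int status = 0;
  while (::waitpid(pid, &status, 0) < 0) {
    if (errno != EINTR) return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Descriptors opened with O_CLOEXEC (or marked with FD_CLOEXEC through fcntl(2)) are closed automatically by the execv() call, so the new program inherits only what the parent deliberately left open.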
  • Signals and threads
    The whole POSIX signal mechanism is a mess
    Only async-signal-safe functions can be called in a signal handler, also called 'reentrant functions'
    Most functions are not async-signal-safe, except those listed in the POSIX standard, so it's a whitelist
    'man 7 signal' to get the list on Linux
    None of the pthread_* functions is async-signal-safe; you can't notify a condvar or lock a mutex in a signal handler
    Surprisingly, gettimeofday(2) is not async-signal-safe
  • Deal with signals in MT programs
    Rule 1: do not use signals
    don't use them as IPC, e.g. SIGUSR1, SIGUSR2, SIGINT, SIGHUP
    don't use library functions built upon signals, e.g. alarm, sleep, usleep, timer_create, etc.
    Rule 2: when you absolutely need one, convert the async signal to a synchronous "file descriptor readable" event
    use signalfd on recent Linux kernels
    Normally, the set of signals to be received via the file descriptor should be blocked using pthread_sigmask(3), to prevent the signals being handled according to their default dispositions.
    or open a pipe(2), write(2) one byte in signal handler, and read(2) or poll(2) it in main thread
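Rule 2 with signalfd can be sketched in two small functions: block the signal, then read it from a descriptor like any other input. `makeSignalFd` and `readSignal` are hypothetical helper names; signalfd(2), pthread_sigmask(3), and struct signalfd_siginfo are the real Linux interfaces.

```cpp
#include <pthread.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

// Blocks `signo` in the calling thread and returns a descriptor that
// becomes readable whenever that signal is pending, or -1 on error.
int makeSignalFd(int signo) {
  assert(signo > 0);
  sigset_t mask;
  sigemptyset(&mask);
  sigaddset(&mask, signo);
  // Block first, so the default disposition never runs.
  if (pthread_sigmask(SIG_BLOCK, &mask, NULL) != 0) return -1;
  return ::signalfd(-1, &mask, SFD_CLOEXEC);
}

// Consumes one pending signal from `sfd` (blocking until one arrives)
// and returns its number, or -1 on a short or failed read.
int readSignal(int sfd) {
  struct signalfd_siginfo info;
  ssize_t n = ::read(sfd, &info, sizeof info);
  if (n != static_cast<ssize_t>(sizeof info)) return -1;
  return static_cast<int>(info.ssi_signo);
}
```

In a real program the mask would be set in the main thread before any other thread is created, so that all threads inherit it, and the descriptor would be registered with poll(2) or epoll alongside sockets and timers. No signal handler is installed at all, so the async-signal-safety restrictions no longer apply to the code that reacts to the signal.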
  • Other resources
    Seven posts in
  • To be continued
    Essential of non-blocking network programming in C++
    Birth of a reactor – design and implementation of Muduo
  • Avoid static or global objects
    Except for PODs