Dthreads: Efficient Deterministic Multithreading

Dthreads is an efficient deterministic multithreading system for unmodified C/C++ applications that replaces the pthreads library. Dthreads enforces determinism in the face of data races and deadlocks. It is easy to use: just link your program with -ldthread instead of -lpthread.
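As a rough illustration of the kind of program this applies to, here is a minimal sketch modeled on the stripe-drawing example from the talk's slides. The colors 'R' and 'W', the canvas array, and the function name draw_flag are hypothetical stand-ins chosen for this sketch. The threads update row and color without synchronization on purpose, so under pthreads the picture can differ from run to run.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

enum { WIDTH = 8, STRIPES = 9 };

static char canvas[STRIPES][WIDTH + 1]; /* one text row per stripe */
static char color = 'R';                /* hypothetical stand-in colors: 'R' and 'W' */
static int row = 0;                     /* shared and deliberately unsynchronized */

/* Thread body: paint one stripe, flip the color, advance the row. */
static void *next_stripe(void *arg) {
    (void)arg;
    for (int c = 0; c < WIDTH; c++)
        canvas[row][c] = color;         /* racy read of row and color */
    color = (color == 'R') ? 'W' : 'R'; /* racy read-modify-write */
    row++;                              /* racy increment; updates can be lost */
    return NULL;
}

/* Starts one thread per stripe, waits for all of them, returns the final row. */
int draw_flag(void) {
    pthread_t t[STRIPES];
    int started = 0;
    row = 0;
    color = 'R';
    for (int n = 0; n < STRIPES; n++) {
        if (pthread_create(&t[n], NULL, next_stripe, NULL) != 0)
            break;
        started++;
    }
    for (int n = 0; n < started; n++)
        pthread_join(t[n], NULL);
    assert(row <= STRIPES);             /* each thread increments at most once */
    return row;
}

#ifdef SKETCH_MAIN
int main(void) {
    int rows = draw_flag();
    for (int r = 0; r < STRIPES; r++)
        printf("%s\n", canvas[r]);
    printf("final row = %d\n", rows);
    return 0;
}
#endif
```

Built with -DSKETCH_MAIN, the same source file would be linked with -lpthread for the usual behavior or with -ldthread for deterministic behavior. Where the Dthreads library is installed, and whether an extra -L search path is needed, depends on the local setup.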

Dthreads can be downloaded from its source code repo on GitHub (https://github.com/plasma-umass/dthreads). A technical paper describing Dthreads appeared at SOSP 2011 (https://github.com/plasma-umass/dthreads/blob/master/doc/dthreads-sosp11.pdf?raw=true).


Multithreaded programming is notoriously difficult to get right. A key problem is non-determinism, which complicates debugging, testing, and reproducing errors. One way to simplify multithreaded programming is to enforce deterministic execution, but current deterministic systems for C/C++ are incomplete or impractical. These systems require program modification, do not ensure determinism in the presence of data races, do not work with general-purpose multithreaded programs, or run up to 8.4× slower than pthreads.

This talk presents Dthreads, an efficient deterministic multithreading system for unmodified C/C++ applications that replaces the pthreads library. Dthreads enforces determinism in the face of data races and deadlocks. Dthreads works by exploding multithreaded applications into multiple processes, with private, copy-on-write mappings to shared memory. It uses standard virtual memory protection to track writes, and deterministically orders updates by each thread. By separating updates from different threads, Dthreads has the additional benefit of eliminating false sharing. Experimental results show that Dthreads substantially outperforms a state-of-the-art deterministic runtime system, and for a majority of the benchmarks we evaluated, matches and occasionally exceeds the performance of pthreads.
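The isolate-then-commit idea can be sketched in a few lines of C. This is a simplified illustration and not the Dthreads implementation. It relies on fork() giving each child a copy-on-write copy of a global array, ships each child's copy to the parent over a pipe, and lets the parent apply byte-level diffs in worker-ID order. The names state, worker_body, and run_round are assumptions made for this sketch. Dthreads itself works on memory-mapped pages with page protection rather than pipes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { STATE_SIZE = 8, NWORKERS = 2 };

/* "Shared" state. After fork(), each child works on its own copy-on-write copy. */
static unsigned char state[STATE_SIZE];

/* Hypothetical thread body: every worker writes byte 0, a race under threads. */
static void worker_body(int id) {
    assert(1 + id < STATE_SIZE);
    state[0] = (unsigned char)('A' + id);
    state[1 + id] = (unsigned char)('a' + id);
}

/* One parallel phase followed by a serial commit phase. Returns 0 on success. */
int run_round(void) {
    unsigned char twin[STATE_SIZE], work[STATE_SIZE];
    int fds[NWORKERS][2];
    pid_t pids[NWORKERS];

    memcpy(twin, state, STATE_SIZE);       /* snapshot before modifications */

    for (int id = 0; id < NWORKERS; id++) {
        if (pipe(fds[id]) != 0) return -1;
        pids[id] = fork();
        if (pids[id] < 0) return -1;
        if (pids[id] == 0) {               /* child: isolated from its siblings */
            worker_body(id);
            ssize_t n = write(fds[id][1], state, STATE_SIZE);
            _exit(n == STATE_SIZE ? 0 : 1);
        }
        close(fds[id][1]);                 /* parent keeps only the read end */
    }

    int rc = 0;
    for (int id = 0; id < NWORKERS; id++) { /* commit in fixed worker-ID order */
        int status = 0;
        if (read(fds[id][0], work, STATE_SIZE) != STATE_SIZE) {
            rc = -1;
        } else {
            for (int i = 0; i < STATE_SIZE; i++)
                if (work[i] != twin[i])    /* write back only the bytes that changed */
                    state[i] = work[i];
        }
        close(fds[id][0]);
        if (waitpid(pids[id], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            rc = -1;
    }
    return rc;
}

#ifdef SKETCH_MAIN
int main(void) {
    if (run_round() != 0) return 1;
    printf("%c %c %c\n", state[0], state[1], state[2]);
    return 0;
}
#endif
```

Because commits are applied in worker-ID order, the conflicting write to byte 0 is always resolved in favor of the last worker, no matter how the operating system schedules the children.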

  • In the beginning, there was the Core. And it was good.
  • Casts out the demons of nondeterminism
  • Highlight when same speed or faster.
  • Obviously this doesn’t preserve shared memory semantics, so we need to commit changes made by one thread so they become visible to others.
  • ADD ANIMATIONS: threads initially on one core then migrating, vs. processes spewed across cores
  • It’s not *always* as fast as or faster than pthreads. Show the slower cases first, then highlight the faster parts.
  • The cache coherence protocol turns false sharing into an unpleasant performance problem.
  • Panel 1 = what it does, panel 2 = how, panel 3 = efficient, panel 4 = easy to use
  • Dthreads: Efficient Deterministic Multithreading

    1. Tongping Liu, Charlie Curtsinger, Emery Berger. DTHREADS: Efficient Deterministic Multithreading. Insanity: Doing the same thing over and over again and expecting different results.
    2. In the Beginning…
    3. There was the Core.
    4. And it was Good.
    5. It gave us our Daily Speed.
    6. Until the Apocalypse.
    7. And the Speed was no Moore.
    8. And then came a False Prophet…
    9. (no text)
    10. Want speed?
    11. I BRING YOU THE GIFT OF PARALLELISM!
    12. JUST USE THREADS… [color values are missing from this transcript]
        color = ; row = 0; // globals
        void nextStripe() {
          for (c = 0; c < Width; c++)
            drawBox(c, row, color);
          color = (color == ) ?  : ;
          row++;
        }
        for (n = 0; n < 9; n++)
          pthread_create(t[n], nextStripe);
        for (n = 0; n < 9; n++)
          pthread_join(t[n]);
    13-17. (no text)
    18. pthreads: race conditions, atomicity violations, deadlock, order violations
    19. Salvation?
    20. (no text)
    21. pthreads: race conditions, atomicity violations, deadlock, order violations. DTHREADS: deterministic race conditions, atomicity violations, deadlock, order violations
    22. DTHREADS Enables… Race-free Executions; Replay Debugging w/o Logging; Replicated State Machines
    23. DTHREADS: Efficient Determinism. Usually faster than the state of the art. [Chart: runtime relative to pthreads (axis 0 to 6) for CoreDet, dthreads, and pthreads; value labels 8.4 and 7.8]
    24. DTHREADS: Efficient Determinism. Generally as fast or faster than pthreads. [Chart: runtime relative to pthreads (axis 0 to 6) for CoreDet, dthreads, and pthreads; value labels 8.4 and 7.8]
    25. DTHREADS: Easy to Use. % g++ myprog.cpp -lpthread, with the p replaced by d (-ldthread)
    26. Isolation: shared address space vs. disjoint address spaces
    27-29. Performance: Processes vs. Threads. [Chart: normalized execution time (0.0 to 1.4) against thread execution time in ms (1 to 1024), for threads and processes]
    30. “Shared Memory”
    31. “Shared Memory”: Snapshot pages before modifications
    32. “Shared Memory”: Write back diffs
    33. “Thread” 1, “Thread” 2, “Thread” 3. Parallel, Serial. Update in Deterministic Time & Order. [Labels: Para, mutex_lock, cond_wait, pthread_create]
    34. DTHREADS performance analysis. [Chart: runtime relative to pthreads (axis 0 to 4) for dthreads and pthreads]
    35. The Culprit: False Sharing. [Diagram: Thread 1 on Core 1, Thread 2 on Core 2, Main Memory, Invalidate]
    36. The Culprit: False Sharing. 20x. [Diagram: Thread 1 on Core 1, Thread 2 on Core 2, Main Memory, Invalidate]
    37. DTHREADS: Eliminates False Sharing! [Diagram: Process 1 on Core 1, Process 2 on Core 2, Global State]
    38-40. DTHREADS: Detailed Analysis. [Chart: runtime relative to pthreads (axis 0 to 6) for ordering only, isolation only, and dthreads]
    41-43. DTHREADS: Scalable Determinism. [Chart: speedup of 8 cores over 2 cores (axis 0 to 4) for CoreDet, dthreads, and pthreads]
    44. DTHREADS. % g++ myprog.cpp -lpthread, with the p replaced by d (-ldthread)
    45. (no text)
