Dthreads: Efficient Deterministic Multithreading
Dthreads is an efficient deterministic multithreading system for unmodified C/C++ applications that replaces the pthreads library. Dthreads enforces determinism in the face of data races and deadlocks. It is easy to use: just link your program with -ldthread instead of -lpthread.

Dthreads can be downloaded from its source code repository on GitHub (https://github.com/plasma-umass/dthreads). A technical paper describing Dthreads appeared at SOSP 2011 (https://github.com/plasma-umass/dthreads/blob/master/doc/dthreads-sosp11.pdf?raw=true).


Multithreaded programming is notoriously difficult to get right. A key problem is non-determinism, which complicates debugging, testing, and reproducing errors. One way to simplify multithreaded programming is to enforce deterministic execution, but current deterministic systems for C/C++ are incomplete or impractical. These systems require program modification, do not ensure determinism in the presence of data races, do not work with general-purpose multithreaded programs, or run up to 8.4× slower than pthreads.

This talk presents Dthreads, an efficient deterministic multithreading system for unmodified C/C++ applications that replaces the pthreads library. Dthreads enforces determinism in the face of data races and deadlocks. Dthreads works by exploding multithreaded applications into multiple processes, with private, copy-on-write mappings to shared memory. It uses standard virtual memory protection to track writes, and deterministically orders updates by each thread. By separating updates from different threads, Dthreads has the additional benefit of eliminating false sharing. Experimental results show that Dthreads substantially outperforms a state-of-the-art deterministic runtime system, and for a majority of the benchmarks we evaluated, matches and occasionally exceeds the performance of pthreads.
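
The isolation idea described above can be illustrated with a minimal C sketch, assuming a POSIX system. It is not Dthreads' implementation: the helper names (`run_isolated_writer`, `shared_byte_after_private_write`), the temporary backing file, and the one-page region are hypothetical choices made for illustration. A child process stands in for a thread and writes through a private, copy-on-write mapping, so its write stays invisible to the shared view until something explicitly commits it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { REGION_BYTES = 4096 };  /* assumed size: one typical page */

/* Runs one "thread" as a child process. The child maps the backing file
 * privately (copy-on-write) and writes `value` into byte 0 of its own copy.
 * Returns the child's exit code (0 on success) or -1 on failure. */
static int run_isolated_writer(int fd, char value) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        char *priv = mmap(NULL, REGION_BYTES, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE, fd, 0);
        if (priv == MAP_FAILED) _exit(1);
        priv[0] = value;  /* the kernel copies the page; the file is untouched */
        _exit(priv[0] == value ? 0 : 2);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Writes 'A' through a shared mapping, lets a child write 'B' through a
 * private mapping, and returns the byte the shared mapping holds afterwards
 * (or -1 if any setup step failed). */
static int shared_byte_after_private_write(void) {
    FILE *backing = tmpfile();
    if (backing == NULL) return -1;
    int fd = fileno(backing);
    assert(fd >= 0);
    if (ftruncate(fd, REGION_BYTES) != 0) { fclose(backing); return -1; }
    char *shared = mmap(NULL, REGION_BYTES, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) { fclose(backing); return -1; }
    shared[0] = 'A';
    int child = run_isolated_writer(fd, 'B');
    int seen = (child == 0) ? shared[0] : -1;
    munmap(shared, REGION_BYTES);
    fclose(backing);
    return seen;
}
```

Calling `shared_byte_after_private_write()` from a `main` should return `'A'`: the child's `'B'` lands only in its private copy. A real runtime would also need a commit step that publishes private changes to the shared view.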
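
Deterministic ordering of updates can be sketched with ordinary pthreads. The sketch below assumes a fixed commit order by thread index, enforced with a turn-taking token; the token, `worker`, `run_ordered_commits`, and `commit_log` are hypothetical names for illustration and are not taken from the text above. Each thread may do its isolated work in any interleaving, but publishes its result only when the token reaches it.

```c
#include <assert.h>
#include <pthread.h>

enum { NUM_THREADS = 4 };

static pthread_mutex_t token_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t token_moved = PTHREAD_COND_INITIALIZER;
static int token_holder;             /* index of the thread allowed to commit */
static int commit_log[NUM_THREADS];  /* order in which commits happened */
static int commits;

static void *worker(void *arg) {
    int id = *(const int *)arg;
    /* Parallel phase: isolated work would run here, in any interleaving. */
    pthread_mutex_lock(&token_lock);
    while (token_holder != id)
        pthread_cond_wait(&token_moved, &token_lock);
    /* Serial phase: publish updates while holding the token. */
    commit_log[commits++] = id;
    token_holder = id + 1;           /* hand the token to the next index */
    pthread_cond_broadcast(&token_moved);
    pthread_mutex_unlock(&token_lock);
    return NULL;
}

/* Starts NUM_THREADS workers and returns how many commits were recorded. */
static int run_ordered_commits(void) {
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];
    token_holder = 0;
    commits = 0;
    for (int n = 0; n < NUM_THREADS; n++) {
        ids[n] = n;
        int rc = pthread_create(&threads[n], NULL, worker, &ids[n]);
        assert(rc == 0);
        (void)rc;
    }
    for (int n = 0; n < NUM_THREADS; n++)
        pthread_join(threads[n], NULL);
    return commits;
}
```

However the scheduler interleaves the workers, `commit_log` ends up holding 0, 1, 2, 3 in that order, which is the property that makes repeated runs agree.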
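
False sharing itself is easy to reproduce in plain pthreads code. The following sketch assumes a 64-byte cache line, which is common but not universal; the struct and function names are hypothetical. Two threads update logically independent counters, once placed side by side in one cache line and once padded onto separate lines.

```c
#include <assert.h>
#include <pthread.h>
#include <stdalign.h>

enum { CACHE_LINE_BYTES = 64, ITERATIONS = 1000000 };  /* 64 is an assumption */

struct adjacent_counters { long a; long b; };  /* both fit in one line */
struct padded_counters {
    alignas(CACHE_LINE_BYTES) long a;          /* each counter gets its own line */
    alignas(CACHE_LINE_BYTES) long b;
};

/* Aligning the struct guarantees a and b really do share a line. */
static alignas(CACHE_LINE_BYTES) struct adjacent_counters adjacent;
static struct padded_counters padded;

static void *bump(void *arg) {
    volatile long *counter = arg;  /* volatile keeps every increment in memory */
    for (long i = 0; i < ITERATIONS; i++)
        (*counter)++;
    return NULL;
}

/* Runs two threads, each incrementing its own counter; returns the sum. */
static long run_pair(long *first, long *second) {
    pthread_t ta, tb;
    *first = 0;
    *second = 0;
    int rc_a = pthread_create(&ta, NULL, bump, first);
    int rc_b = pthread_create(&tb, NULL, bump, second);
    assert(rc_a == 0 && rc_b == 0);
    (void)rc_a;
    (void)rc_b;
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return *first + *second;
}

static long run_adjacent(void) { return run_pair(&adjacent.a, &adjacent.b); }
static long run_padded(void) { return run_pair(&padded.a, &padded.b); }
```

Both variants compute the same totals; only their timing differs. Wrapping `run_adjacent()` and `run_padded()` in `clock_gettime` calls on a multicore machine is one way to observe the gap, whose size depends on the hardware.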

Published in: Technology, Spiritual

  • In the beginning, there was the Core. And it was good.
  • Casts out the demons of nondeterminism
  • Highlight when same speed or faster.
  • Obviously this doesn’t preserve shared memory semantics, so we need to commit changes made by one thread so they become visible to others.
  • ADD ANIMATIONS: threads initially on one core then migrating, vs. processes spewed across cores
  • It’s not *always* as fast as or faster than pthreads. Show the slow parts first, THEN HIGHLIGHT THE FASTER PARTS.
  • The cache coherence protocol turns false sharing into an unpleasant performance problem.
  • Panel 1 = what it does, panel 2 = how, panel 3 = efficient, panel 4 = easy to use
  • Transcript

    • 1. Tongping Liu, Charlie Curtsinger, Emery Berger. DTHREADS: Efficient Deterministic Multithreading. Insanity: Doing the same thing over and over again and expecting different results.
    • 2. In the Beginning…
    • 3. There was the Core.
    • 4. And it was Good.
    • 5. It gave us our Daily Speed.
    • 6. Until the Apocalypse.
    • 7. And the Speed was no Moore.
    • 8. And then came a False Prophet…
    • 10. Want speed?
    • 11. I BRING YOU THE GIFT OF PARALLELISM!
    • 12. JUST USE THREADS…
          color = ; row = 0; // globals
          void nextStripe() {
            for (c = 0; c < Width; c++)
              drawBox(c, row, color);
            color = (color == ) ?  : ;
            row++;
          }
          for (n = 0; n < 9; n++)
            pthread_create(t[n], nextStripe);
          for (n = 0; n < 9; n++)
            pthread_join(t[n]);
    • 18. pthreads: race conditions, atomicity violations, deadlock, order violations
    • 19. Salvation?
    • 21. pthreads: race conditions, atomicity violations, deadlock, order violations. DTHREADS: deterministic race conditions, atomicity violations, deadlock, order violations
    • 22. DTHREADS Enables… Race-free Executions; Replay Debugging w/o Logging; Replicated State Machines
    • 23. DTHREADS: Efficient Determinism. Usually faster than the state of the art. [Chart: runtime relative to pthreads (axis 0 to 6) for CoreDet, dthreads, and pthreads; bar labels 8.4 and 7.8]
    • 24. DTHREADS: Efficient Determinism. Generally as fast or faster than pthreads. [Same chart as slide 23]
    • 25. DTHREADS: Easy to Use. % g++ myprog.cpp -ldthread (in place of -lpthread)
    • 26. Isolation: shared address space; disjoint address spaces
    • 27-29. Performance: Processes vs. Threads. [Chart: normalized execution time (axis 0.0 to 1.4) of threads and processes against thread execution time (1 to 1024 ms)]
    • 30. “Shared Memory”
    • 31. “Shared Memory”: Snapshot pages before modifications
    • 32. “Shared Memory”: Write back diffs
    • 33. “Thread” 1, “Thread” 2, “Thread” 3. Phases: Parallel, Serial, Parallel. Update in Deterministic Time & Order. Synchronization points: mutex_lock, cond_wait, pthread_create
    • 34. DTHREADS performance analysis. [Chart: runtime relative to pthreads (axis 0 to 4) for dthreads and pthreads]
    • 35. The Culprit: False Sharing. [Diagram labels: Thread 1, Core 1, Thread 2, Core 2, Main Memory, Invalidate]
    • 36. The Culprit: False Sharing. 20x. [Diagram labels: Thread 1, Core 1, Thread 2, Core 2, Main Memory, Invalidate]
    • 37. DTHREADS: Eliminates False Sharing! [Diagram labels: Process 1, Core 1, Process 2, Core 2, Global State]
    • 38-40. DTHREADS: Detailed Analysis. [Chart: runtime relative to pthreads (axis 0 to 6) for ordering only, isolation only, and dthreads]
    • 41-43. DTHREADS: Scalable Determinism. [Chart: speedup of 8 cores over 2 cores (axis 0 to 4) for CoreDet, dthreads, and pthreads]
    • 44. DTHREADS. % g++ myprog.cpp -ldthread (in place of -lpthread)
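
The stripe-painting code on slide 12 is pseudocode whose color values did not survive in the transcript. Here is a hedged, compilable C rendering of the same bug, assuming hypothetical colors `RED` and `WHITE`, a hypothetical `canvas` in place of real drawing, and the standard four-argument `pthread_create`. The shared variables are C11 atomics so each individual access is well defined, but the read-then-write updates of `row` and `color` are still not atomic as a whole, which is the race the slide is poking fun at.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

enum { WIDTH = 8, STRIPES = 9 };
enum { RED = 1, WHITE = 2 };  /* hypothetical: the slide's colors were images */

static atomic_int canvas[STRIPES][WIDTH];  /* 0 means never painted */
static atomic_int color = RED;             /* globals, as on the slide */
static atomic_int row = 0;
static atomic_int boxes_drawn = 0;

static void draw_box(int c, int r, int col) {
    atomic_store(&canvas[r][c], col);
    atomic_fetch_add(&boxes_drawn, 1);
}

static void *next_stripe(void *unused) {
    (void)unused;
    for (int c = 0; c < WIDTH; c++)
        draw_box(c, atomic_load(&row), atomic_load(&color));
    /* Separate load and store: other threads can interleave between them,
     * so stripes may be skipped, repeated, or painted in the wrong color. */
    atomic_store(&color, atomic_load(&color) == RED ? WHITE : RED);
    atomic_store(&row, atomic_load(&row) + 1);
    return NULL;
}

/* Paints with STRIPES racing threads and returns the final value of row. */
static int paint_flag(void) {
    pthread_t t[STRIPES];
    for (int n = 0; n < STRIPES; n++) {
        int rc = pthread_create(&t[n], NULL, next_stripe, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int n = 0; n < STRIPES; n++)
        pthread_join(t[n], NULL);
    return atomic_load(&row);
}
```

The only guarantees are loose ones: every thread draws `WIDTH` boxes, and the final `row` lies somewhere between 1 and `STRIPES`. Which rows get painted, and in which color, can change from run to run.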
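
Slides 31 and 32 describe snapshotting pages before modification and writing back diffs. A minimal sketch of that idea follows, assuming byte-granularity comparison and a 4096-byte page; `commit_diffs`, `two_writers_merge`, and the "twin" naming for the snapshot are hypothetical, and real page tracking through memory protection is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PAGE_BYTES = 4096 };  /* assumed page size */

/* Copies into `shared` only the bytes that `working` changed relative to
 * `twin`, the snapshot taken before the first write. Returns the number of
 * bytes written back. */
static size_t commit_diffs(unsigned char *shared, const unsigned char *twin,
                           const unsigned char *working, size_t len) {
    assert(shared != NULL && twin != NULL && working != NULL);
    size_t committed = 0;
    for (size_t i = 0; i < len; i++) {
        if (working[i] != twin[i]) {
            shared[i] = working[i];
            committed++;
        }
    }
    return committed;
}

static unsigned char shared_page[PAGE_BYTES];

/* Simulates two "threads" that start from the same page contents and each
 * modify a different byte. Returns 1 if both updates survive the commits. */
static int two_writers_merge(void) {
    unsigned char twin1[PAGE_BYTES], work1[PAGE_BYTES];
    unsigned char twin2[PAGE_BYTES], work2[PAGE_BYTES];
    memset(shared_page, 0, PAGE_BYTES);
    memcpy(twin1, shared_page, PAGE_BYTES);  /* snapshot before modification */
    memcpy(work1, shared_page, PAGE_BYTES);  /* private working copy */
    memcpy(twin2, shared_page, PAGE_BYTES);
    memcpy(work2, shared_page, PAGE_BYTES);
    work1[0] = 'x';
    work2[1] = 'y';
    size_t n1 = commit_diffs(shared_page, twin1, work1, PAGE_BYTES);
    size_t n2 = commit_diffs(shared_page, twin2, work2, PAGE_BYTES);
    return n1 == 1 && n2 == 1 && shared_page[0] == 'x' && shared_page[1] == 'y';
}
```

Copying whole pages back would let the second commit overwrite the first writer's byte with stale data. Writing back only the differences keeps both updates, and when two writers touch the same byte, the later commit wins.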