NYAN Conference: Debugging asynchronous scenarios in .net

Times have changed. Multi-core CPUs have become the norm, and asynchronous programming has largely replaced explicit multi-threading. You think you know everything about async/await... until something goes wrong. While debugging synchronous code can be straightforward, investigating an asynchronous deadlock or race condition proves surprisingly tricky.
In this talk, follow us through real-life examples and investigations that cover the main asynchronous code patterns that can go wrong. You will stumble upon deadlocks and understand the reasons behind ThreadPool thread starvation.
In addition to the WinDbg magic needed to follow async/await chains, Visual Studio goodies are not forgotten: they help to quickly analyze hundreds of call stacks or task statuses.
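
The ThreadPool starvation mentioned above typically starts with code that blocks a pool thread on a task, as in the `ProcessRequest` / `CallbackAsync` pair shown in the ThreadPool internals slides. The sketch below contrasts that blocking shape with a non-blocking one. It is only an illustration under stated assumptions: the body of `CallbackAsync` (a short `Task.Delay`), the `id` parameter, and the names `ProcessRequestAsync` and `RunBurstAsync` are hypothetical and do not come from the talk.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class StarvationSketch
{
    // Hypothetical stand-in for the CallbackAsync of the slides: the code after
    // the await needs a free ThreadPool thread to run once the delay elapses.
    public static async Task<int> CallbackAsync(int id)
    {
        await Task.Delay(10);
        return id;
    }

    // Blocking shape (as on the slides): the calling thread is held until the
    // callback completes. Harmless once, dangerous when many pool threads do it
    // at the same time, because the callbacks then have no thread left to run on.
    public static int ProcessRequest(int id)
    {
        Task<int> task = CallbackAsync(id);
        task.Wait();
        return task.Result;
    }

    // Non-blocking shape: the thread returns to the pool while the callback is pending.
    public static async Task<int> ProcessRequestAsync(int id)
    {
        return await CallbackAsync(id);
    }

    // Runs a burst of requests without blocking and returns how many completed.
    public static async Task<int> RunBurstAsync(int count)
    {
        int[] results = await Task.WhenAll(
            Enumerable.Range(0, count).Select(id => ProcessRequestAsync(id)));
        return results.Length;
    }

    public static async Task Main()
    {
        // A single blocking call from the main thread works fine.
        Console.WriteLine($"blocking call returned {ProcessRequest(0)}");

        int completed = await RunBurstAsync(200);
        Console.WriteLine($"{completed} requests completed");
    }
}
```

The sketch deliberately does not try to reproduce the starvation itself, since how quickly the pool injects new threads depends on the runtime version and the number of cores.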

  1. Debugging asynchronous scenarios, by Christophe Nasarre and Kevin Gosse. NYAN conference (#NYANconf). R&D • events@criteo.com • criteo.com • Medium.com/criteo-labs • @CriteoEng
  2. First case: a service refuses to stop • Still in the Running state in the Windows Services panel
  3. In production → take a memory snapshot: procdump -ma <pid>
  4. Parallel Stacks in Visual Studio • Yes, VS is able to load a memory dump • This is a nice way to see visually what is going on → We are waiting for ClusterClient.Dispose() to end
  5. In production → take a memory snapshot: procdump -ma <pid> • Which foreground thread is still running? • What is ClusterClient.Dispose() waiting for? • Look at the code, Luke!
  6. ActionBlock internals: Task ProcessMessage(TInput message) { ... }
  7. In production → take a memory snapshot: procdump -ma <pid> • Which foreground thread is still running? • What is ClusterClient.Dispose() waiting for? • Look at the code, Luke! • Look for the _agent state
  8. ContinueWith internals (1|3): Task ContinueWith(Action<Task> nextAction, …) { Task task = new ContinuationTaskFromTask<TResult>(this, nextAction, …); base.ContinueWithCore(task, …); return task; }
  9. ContinueWith internals (2|3): internal void ContinueWithCore(Task continuationTask, …) { TaskContinuation taskContinuation = new StandardTaskContinuation(continuationTask, …); … if (!continuationTask.IsCompleted) { /* add task to m_continuationObject */ if (!AddTaskContinuation(taskContinuation, …)) { taskContinuation.Run(this, …); } } }
  10. ContinueWith internals (3|3)
  11. async/await internals (1|2): async Task<long> AAA(CancellationToken token) { Stopwatch tick = new Stopwatch(); tick.Start(); await BBB(token); tick.Stop(); return tick.ElapsedMilliseconds; }
  12. async/await internals (2|2): async Task AAA() { await BBB(); ... } async Task BBB() { ... } • Diagram labels: AsyncMethodBuilderCore+MoveNextRunner, Action, TaskSchedulerAwaitTaskContinuation**, Task (returned by BBB), AAA state machine, Task (returned by AAA)
  13. In production → take a memory snapshot: procdump -ma <pid> • Which foreground thread is still running? • What is ClusterClient.Dispose() waiting for? • Look at the code, Luke! • Look for the _agent state → An exception broke the responses ActionBlock
  14. BONUS: more continuations • A few other continuation scenarios that you may encounter: ✓ Task.Delay ✓ Task.WhenAny ✓ Special cases
  15. Why a List<object> as continuation? Task DoStuffAsync() { var task = SendAsync(); task.ContinueWith(t => LogStuff(t)); return task; } /* user code */ await DoStuffAsync(); DoSomethingSynchronously(); • Diagram labels: Task, m_continuationObject, null, StandardTaskContinuation, List<object>, StandardTaskContinuation, *TaskContinuation
  16. Why an empty List<object> as continuation? async Task DoStuffAsync() { var T1 = Task.Run(…); var T2 = Task.Run(…); await Task.WhenAny(T1, T2); … /* T2 ends first */ } • Diagram labels: T1, m_continuationObject, null; T2, m_continuationObject, null; CompleteOnInvokePromise, CompleteOnInvokePromise, empty List<object>, object
  17. Investigation 1 - key takeaways: (1) Thread call stacks do not give the full picture • Even Visual Studio Parallel Stacks is not enough (2) A clear understanding of Task internals is required • m_continuationObject and state machines (3) Start from the blocked task and follow the chain of references backwards • sosex !refs is your friend
  18. Symptoms: 0% CPU and a rising thread count
  19. In production → take a memory snapshot: procdump -ma <pid> • Look at call stacks in Visual Studio
  20. In production → take a memory snapshot: procdump -ma <pid> • Look at call stacks in Visual Studio → What are those tasks (the ones we are waiting for) doing?
  21. In production → take a memory snapshot: procdump -ma <pid> • Look at call stacks in Visual Studio → What are those tasks (the ones we are waiting for) doing? • Look at tasks in WinDbg → No deadlock, but everything is blocked…
  22. ThreadPool internals: static void ProcessRequest() { var task = CallbackAsync(); task.Wait(); } • Diagram legend: R (the ProcessRequest work item), C (the CallbackAsync callback)
  23. ThreadPool internals • ThreadPool: R R
  24. ThreadPool internals • ThreadPool: R R R R R R C C
  25. ThreadPool internals • ThreadPool: R R R R C C R R
  26. ThreadPool internals • ThreadPool: R R R R C C R R R R R R C C R R R R
  27. DEMO: simple ThreadPool starvation code
  28. ThreadPool internals • Global queue • Thread 1: local queue • Thread 2: local queue • Task 1, Task 2, Task 5, Task 4, Task 6, Task 3
  29. ThreadPool internals • Global queue • Local queue: C (Thread 1) • Local queue: C (Thread 2) • Local queue: C (Thread 3) • R R R R
  30. In production → take a memory snapshot: procdump -ma <pid> • Look at call stacks in Visual Studio → What are those tasks (the ones we are waiting for) doing? • Look at tasks in WinDbg → No deadlock, but everything is blocked… → The ThreadPool is starved
  31. Investigation 2 - key takeaways: (1) Waiting synchronously on a Task is dangerous (2) ThreadPool scheduling is unfair (3) 0% CPU + increasing thread count = sign of ThreadPool starvation
  32. Conclusion • Understand the underlying data structures • Think of causality chains instead of thread call stacks • Visual Studio is your friend: Parallel Stacks to get the big picture • WinDbg is your true ally: use and abuse sosex !refs • You knew that waiting on tasks is bad: now you know why
  33. Resources • Criteo blog series: http://labs.criteo.com/, https://medium.com/@kevingosse, https://medium.com/@chnasarre • Debugging extensions: https://github.com/chrisnas/DebuggingExtensions (aka Grand Son Of Strike) • Contacts: Kevin Gosse @kookiz, Christophe Nasarre @chnasarre
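
The first investigation ends on an exception that "broke the responses ActionBlock". The slides do not show the failing handler, so the following is only a minimal sketch of the general behavior, assuming the `System.Threading.Tasks.Dataflow` library is available: once the delegate of an `ActionBlock` throws, the block faults, drops pending messages, and declines new ones, so anything waiting for later messages to be processed waits forever. The handler, the message values, and the `FaultedBlockSketch` name are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class FaultedBlockSketch
{
    public static async Task<(int Processed, bool Faulted, bool AcceptedAfterFault)> RunAsync()
    {
        int processed = 0;

        // Hypothetical handler: fails on one specific message.
        // The default ActionBlock options process one message at a time, in order.
        var block = new ActionBlock<int>(message =>
        {
            if (message == 2)
            {
                throw new InvalidOperationException("handler failed on message 2");
            }
            processed++;
        });

        block.Post(1);
        block.Post(2); // the handler will throw on this one, which faults the block
        block.Post(3); // never processed: the faulted block drops pending messages

        try
        {
            await block.Completion; // rethrows the handler exception
        }
        catch (InvalidOperationException)
        {
            // Expected in this sketch: the block is now in the Faulted state.
        }

        // A faulted block permanently declines new messages.
        bool accepted = block.Post(4);
        return (processed, block.Completion.IsFaulted, accepted);
    }

    public static async Task Main()
    {
        var result = await RunAsync();
        Console.WriteLine(
            $"processed={result.Processed}, faulted={result.Faulted}, acceptedAfterFault={result.AcceptedAfterFault}");
    }
}
```

Checking the return value of `Post` and observing `Completion` are two ways to notice such a fault early, instead of discovering it in a memory dump.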