Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)

An overview of the Task Parallel Library and its use in the Pithos client for Windows, including a recipe for Parallel Revani.

Published in: Technology, Education

  1. Parallel and Asynchronous Programming
     Or how we built a Dropbox clone without a PhD in Astrophysics
     Panagiotis Kanavos, DotNetZone Moderator, pkanavos@gmail.com
  2. Why
     • Processors are getting smaller
     • Networks are getting worse
     • Operating Systems demand it
     • Only a subset of the code can run in parallel
  3. Processors are getting smaller
     • Once, a single-threaded process could use 100% of the CPU
     • 16% MAX on a quad-core LAPTOP with HyperThreading
     • 8% MAX on an 8-core server
  4. What we used to have
     • Hand-coded threads and synchronization
     • BackgroundWorker
       • Heavy, cumbersome, single threaded, inadequate progress reporting
     • EAP: from event to event
       • Complicated, loss of continuity
     • APM: BeginXXX/EndXXX
       • Cumbersome; imagine socket programming with Begin/End!
     • or rather ...
  5. Why I stopped blogging
     • Asynchronous Pipes with APM
  6. The problem with threads
     • Collisions
       • Reduced throughput
       • Deadlocks
     • Solution: limit the number of threads
       • ThreadPools
       • Extreme: Stackless Python
     • Solution: copy data instead of shared access
       • Extreme: immutable programming
  7. Why should I care about threads?
     • How can I speed up my algorithm?
     • Which parts can run in parallel?
     • How can I partition my data?
  8. Example: Revani
  9. Synchronous Revani
     • Beat the yolks with 2/3 of the sugar until fluffy
     • Beat the whites with 1/3 of the sugar to a stiff meringue and add half of it to the yolk mixture
     • Mix semolina with flour and ground coconut, add the rest of the meringue and mix
     • Mix and pour in cake pan
     • Bake in a pre-heated oven at 170 °C for 20-25 mins
     • Allow to cool
     • Prepare syrup: boil water, sugar and lemon for 3 mins
     • Pour warm syrup over revani
     • Sprinkle with ground coconut
  10. Parallel Revani
     • Beat yolks | Beat whites (in parallel)
     • Add half mixture
     • Mix semolina
     • Add rest of meringue
     • Mix
     • Pour in cake pan
     • Bake | Prepare syrup (in parallel)
     • Pour syrup
     • Sprinkle
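The parallel recipe can be sketched with tasks. This is only an illustration, assuming .NET 4.5 or later; the `Step` helper, the delays and the `ParallelRevani` class are hypothetical stand-ins for real kitchen work. Independent steps are started together and joined with `Task.WhenAll`, while dependent steps are awaited one after another.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ParallelRevani
{
    // Hypothetical stand-in for a kitchen step: waits a little, then logs its name.
    private static async Task Step(ConcurrentQueue<string> log, string name, int millis)
    {
        await Task.Delay(millis);
        log.Enqueue(name);   // ConcurrentQueue is safe to append to from several tasks
    }

    // Returns the step names in the order in which they finished.
    public static async Task<string[]> MakeAsync()
    {
        var log = new ConcurrentQueue<string>();

        // The two beating steps are independent: start both, wait for both.
        await Task.WhenAll(Step(log, "Beat yolks", 30), Step(log, "Beat whites", 20));

        // These steps depend on each other and stay sequential.
        await Step(log, "Add half mixture", 5);
        await Step(log, "Mix semolina", 5);
        await Step(log, "Add rest of meringue", 5);
        await Step(log, "Pour in cake pan", 5);

        // Baking and syrup preparation overlap.
        await Task.WhenAll(Step(log, "Bake", 40), Step(log, "Prepare syrup", 10));

        await Step(log, "Pour syrup", 5);
        await Step(log, "Sprinkle", 5);
        return log.ToArray();
    }
}
```

Calling `ParallelRevani.MakeAsync()` and awaiting the result gives the completion order; only the order inside each parallel pair can vary between runs.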
  11. What we have now
     • Support for multiple concurrency scenarios
     • Overall improvements in threading
     • Highly concurrent collections
  12. Scenarios
     • Faster processing of large data
       • Number crunching
     • Execute long operations
     • Serve a high volume of requests
       • Social sites, web sites, billing, log aggregators
     • Tasks with frequent blocking
       • REST clients, IT management apps
  13. Scenario Classification
     • Data Parallelism
     • Task Parallelism
     • Asynchronous programming
     • Agents/Actors
     • Dataflows
  14. Data Parallelism - Recipe
     • Partition the data
     • Implement the algorithm in a function
     • TPL creates the necessary tasks
     • The tasks are assigned to threads
     • I DON’T have to define the number of Tasks/Threads!
  15. Data Parallelism - Tools
     • Parallel.For / Parallel.ForEach
     • PLINQ
     • Partitioners
  16. Parallel class Methods
     • Parallel execution of lambdas
     • Blocking calls!
     • We specify
       • Cancellation Token
       • Maximum number of Threads
       • Task Scheduler
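A minimal sketch of the data-parallel recipe with `Parallel.ForEach`, showing the three settings named on the slide through `ParallelOptions`. The `ParallelSum` class and the sum-of-squares workload are made up for illustration. The call blocks until every item has been processed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSum
{
    // Sums the squares of the input. The thread-local overload keeps one
    // subtotal per worker and merges the subtotals at the end.
    public static long SumOfSquares(int[] data, CancellationToken token)
    {
        var options = new ParallelOptions
        {
            CancellationToken = token,
            MaxDegreeOfParallelism = Environment.ProcessorCount,
            TaskScheduler = TaskScheduler.Default
        };

        long total = 0;
        Parallel.ForEach(
            data,
            options,
            () => 0L,                                            // initial per-worker subtotal
            (x, loopState, subtotal) => subtotal + (long)x * x,  // the algorithm as a function
            subtotal => Interlocked.Add(ref total, subtotal));   // merge step, once per worker
        return total;
    }
}
```

Passing an already cancelled token makes `Parallel.ForEach` throw an `OperationCanceledException` instead of running the loop.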
  17. PLINQ
     • LINQ Queries
     • Potentially multiple threads
     • Parallel operators
     • Unordered results
     • Beware of races
         List<int> list = new List<int>();
         var q = src.AsParallel()
                    .Select(x => { list.Add(x); return x; })   // race: List<T>.Add is not thread-safe
                    .Where(x => true)
                    .Take(100);
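One way to avoid the race in the query above is to let PLINQ build the result instead of mutating a shared `List<int>` from inside `Select`. The sketch below assumes a made-up filter (even numbers) and a hypothetical `SafePlinq` class; `AsOrdered` is added only so that `Take(100)` returns a predictable set.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SafePlinq
{
    // No shared state is touched inside the query; ToList collects the results.
    public static List<int> FirstHundredEven(IEnumerable<int> src)
    {
        return src.AsParallel()
                  .AsOrdered()              // preserve source order through the query
                  .Where(x => x % 2 == 0)   // side-effect-free lambda
                  .Take(100)
                  .ToList();
    }
}
```

Without `AsOrdered` the query is still race-free, but which 100 elements are taken is not defined.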
  18. What it can’t do
     • Doesn’t use SSE instructions
     • Doesn’t use the GPU
     • Isn’t using the CPU at 100%
  19. Scenarios
     • Data Parallelism
     • Task Parallelism
     • Asynchronous programming
     • Agents/Actors
     • Dataflows
  20. Task Parallelism - Recipe
     • Break the problem into steps
     • Convert each step to a function
     • Combine steps with Continuations
     • TPL assigns tasks to threads as needed
     • I DON’T have to define the number of Tasks/Threads!
     • Cancellation of the entire task chain
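The recipe above can be sketched as a chain of continuations. The three steps (trim, split, count) and the `TaskChain` class are hypothetical; the point is that each step is a function, `ContinueWith` links them, and a single token cancels the whole chain.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskChain
{
    // Step 1 trims the text, step 2 splits it into words, step 3 counts them.
    public static Task<int> CountWordsAsync(string text, CancellationToken token)
    {
        return Task.Run(() => text.Trim(), token)
            .ContinueWith(
                t => t.Result.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries),
                token, TaskContinuationOptions.OnlyOnRanToCompletion, TaskScheduler.Default)
            .ContinueWith(
                t => t.Result.Length,
                token, TaskContinuationOptions.OnlyOnRanToCompletion, TaskScheduler.Default);
    }
}
```

With `OnlyOnRanToCompletion`, a step that fails or is cancelled causes the steps after it to be cancelled rather than run.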
  21. The Improvements
     • Tasks wherever code blocks
     • Cancellation
     • Lazy Initialization
     • Progress Reporting
     • Synchronization Contexts
  22. Cancellation
     • Problem: How do you cancel multiple tasks without leaving trash behind?
     • Solution: Everyone monitors a CancellationToken
       • TPL cancels subsequent Tasks or Parallel operations
       • Created by a CancellationTokenSource
       • Can execute code when Cancel is called
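A small sketch of cooperative cancellation, with hypothetical names (`CancellationDemo`, `ProcessAsync`, `Demo`). The worker polls the token between items, and `Register` attaches code that runs when `Cancel` is called.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Processes items one by one and checks the token before each item.
    public static Task<int> ProcessAsync(int items, CancellationToken token, Action<int> onItem)
    {
        return Task.Run(() =>
        {
            int done = 0;
            for (int i = 0; i < items; i++)
            {
                token.ThrowIfCancellationRequested();   // ends the task as Canceled
                onItem(i);
                done++;
            }
            return done;
        }, token);
    }

    // Cancels from inside the third item. Returns true when the task ended in
    // the Canceled state and the registered callback ran.
    public static bool Demo()
    {
        using (var cts = new CancellationTokenSource())
        {
            bool callbackRan = false;
            cts.Token.Register(() => callbackRan = true);   // cleanup hook

            Task<int> task = ProcessAsync(10, cts.Token, i => { if (i == 2) cts.Cancel(); });
            try { task.Wait(); }
            catch (AggregateException) { /* expected: wraps a TaskCanceledException */ }

            return task.IsCanceled && callbackRan;
        }
    }
}
```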
  23. Progress Reporting
     • Problem: How do you update the UI from inside a task?
     • Solution: Using an IProgress<T> object
       • Out-of-the-box Progress<T> updates the current synchronization context
       • Any type can be a message
       • Replace with our own implementation
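As an illustration of "replace with our own implementation", the sketch below defines a hypothetical `ListProgress<T>` that simply records every message, and a made-up `Copier` job that reports a percentage. In a UI application the built-in `Progress<T>` would be passed instead, so that the handler runs on the UI's synchronization context.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Our own IProgress<T>: stores each reported value, synchronously and thread-safely.
public sealed class ListProgress<T> : IProgress<T>
{
    private readonly List<T> _messages = new List<T>();
    private readonly object _gate = new object();

    public void Report(T value) { lock (_gate) _messages.Add(value); }
    public T[] Snapshot() { lock (_gate) return _messages.ToArray(); }
}

public static class Copier
{
    // Hypothetical long-running job: reports a percentage after each block.
    public static Task CopyBlocksAsync(int blocks, IProgress<int> progress)
    {
        return Task.Run(() =>
        {
            for (int i = 1; i <= blocks; i++)
            {
                // ... the real work for block i would go here ...
                if (progress != null) progress.Report(i * 100 / blocks);
            }
        });
    }
}
```

The job only knows the `IProgress<int>` interface, so the reporting target can be swapped without touching the worker code.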
  24. Lazy Initialization
     • Calculate a value only when needed
     • Lazy<T>(Func<T> …)
     • Synchronous or asynchronous calculation
       • Lazy.Value
       • Lazy.GetValueAsync<T>()
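A sketch of both flavours, with a hypothetical `LazyDemo` class. The synchronous case uses `Lazy<T>.Value`. For the asynchronous case the sketch does not reproduce the slide's `GetValueAsync`; it assumes the common `Lazy<Task<T>>` pattern instead, where the task is started on first access and can then be awaited.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LazyDemo
{
    public static int Calls;   // counts how often the factory actually runs

    // Synchronous: the factory runs once, on the first access to .Value.
    public static readonly Lazy<int> Answer = new Lazy<int>(() =>
    {
        Interlocked.Increment(ref Calls);
        return 6 * 7;
    });

    // Asynchronous (assumed pattern): the task starts on the first access to .Value.
    public static readonly Lazy<Task<int>> AnswerAsync =
        new Lazy<Task<int>>(() => Task.Run(() => 6 * 7));
}
```

Reading `LazyDemo.Answer.Value` twice runs the factory only once; `await LazyDemo.AnswerAsync.Value` yields the computed value without blocking the caller.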
  25. Synchronization Context
     • Since .NET 2.0!
     • Hides WinForms, WPF, ASP.NET
       • SynchronizationContext.Post/Send instead of Dispatcher.Invoke etc.
       • Synchronous and asynchronous versions
     • Automatically created by the environment
       • SynchronizationContext.Current
     • Can create our own
       • E.g. for a command-line application
  26. Scenarios
     • Data Parallelism
     • Task Parallelism
     • Asynchronous programming
     • Agents/Actors
     • Dataflows
  27. Async/Await
     • Support at the language level
     • Debugging support
     • Exception handling
     • After await, return to the original “thread”
       • Beware of servers and libraries
     • Does NOT always execute asynchronously
       • Only when a task is encountered or the thread yields
       • Task.Yield
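  28. Asynchronous Retry
         private static async Task<T> Retry<T>(Func<T> func, int retryCount)
         {
             while (true)
             {
                 try
                 {
                     // Run the operation on the thread pool and await its result
                     var result = await Task.Run(func);
                     return result;
                 }
                 catch
                 {
                     // Out of retries: rethrow the last exception
                     if (retryCount == 0)
                         throw;
                     retryCount--;
                 }
             }
         }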
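To see the retry helper in action, the sketch below repeats it as a public method next to a hypothetical flaky operation. `RetryDemo` and `FailingFirst` are made up for illustration; the helper has the same shape as the one on the slide.

```csharp
using System;
using System.Threading.Tasks;

public static class RetryDemo
{
    // Same logic as the slide's Retry helper.
    public static async Task<T> Retry<T>(Func<T> func, int retryCount)
    {
        while (true)
        {
            try { return await Task.Run(func); }
            catch
            {
                if (retryCount == 0) throw;
                retryCount--;
            }
        }
    }

    // Hypothetical flaky operation: fails a fixed number of times, then succeeds.
    public static Func<string> FailingFirst(int failures)
    {
        int attempts = 0;
        return () =>
        {
            attempts++;
            if (attempts <= failures) throw new InvalidOperationException("transient");
            return "ok after " + attempts + " attempts";
        };
    }
}
```

`await RetryDemo.Retry(RetryDemo.FailingFirst(2), 3)` succeeds on the third attempt, while a retry count lower than the number of failures surfaces the last exception to the caller.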
  29. More Goodies - Collections
     • Highly concurrent
     • Thread-safe
     • Not only for TPL/PLINQ
     • Producer/Consumer scenarios
  30. Concurrent Collections - 2
     • ConcurrentQueue
     • ConcurrentStack
     • ConcurrentDictionary
  31. The Odd One - ConcurrentBag
     • Duplicates allowed
     • List per Thread
     • Reduced collisions for each thread’s Add/Take
     • BAD for Producer/Consumer
  32. Concurrent Collections - Gotchas
     • They are NOT faster than plain collections in low-concurrency scenarios
     • They DO NOT consume less memory
     • They DO NOT provide thread-safe enumeration
     • They DO NOT ensure atomic operations on content
     • They DO NOT fix unsafe code
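A short sketch of the "DO NOT fix unsafe code" gotcha, using a hypothetical `HitCounter` class. Each individual call on a `ConcurrentDictionary` is thread-safe, but a read followed by a write is still a race; `AddOrUpdate` performs the whole update as one operation per key.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class HitCounter
{
    // Increments one counter from many threads without losing updates.
    public static int CountHits(int hits)
    {
        var counts = new ConcurrentDictionary<string, int>();
        Parallel.For(0, hits, i =>
        {
            // NOT safe: counts["k"] = counts.GetOrAdd("k", 0) + 1;  (read, then write)
            counts.AddOrUpdate("k", 1, (key, old) => old + 1);
        });

        int result;
        counts.TryGetValue("k", out result);   // result stays 0 when there were no hits
        return result;
    }
}
```

The update delegate may be invoked more than once under contention, so it should stay free of side effects.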
  33. Also in .NET 4
     • Visual Studio 2012
     • Async Targeting package
     • System.Net.HttpClient package
  34. Other Technologies
     • F# async
     • C++ Parallel Patterns Library
     • C++ Concurrency Runtime
     • C++ Agents
     • C++ AMP
  35. Pithos
     • Object storage similar to Amazon S3/Azure Blob storage
     • A service of Synnefo, the IaaS by GRNet
     • Written in Python
     • Clients for Web, Windows, iOS, Android, Linux
     • Versioning, Permissions, Sharing
  36. Synnefo
  37. Pithos API
     • REST API based on CloudFiles by Rackspace
       • Compatible with CyberDuck etc.
     • Block storage
     • Uploads only using blocks
     • Uses Merkle Hashing
  38. Pithos Client for Windows
     • Multiple accounts per machine
     • Synchronize local folder to a Pithos account
     • Detect local changes and upload
     • Detect server changes and download
     • Calculate Merkle Hash for each file
  39. The Architecture
     • UI: WPF, MVVM, Caliburn.Micro
     • Core: File Agent, Poll Agent, Network Agent, Status Agent
     • Networking: CloudFiles, HttpClient
     • Storage: SQLite, SQL Server Compact
  40. Technologies
     • .NET 4, due to Windows XP compatibility
     • Visual Studio 2012 + Async Targeting Pack
     • UI - Caliburn.Micro
     • Concurrency - TPL, Parallel, Dataflow
     • Network - HttpClient
     • Hashing - OpenSSL (faster than the native provider for hashing)
     • Storage - NHibernate, SQLite/SQL Server Compact
     • Logging - log4net
  41. The challenges
     • Handle potentially hundreds of file events
     • Hashing of many/large files
     • Multiple slow calls to the server
     • Unreliable network
     • And yet it shouldn’t hang
     • Update the UI with enough information
  42. Events Handling
     • Use producer/consumer pattern
     • Store events in ConcurrentQueue
     • Process ONLY after idle timeout
  43. Merkle Hashing
     • Why I hate Game of Thrones
     • Asynchronous reading of blocks
     • Parallel Hashing of each block
     • Use of OpenSSL for its SSE support
     • Concurrency Throttling
     • Beware of memory consumption!
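A generic sketch of parallel block hashing with throttling. It is not the Pithos hashing scheme: the block size, the pairing rule for an odd node and the use of the built-in `SHA256` class (rather than OpenSSL) are all assumptions made for illustration, and the data is held in memory instead of being read asynchronously.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Threading.Tasks;

public static class MerkleSketch
{
    private static byte[] Sha256(byte[] data, int offset, int count)
    {
        // One hasher per call: a shared instance would not be thread-safe.
        using (var sha = SHA256.Create())
            return sha.ComputeHash(data, offset, count);
    }

    public static byte[] RootHash(byte[] data, int blockSize, int maxParallel)
    {
        int blockCount = Math.Max(1, (data.Length + blockSize - 1) / blockSize);
        var level = new byte[blockCount][];

        // Leaf level: hash every block in parallel, with a cap on concurrency.
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.For(0, blockCount, options, i =>
        {
            int offset = i * blockSize;
            int count = Math.Min(blockSize, data.Length - offset);
            level[i] = Sha256(data, offset, count);
        });

        // Upper levels: hash pairs of child hashes until a single root remains.
        while (level.Length > 1)
        {
            var next = new byte[(level.Length + 1) / 2][];
            for (int i = 0; i < next.Length; i++)
            {
                byte[] left = level[2 * i];
                // An odd node out is paired with itself (one convention among several).
                byte[] right = 2 * i + 1 < level.Length ? level[2 * i + 1] : left;
                byte[] pair = left.Concat(right).ToArray();
                next[i] = Sha256(pair, 0, pair.Length);
            }
            level = next;
        }
        return level[0];
    }
}
```

The degree of parallelism changes only how fast the leaves are hashed, not the resulting root. Lowering `maxParallel` is the throttle that keeps CPU use in check when many files are hashed at once.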
  44. Multiple slow calls
     • Each call a task
     • Concurrent REST calls per account and share
     • Task.WhenAll to process results
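The "each call a task" idea can be sketched as follows, assuming .NET 4.5 or later. `FetchSizeAsync` is a hypothetical stand-in for a slow REST call (it only waits and returns the length of the name). All calls are started first and then awaited together with `Task.WhenAll`.

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class SlowCalls
{
    // Hypothetical stand-in for a REST call: waits, then returns a fake size.
    private static async Task<int> FetchSizeAsync(string container)
    {
        await Task.Delay(20);
        return container.Length;
    }

    public static async Task<int> TotalSizeAsync(string[] containers)
    {
        // ToArray forces every call to start before any of them is awaited.
        Task<int>[] calls = containers.Select(c => FetchSizeAsync(c)).ToArray();
        int[] sizes = await Task.WhenAll(calls);
        return sizes.Sum();
    }
}
```

Because the calls overlap, the total wait is roughly that of the slowest call rather than the sum of all of them.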
  45. Unreliable network
     • Use System.Net.Http.HttpClient
     • Store blocks in a cache folder
     • Check and reuse orphans
     • Asynchronous Retry of calls
  46. Resilience to crashes
     • Use Transactional NTFS if available
       • Thanks MS for killing it!
     • Update a copy and File.Replace otherwise
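A minimal sketch of the "update a copy and File.Replace" fallback. The `SafeWrite` class and the `.tmp` naming scheme are assumptions for illustration; the idea is that a crash while writing damages only the temporary copy, never the original file.

```csharp
using System.IO;

public static class SafeWrite
{
    // Writes the new content to a temporary copy, then swaps it in.
    public static void WriteAllText(string path, string content)
    {
        string temp = path + ".tmp";   // hypothetical naming scheme
        File.WriteAllText(temp, content);

        if (File.Exists(path))
            File.Replace(temp, path, null);   // null: keep no backup of the old file
        else
            File.Move(temp, path);            // nothing to replace yet
    }
}
```

Passing a backup file name as the third argument of `File.Replace` would keep the previous version around as well.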
  47. Should not hang
     • Use of independent agents
     • Asynchronous operations wherever possible
  48. Provide sufficient user feedback
     • Use WPF, MVVM
     • Use Progress to update the UI
  49. Next Steps
     • Create Windows 8 Desktop and WinRT client
     • Use Reactive Framework
     • VOLUNTEERS WANTED
  50. Clever Tricks
     • Avoid Side Effects
     • Use Functional Style
     • Clean Coding
     • THE BIG SECRET: Use existing, tested algorithms
     • IEEE, ACM Journals and libraries
  51. YES TPL
     • Simplify asynchronous or parallel code
     • Use out-of-the-box libraries
     • Scenarios that SUIT Task or Data Parallelism
  52. NO TPL
     • To accelerate “bad” algorithms
     • To “accelerate” database access
       • Use proper SQL and Indexes!
       • Avoid Cursors
     • Reporting DBs, Data Warehouse, OLAP Cubes
  53. When TPL is not enough
     • Functional languages like F#, Scala
     • Distributed Frameworks like Hadoop, {m}brace
  54. Books
     • C# 5 in a Nutshell, O’Reilly
     • Parallel Programming with .NET, Microsoft
     • Pro Parallel Programming with C#, Wiley
     • Concurrent Programming on Windows, Pearson
     • The Art of Concurrency, O’Reilly
  55. Useful Links
     • Parallel FX Team: http://blogs.msdn.com/b/pfxteam/
     • IEEE Computer Society: http://www.computer.org
     • ACM: http://www.acm.org
