.NET Multithreading and Parallelization


Transcript

  • 1. Multithreading and Parallelization Dmitri Nesteruk dmitrinesteruk@gmail.com | http://nesteruk.org/seminars
  • 2. Agenda Overview Multithreading PowerThreading (AsyncEnumerator) Multi-core parallelization Parallel Extensions to .NET Framework Multi-computer parallelization PureMPI.NET
  • 3. Why now? Manycore paradigm shift. CPU speed growth reaches production challenges (not at the limit yet). Processor features: Hyper-threading, SIMD
  • 4. CPU Scope Past: more transistors per chip (Yesterday: 1x-core). Present: more cores per chip (Today: 2x-core norm). Future: even more cores per chip; NUMA & other specialties (Tomorrow: 4x-32x-core?)
  • 5. Machine Scope Machine: most clients are concerned with one-machine use. Cluster: clustering helps leverage performance. Cloud: clouds
  • 6. Multithreading vs. Parallelization Multithreading Using threads/thread pool to perform async operations Explicit (# of threads known) Parallelization Implicit parallelization No explicit thread operation
  • 7. Ways to Parallelize/Multithread Managed: System.Threading, Parallel Extensions, libraries. Unmanaged: OpenMP, libraries. Specialized: GPGPU, FPGA
  • 8. Managed System.Threading. Libraries: Parallel Extensions (TPL + PLINQ), PowerThreading. Languages/frameworks: Sing#, CCR. Remoting, WCF, MPI.NET, PureMPI.NET, etc.: use over many machines
  • 9. Unmanaged OpenMP – #pragma directives in C++ code Intel multi-core libraries Threading Building Blocks (low-level) Integrated Performance Primitives Math Kernel Library (also has MPI support) MPI, PVM, etc. Use over many machines
  • 10. Specialized Ex. (Intrinsic Parallelization) GPU Computation (GPGPU) Calculations on graphic card Uses programmable pixel shaders See, e.g., NVidia CUDA, GPGPU.org FPGA Hardware-specific solutions E.g., in-socket accelerators Requires HDL programming & custom hardware
  • 11. Part I Multithreading: a look at AsyncEnumerator
  • 12. Multithreading Goals Do stuff concurrently Preserve safety/consistency Tools Threads ThreadPool Synchronization objects Framework async APIs
  • 13. A Look at Delegates Making delegate for function is easy Given void a() { … } – ThreadStart del = a; Given void a(int n) { … } – Action<int> del = a; Given float a(int n, double m) {…} – Func<int, double, float> del = a; Otherwise, make your own!
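The three delegate assignments on this slide can be sketched as a small program. The class name `DelegateDemo` and the methods `A`, `B`, and `C` are placeholders for illustration, not names from the talk.

```csharp
using System;
using System.Threading;

public static class DelegateDemo
{
    static int calls;

    static void A() { calls++; }                               // void, no parameters
    static void B(int n) { calls += n; }                       // void, one parameter
    static float C(int n, double m) { return (float)(n * m); } // returns a value

    public static float Run()
    {
        calls = 0;
        ThreadStart d1 = A;               // built-in delegate type for void()
        Action<int> d2 = B;               // Action<...> covers void methods
        Func<int, double, float> d3 = C;  // Func<..., TResult> covers methods with a result

        d1();                             // shorthand for d1.Invoke(); runs synchronously
        d2(2);
        return d3(3, 0.5) + calls;        // 1.5 + 3
    }
}
```

Calling a delegate with `d1()` is the synchronous `Invoke()` path; the asynchronous `BeginInvoke`/`EndInvoke` pair is covered on the next slides.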
  • 14. Delegate Methods Invoke(): synchronous, blocks your thread. BeginInvoke: executes in ThreadPool, returns IAsyncResult. EndInvoke: waits for completion, takes the IAsyncResult from BeginInvoke
  • 15. Usage Fire and forget – del.BeginInvoke(null, null); Fire, and wait until done – IAsyncResult ar = del.BeginInvoke(null,null); … del.EndInvoke(ar); Fire, and call a function when done – del.BeginInvoke(firedWhenDone, null); Callback parameter
  • 16. WaitAny and WaitAll To wait until either delegate completes – WaitHandle.WaitAny( new WaitHandle[] { ar1.AsyncWaitHandle, ar2.AsyncWaitHandle }); // wait until either completes To wait until all delegates complete Use WaitAll instead of WaitAny – requires an [MTAThread] (not supported on STA threads), use Pulse & Wait instead
  • 17. Example Execute a() and b() in parallel; wait on both ThreadStart delA = a; ThreadStart delB = b; IAsyncResult arA = delA.BeginInvoke(null, null); IAsyncResult arB = delB.BeginInvoke(null, null); WaitHandle.WaitAll(new [] { arA.AsyncWaitHandle, arB.AsyncWaitHandle });
  • 18. LINQ Example Execute a() and b() in parallel; wait on both WaitHandle.WaitAll( new [] { a, b } /* implicitly make an array of delegates */ .Select(f => f.BeginInvoke(null, null) /* call each delegate */ .AsyncWaitHandle) /* get a wait handle of each */ .ToArray()); /* convert from IEnumerable to array */
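Delegate `BeginInvoke` is only available on .NET Framework; on .NET Core and later it throws `PlatformNotSupportedException`. As a hedged alternative, the same "run both, wait on both" pattern can be sketched with `Task.Run` and `Task.WaitAll`. This swaps in a different technique from the one on the slide, and the class name `RunBothDemo` is hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class RunBothDemo
{
    public static int Run()
    {
        int done = 0;
        Action a = () => Interlocked.Increment(ref done);
        Action b = () => Interlocked.Increment(ref done);

        // Start each delegate on the thread pool and collect the tasks.
        Task[] tasks = new[] { a, b }.Select(f => Task.Run(f)).ToArray();

        // Block until both have completed.
        Task.WaitAll(tasks);
        return done;
    }
}
```

`Task.WaitAny(tasks)` is the counterpart for waiting until either one completes.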
  • 19. Asynchronous Programming Model (APM) Basic goal – IAsyncResult ar = del.BeginXXX(null,null); … del.EndXXX(ar); Supported by Framework classes, e.g., – FileStream – WebRequest
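A minimal sketch of the Begin/End pattern using `FileStream`, one of the Framework classes named on the slide. The class name `ApmDemo` and the method `ReadAll` are made up for illustration, and the sketch assumes a small file that arrives in one read.

```csharp
using System;
using System.IO;
using System.Text;

public static class ApmDemo
{
    public static string ReadAll(string path)
    {
        // useAsync: true asks for overlapped (asynchronous) I/O where the OS supports it.
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, 4096, useAsync: true))
        {
            var buffer = new byte[fs.Length];

            // Begin the operation; no callback and no state object in this sketch.
            IAsyncResult ar = fs.BeginRead(buffer, 0, buffer.Length, null, null);

            // ... other work could happen here ...

            // EndRead blocks until the read completes and returns the byte count.
            // A single read may return fewer bytes than requested; real code should loop.
            int read = fs.EndRead(ar);
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }
}
```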
  • 20. Difficulties Async calls do not always succeed Timeout Exceptions Cancelation Results in too many functions/anonymous delegates Async workflow code becomes difficult to read
  • 21. PowerThreading A free library from Wintellect (Jeffrey Richter). Get it at wintellect.com. Also check out PowerCollections. Resource locks: ReaderWriterGate. Async. prog. model: AsyncEnumerator, SyncGate. Other features: IO, State manager, NumaInformation :)
  • 22. AsyncEnumerator Simplifies APM programming No need to manually manage IAsyncResult cookies Fewer functions, cleaner code
  • 23. Usage patterns 1 async op → process X async ops → process all X async ops → process each one as it completes X async ops → process some, discard the rest X async ops → process some until cancellation/timeout occurs, discard the rest
  • 24. AsyncEnumerator Basics Has three methods Execute(IEnumerator<Int32>) BeginExecute EndExecute Also exists as AsyncEnumerator<T> when a return value is required
  • 25. Inside the Function internal IEnumerator<Int32> GetFile( AsyncEnumerator ae, string uri) { WebRequest wr = WebRequest.Create(uri); wr.BeginGetResponse(ae.End(), null); yield return 1; WebResponse resp = wr.EndGetResponse( ae.DequeueAsyncResult()); // use response }
  • 26. Signature Function must return IEnumerator<Int32>. Function must accept AsyncEnumerator as one of the parameters (order unimportant) – internal IEnumerator<Int32> GetFile( AsyncEnumerator ae, string uri)
  • 27. Callback Call the async BeginXXX() methods. Pass ae.End() as callback parameter – wr.BeginGetResponse(ae.End(), null);
  • 28. Yield Now yield return the number of pending asynchronous operations – yield return 1;
  • 29. Wait & Process Call the async EndXXX() methods. Pass ae.DequeueAsyncResult() as parameter – WebResponse resp = wr.EndGetResponse( ae.DequeueAsyncResult());
  • 30. Usage Init the enumerator – var ae = new AsyncEnumerator(); Use it, passing itself as a parameter – ae.Execute(GetFile( ae, "http://nesteruk.org"));
  • 31. Exception Handling Break out of function – try { resp = wr.EndGetResponse( ae.DequeueAsyncResult()); } catch (WebException e) { // process e yield break; } Propagate a parameter
  • 32. Discard Groups Sometimes, you want to ignore the result of some calls E.g., you already got the data elsewhere To discard a group of calls Use overloaded End(…) methods to specify Group number Cleanup delegate Call DiscardGroup(…) with group number
  • 33. Cancellation External code can cancel the iterator – ae.Cancel(…) Or specify a timeout – ae.SetCancelTimeout(…) Check whether iterator is cancelled with – ae.IsCanceled(…) just call yield break if it is
  • 34. Part II Parallel Extensions to .NET Framework TPL and PLINQ
  • 35. Parallelization Algorithms vary in how well one can parallelize them: some easily (e.g., matrix multiplication), some not so (e.g., matrix inversion), some not at all
  • 36. Parallel Extensions to .NET Framework (PFX) A library for parallelization Consists of Task Parallel Library Parallel LINQ (PLINQ) Currently in CTP stage Maybe in .NET 4.0?
  • 37. Task Parallel Library Features System.Linq: Parallel LINQ. System.Threading: implicit parallelism (Parallel.Xxx). System.Threading.Collections: thread-safe stack and queue. System.Threading.Tasks: task manager, tasks, futures
  • 38. System.Threading Implicit parallelization (Parallel.For and ForEach): Parallel.For | ForEach. Aggregate exceptions: AggregateException. Other useful classes: LazyInit<T>, WriteOnce<T>. Other goodies
  • 39. Parallel.For Parallelizes a for loop Instead of for (int i = 0; i < 10; ++i) { … } We write Parallel.For(0, 10, i => { … });
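A short sketch of the loop rewrite on this slide, using `Parallel.For` as it shipped in .NET 4 and later. The class name `ParallelForDemo` is hypothetical. Each iteration writes to its own array slot, so no locking is needed.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelForDemo
{
    public static int[] Squares(int n)
    {
        var result = new int[n];

        // Sequential version: for (int i = 0; i < n; ++i) { result[i] = i * i; }
        // Parallel version: iterations may run on different threads, in any order.
        Parallel.For(0, n, i => { result[i] = i * i; });

        return result;
    }
}
```

The upper bound is exclusive, matching `i < n` in the sequential loop.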
  • 40. Parallel.For Overloads Step size ParallelState for cancelation Thread-local initialization Thread-local finalization References to a TaskManager Task creation options
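To illustrate the thread-local initialization and finalization overload, here is a sketch against the released `Parallel.For<TLocal>` signature; the CTP overloads described on the slide may have differed. The class name `ThreadLocalSumDemo` is made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalSumDemo
{
    public static long Sum(int n)
    {
        long total = 0;

        Parallel.For<long>(0, n,
            () => 0L,                                   // thread-local initialization
            (i, state, local) => local + i,             // body: accumulate without locking
            local => Interlocked.Add(ref total, local)  // thread-local finalization
        );

        return total;
    }
}
```

Each worker accumulates into its own `local` value and touches the shared `total` only once, in the finalizer, which keeps contention low.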
  • 41. Parallel.ForEach Same features as Parallel.For except: no counters or steps; takes an IEnumerable<T>
  • 42. Cancelation Parallel.For takes an Action<Int32> delegate Can also take an Action<Int32, ParallelState> ParallelState keeps track of the state of parallel execution ParallelState.Stop() stops execution in all threads
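A sketch of stopping a parallel loop early. It assumes the released API, where the state object is called `ParallelLoopState` (the slide's `ParallelState` is the CTP name). The class `StopDemo` and the method `FindAny` are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StopDemo
{
    // Returns the index of some element equal to target, or -1 if none exists.
    public static int FindAny(int[] data, int target)
    {
        int found = -1;

        Parallel.For(0, data.Length, (i, state) =>
        {
            if (data[i] == target)
            {
                Interlocked.Exchange(ref found, i);
                state.Stop();   // ask all threads to stop starting new iterations
            }
        });

        return found;
    }
}
```

`Stop()` does not abort iterations that are already running; they can poll `state.IsStopped` if they need to exit early.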
  • 43. Parallel.For Exceptions The AggregateException class holds all exceptions thrown Created even if only one thread throws Used by both Parallel.Xxx and PLINQ Original exceptions stored in InnerExceptions property.
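The behaviour described here can be sketched as follows. The class name `AggregateDemo` is hypothetical, and the number of collected exceptions depends on how many iterations start before the loop shuts down.

```csharp
using System;
using System.Threading.Tasks;

public static class AggregateDemo
{
    // Returns how many exceptions the loop collected.
    public static int CountFailures()
    {
        try
        {
            Parallel.For(0, 4, i =>
            {
                throw new InvalidOperationException("iteration " + i);
            });
            return 0;
        }
        catch (AggregateException ex)
        {
            // The original exceptions are available individually.
            foreach (Exception inner in ex.InnerExceptions)
                Console.WriteLine(inner.Message);
            return ex.InnerExceptions.Count;
        }
    }
}
```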
  • 44. LazyInit<T> Lazy initialization of a single variable Options – AllowMultipleExecution Init function can be called by many threads, only one value published – EnsureSingleExecution Init function executed only once – ThreadLocal One init call & value per thread
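`LazyInit<T>` is the CTP name. As an assumption, this sketch uses the `Lazy<T>` class from released .NET, whose `LazyThreadSafetyMode.ExecutionAndPublication` appears to correspond to `EnsureSingleExecution` and `PublicationOnly` to `AllowMultipleExecution`; `ThreadLocal<T>` covers the per-thread case. The class name `LazyDemo` is hypothetical.

```csharp
using System;
using System.Threading;

public static class LazyDemo
{
    static int initCalls;

    // Returns how many times the init function ran.
    public static int Run()
    {
        initCalls = 0;

        var lazy = new Lazy<int>(
            () => { Interlocked.Increment(ref initCalls); return 42; },
            LazyThreadSafetyMode.ExecutionAndPublication);  // init runs at most once

        int first = lazy.Value;    // runs the init function
        int second = lazy.Value;   // returns the cached value

        return first == second ? initCalls : -1;
    }
}
```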
  • 45. WriteOnce<T> Single-assignment structure Just like Nullable: HasValue Value Also try methods TryGetValue TrySetValue
  • 46. Futures A future is the name of a value that will eventually be produced by a computation Thus, we can decide what to do with the value before we know it
  • 47. Futures of T • Future is a factory • Future<T> is the actual future (and also has factory methods) To make a future – var f = Future.Create(() => g()); To use a future Get f.Value The accessor does an async computation
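`Future.Create` is CTP syntax. In released .NET the same idea is expressed with `Task<TResult>`, so this sketch assumes that substitution. The names `FutureDemo` and `Compute` are placeholders.

```csharp
using System;
using System.Threading.Tasks;

public static class FutureDemo
{
    static int Compute() { return 6 * 7; }

    public static int Run()
    {
        // Start the computation; f stands for a value that will exist later.
        Task<int> f = Task.Run(() => Compute());

        // ... other work could happen here ...

        // Reading Result blocks until the value has been produced.
        return f.Result;
    }
}
```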
  • 48. Tasks & TaskManager A better Thread+ThreadPool combination TaskManager A very clever thread pool :) Adjusts worker threads to # of CPUs/cores Keeps all cores busy Task A unit of work May (or may not) run concurrently http://channel9.msdn.com/posts/DanielMoth/ParallelFX-Task-and-friends/
  • 49. Task Just like a future, a task takes an Action<T> – Task t = Task.Create(DoSomeWork); Overloads exist :) Fires off immediately. To wait on completion – t.Wait(); Unlike the thread pool, task manager will use as many threads as there are cores
  • 50. Parallel LINQ (PLINQ) Parallel evaluation in LINQ to Objects LINQ to XML Features IParallelEnumerable<T> ParallelEnumerable.AsParallel static method
  • 51. Example IEnumerable<T> data = ...; var q = data.AsParallel() .Where(x => p(x)) .OrderBy(x => k(x)) .Select(x => f(x)); foreach (var e in q) a(e);
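A runnable variant of the query above, with concrete lambdas standing in for the slide's `p`, `k`, and `f`. The class name `PlinqDemo` and the chosen predicate, key, and projection are assumptions for illustration.

```csharp
using System;
using System.Linq;

public static class PlinqDemo
{
    public static int[] Run(int[] data)
    {
        return data.AsParallel()
                   .Where(x => x % 2 == 0)   // p(x): keep even numbers
                   .OrderBy(x => -x)         // k(x): sort descending
                   .Select(x => x * x)       // f(x): square each element
                   .ToArray();
    }
}
```

Without `OrderBy` (or `AsOrdered`), PLINQ makes no promise about the order of results.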
  • 52. Part III Interprocess communication with PureMPI.NET
  • 53. Message Passing Interface An API for general-purpose IPC Works across cores & machines C++ and Fortran Some Intel libraries support explicitly http://www.mcs.anl.gov/research/projects/mpich2/
  • 54. PureMPI.NET A free library available at http://purempi.net Uses WCF endpoints for communication Uses MPI syntax Features A library DLL for WCF functionality An EXE for easy deployment over network
  • 55. How it works Your computers run a service that connects them together Your program exposes WCF endpoints You use the MPI interfaces to communicate
  • 56. Communicator & Rank A communicator is a group of computers. In most scenarios, you would have one group: the MPI_COMM_WORLD comm. Useful for determining whether we are the …
  • 57. Main static void Main(string[] args) { /* MPIEnvironment: see app.config; run MpiProcess on all machines */ using (ProcessorGroup processors = new ProcessorGroup("MPIEnvironment", MpiProcess)) { processors.Start(); /* start each one */ processors.WaitForCompletion(); /* wait on all */ } }
  • 58. Sending & Receiving Blocking or non-blocking methods Send/Receive (blocking) Begin|End Send/Receive (async) Invoked on the comm
  • 59. Send/Receive static void MpiProcess(IDictionary<string, Comm> comms) { /* get a default comm from dictionary */ Comm comm = comms["MPI_COMM_WORLD"]; if (comm.Rank == 0) { /* get a message from 1 (blocking) */ string msg = comm.Receive<string>(1, string.Empty); Console.WriteLine("Got " + msg); } else if (comm.Rank == 1) { /* send a message to 0 (also blocking) */ comm.Send(0, string.Empty, "Hello"); } }
  • 60. Extras Can use async ops Can send to all (Broadcast) Can distribute work and then collect it (Gather/Scatter)
  • 61. Thank You!