Overview Of Parallel Development - Ericnel


VBUG Newcastle delivery 24th February 2009 by Eric Nelson

Published in: Technology
  • 06/08/09 02:01 © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

    1. Overview of Parallel Development
       Eric Nelson
         • http://geekswithblogs.net/iupdateable
         • http://blogs.msdn.com/goto100
         • http://twitter.com/ericnel
    2. Agenda
         • Overview of what we are up to
         • Drill down into parallel programming for managed developers
    3. Things I learnt...
         • We have a very large investment in parallel computing
             • We have “something for everyone”
             • It is not all synced, and it sometimes overlaps
         • It is a big topic
             • Managed vs native vs client vs server vs task vs data...
         • Even with the investment, design/code/test for parallel is far harder
             • Locking, deadlocks, livelocks
         • It is about getting ready for the future
             • Code today, run better tomorrow?
         • VS2010 CTP: not a great place for parallel
             • Single core in guest
             • Unsupported route to use Hyper-V
         • Easiest route to dabble: Microsoft Parallel Extensions June CTP for VS2008
    4. Buying a new Processor
         • £100 - £300
         • 2-3GHz
         • 2 cores or 4
         • 64-bit
       [Diagram: two cores]
    5. Buying a new Processor
         • £200 - £500
         • 2-3GHz
         • 4 cores with HT
         • 64-bit
       [Diagram: four cores, QuickPath Interconnect, Memory Controller]
    6. Where will it all end?
       [Picture: Unisys ES7000 (7600R), used with kind permission of Mr Henk van der Valk, Unisys, NL]
    7. Was it a wise purchase?
       [Diagram: Windows OS hosting App 1, App 2, ...; inside App 1: .NET CLR, .NET Framework, My Code]
    8. Was it a wise purchase?
         • Some environments scale to take advantage of additional CPU cores (mostly server-side)
         • A lot of code does not (mostly client-side)
             • This code will see little benefit from future hardware advances
       [Diagram: ASP.NET Web Forms/Services, WCF Services, WF Engine, ... running on the .NET ThreadPool or a custom threading strategy]
    9. What happened to “The Free Lunch”?
         • Bad sequential code will run faster on a faster processor
         • Just using parallel code is not enough
         • Bad parallel code WILL NOT run faster on more cores
    10. Applications Can Scale Well
         • Graphics Rendering
         • Physical Simulation
         • Vision
         • Data Mining
         • Analytics
    11. What's The Problem?
         • Multithreaded programming is “hard” today
             • Doable by only a subgroup of senior specialists
             • Parallel patterns are not prevalent, well known, nor easy to implement
             • So many potential problems
                 • Races, deadlocks, livelocks, lock convoys, cache coherency overheads, lost event notifications, broken serializability, priority inversion, and so on…
         • Businesses have little desire to “go deep”
             • Best developers should focus on business value, not concurrency
             • Need simple ways to allow all developers to write concurrent code
    12. void MatrixMult(int size, double** m1, double** m2, double** result)
        {
            for (int i = 0; i < size; i++) {
                for (int j = 0; j < size; j++) {
                    result[i][j] = 0;
                    for (int k = 0; k < size; k++) {
                        result[i][j] += m1[i][k] * m2[k][j];
                    }
                }
            }
        }
    13. // Needs <windows.h> and <thread>; NUMPROCS is assumed to be defined elsewhere.
        void MatrixMult(int size, double** m1, double** m2, double** result) {
            int N = size;
            int P = 2 * NUMPROCS;
            int Chunk = N / P;
            HANDLE hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
            volatile long counter = P;
            for (int c = 0; c < P; c++) {
                std::thread t([&, c] {
                    for (int i = c * Chunk;
                         i < (c + 1 == P ? N : (c + 1) * Chunk); i++) {
                        for (int j = 0; j < size; j++) {
                            result[i][j] = 0;
                            for (int k = 0; k < size; k++) {
                                result[i][j] += m1[i][k] * m2[k][j];
                            }
                        }
                    }
                    // InterlockedDecrement takes the address of the counter.
                    if (InterlockedDecrement(&counter) == 0)
                        SetEvent(hEvent);
                });
                t.detach();  // a joinable std::thread must not be destroyed
            }
            WaitForSingleObject(hEvent, INFINITE);
            CloseHandle(hEvent);
        }
        Callouts on the slide: Synchronization Knowledge; Error prone; Heavy synchronization; Static partitioning; Lack of thread reuse; Tricks; Lots of boilerplate
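For comparison with slide 13, here is a minimal sketch of the same hand-rolled approach in portable standard C++, replacing the Win32 event and interlocked counter with `std::thread::join`. The `Matrix` alias, the function name `MatrixMultThreads`, and the `workers` parameter are assumptions for illustration and do not come from the deck. The sketch still shows static partitioning and a lack of thread reuse, which are two of the problems the slide calls out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical helper type

// Multiplies two square matrices, giving each thread a fixed block of
// result rows (static partitioning). Threads write disjoint rows, so no
// locking is needed.
Matrix MatrixMultThreads(const Matrix& m1, const Matrix& m2, unsigned workers) {
    const std::size_t n = m1.size();
    assert(m2.size() == n && workers > 0);
    Matrix result(n, std::vector<double>(n, 0.0));
    const std::size_t chunk = (n + workers - 1) / workers;  // rows per thread, rounded up

    std::vector<std::thread> threads;
    for (unsigned c = 0; c < workers; ++c) {
        const std::size_t begin = std::min(n, c * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // more workers than rows
        threads.emplace_back([&, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                for (std::size_t j = 0; j < n; ++j)
                    for (std::size_t k = 0; k < n; ++k)
                        result[i][j] += m1[i][k] * m2[k][j];
        });
    }
    for (auto& t : threads) t.join();  // wait for every block to finish
    return result;
}
```

Calling `MatrixMultThreads(a, b, std::thread::hardware_concurrency())` would be one way to pick the worker count, though that function may return 0 on some platforms.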
    14. Microsoft Parallel Computing Technologies
        [Diagram: a grid with Task Concurrency vs Data Parallelism on one axis and Local Computing vs Distributed/Cloud Computing on the other]
        Example workloads shown on the grid:
         • Robotics-based manufacturing assembly line
         • Silverlight Olympics viewer
         • Enterprise search, OLTP, collab
         • Animation / CGI rendering
         • Weather forecasting
         • Seismic monitoring
         • Oil exploration
         • Automotive control system
         • Internet-based photo services
         • Ultrasound imaging equipment
         • Media encode/decode
         • Image processing/enhancement
         • Data visualization
        Technologies shown on the grid: CCR; Maestro; TPL / PPL; Cluster TPL; Cluster PLINQ; MPI / MPI.Net; WCF; Cluster SOA; WF; PLINQ; TPL / PPL; CDS; OpenMP; WF; Compute Shader
    15. Visual Studio 2010: Tools / Programming Models / Runtimes
        [Diagram: layered stack with a managed and a native column]
         • Integrated Tooling / Tools: Parallel Debugger Tool, Profiler Concurrency Analysis
         • Programming Models, Managed Library: Task Parallel Library, PLINQ, Data Structures
         • Programming Models, Native Library: Parallel Pattern Library, Agents Library, Data Structures
         • Concurrency Runtime (shown for both columns): Task Scheduler, Resource Manager; ThreadPool
         • Operating System: Threads
    16. Explicit Tasking Support
         • .NET 4.0
             • Task Parallel Library
             • Task, TaskFactory
             • Parallel.For
             • Parallel.ForEach
             • Parallel.Invoke
             • Concurrent data structures
         • Visual Studio 2010 C++
             • Parallel Pattern Library
             • task, task_group
             • parallel_for
             • parallel_for_each
             • parallel_invoke
             • Concurrent data structures
             • Primitives for message passing
             • User-mode locks
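To give a feel for what an "invoke" style construct does, here is a minimal sketch in standard C++ that starts several independent actions and waits for all of them. It uses `std::async` as a stand-in and is not the TPL `Parallel.Invoke` or the PPL `parallel_invoke` implementation. The function name `InvokeAll` is hypothetical. Unlike the libraries on the slide, this sketch starts one thread per action rather than scheduling tasks onto a pool.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <vector>

// Starts every action asynchronously and returns once all have finished.
// An exception thrown by an action is rethrown from the matching get().
void InvokeAll(const std::vector<std::function<void()>>& actions) {
    std::vector<std::future<void>> pending;
    pending.reserve(actions.size());
    for (const auto& action : actions) {
        assert(action);  // an empty std::function cannot be called
        pending.push_back(std::async(std::launch::async, action));
    }
    for (auto& f : pending) f.get();  // block until each action completes
}
```

The caller writes no thread, event, or counter code. That is the contrast with the manual version on slide 13.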
    17. Task Parallel Library (TPL)
    18. Task
        [Demo: from no threading, to threading, to tasks]
    19. User Mode Scheduler
        [Diagram: a Program Thread feeds the CLR Thread Pool's Global Queue, which is drained by Worker Thread 1 ... Worker Thread p]
    20. User Mode Scheduler For Tasks
        [Diagram: CLR Thread Pool with work-stealing. A Program Thread feeds the Global Queue; Worker Thread 1 ... Worker Thread p each have a Local Queue; Task 1 to Task 6 are spread across the queues]
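The work-stealing idea in the diagram can be sketched as follows: each worker takes tasks from its own local queue first and, when that is empty, takes one from another worker's queue. This is a simplified illustration in standard C++, not the CLR thread pool's algorithm. The names `LocalQueue` and `RunWithStealing` are hypothetical, the global queue is left out, and tasks are assumed not to create further tasks. Mutexes guard the queues to keep the sketch short.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

using Task = std::function<void()>;

// One worker's local queue, guarded by a mutex for simplicity.
struct LocalQueue {
    std::mutex lock;
    std::deque<Task> tasks;

    bool PopBack(Task& out) {  // the owner takes its newest task
        std::lock_guard<std::mutex> guard(lock);
        if (tasks.empty()) return false;
        out = std::move(tasks.back());
        tasks.pop_back();
        return true;
    }
    bool StealFront(Task& out) {  // a thief takes the oldest task
        std::lock_guard<std::mutex> guard(lock);
        if (tasks.empty()) return false;
        out = std::move(tasks.front());
        tasks.pop_front();
        return true;
    }
};

// Runs the preloaded tasks with one worker thread per queue and returns
// how many tasks were stolen from another worker's queue.
std::size_t RunWithStealing(std::vector<LocalQueue>& queues) {
    assert(!queues.empty());
    std::atomic<std::size_t> stolen{0};
    std::vector<std::thread> workers;
    for (std::size_t me = 0; me < queues.size(); ++me) {
        workers.emplace_back([&queues, &stolen, me] {
            Task task;
            for (;;) {
                if (queues[me].PopBack(task)) { task(); continue; }
                bool found = false;
                for (std::size_t other = 0; other < queues.size() && !found; ++other) {
                    if (other != me && queues[other].StealFront(task)) found = true;
                }
                if (!found) return;  // all queues empty and tasks never add work
                ++stolen;
                task();
            }
        });
    }
    for (auto& w : workers) w.join();
    return stolen.load();
}
```

Owners pop from the back and thieves take from the front, which is a common convention in work-stealing designs. The number of tasks stolen in any given run depends on thread timing.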
    21. Debugger Support
         • Support both managed and native
         • Parallel Tasks
         • Parallel Stacks
    22. Higher Level Constructs
         • Even with Task there are common patterns that build into higher level abstractions
         • The Parallel class
             • Invoke, For, For<T>, ForEach
         • Care needs to be taken with state, ordering
             • “This is not your Father’s for loop”
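The warning about state can be illustrated with a parallel sum. If every iteration added to one shared variable, the loop would contain a data race. The sketch below instead gives each task a private partial total and combines the totals at the end. It is written in standard C++ with `std::async` as a stand-in for the Parallel class, and the function name `ParallelSum` and the `tasks` parameter are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Sums `values` by giving each task its own slice and its own local total,
// so no two tasks ever write to the same variable.
long long ParallelSum(const std::vector<int>& values, std::size_t tasks) {
    assert(tasks > 0);
    const std::size_t chunk = (values.size() + tasks - 1) / tasks;  // rounded up
    std::vector<std::future<long long>> partials;
    for (std::size_t begin = 0; begin < values.size(); begin += chunk) {
        const std::size_t end = std::min(values.size(), begin + chunk);
        partials.push_back(std::async(std::launch::async, [&values, begin, end] {
            return std::accumulate(values.begin() + begin, values.begin() + end, 0LL);
        }));
    }
    long long total = 0;
    for (auto& p : partials) total += p.get();  // combined on the calling thread
    return total;
}
```

Integer addition gives the same total in any order. An operation where order matters, such as appending to a list, would need more care. That is the "ordering" half of the slide's warning.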
    23. Parallel
        [Demo: Parallel.ForEach, Parallel.Invoke]
    24. Declarative Data Parallelism
         • Parallel LINQ-to-Objects (PLINQ)
             • Enables LINQ devs to leverage multiple cores
             • Fully supports all .NET standard query operators
             • Minimal impact to existing LINQ model

        var q = from p in people
                where p.Name == queryInfo.Name &&
                      p.State == queryInfo.State &&
                      p.Year >= yearStart &&
                      p.Year <= yearEnd
                orderby p.Year ascending
                select p;
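To show roughly what a data-parallel version of the query above has to do, here is a sketch in standard C++: filter slices of the input concurrently, merge the matches, then order them by year. It illustrates the general technique only and does not reflect how PLINQ is implemented. The `Person` struct, the function name `QueryPeople`, and the `tasks` parameter are hypothetical; the field names mirror the slide's query.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical record type; the fields mirror the slide's query.
struct Person {
    std::string name;
    std::string state;
    int year;
};

// Filters `people` in parallel slices, then orders the matches by year.
std::vector<Person> QueryPeople(const std::vector<Person>& people,
                                const std::string& name, const std::string& state,
                                int yearStart, int yearEnd, std::size_t tasks) {
    assert(tasks > 0 && yearStart <= yearEnd);
    const std::size_t chunk = (people.size() + tasks - 1) / tasks;  // rounded up
    std::vector<std::future<std::vector<Person>>> slices;
    for (std::size_t begin = 0; begin < people.size(); begin += chunk) {
        const std::size_t end = std::min(people.size(), begin + chunk);
        slices.push_back(std::async(std::launch::async, [&, begin, end] {
            std::vector<Person> matches;  // private to this task
            for (std::size_t i = begin; i < end; ++i) {
                const Person& p = people[i];
                if (p.name == name && p.state == state &&
                    p.year >= yearStart && p.year <= yearEnd) {
                    matches.push_back(p);
                }
            }
            return matches;
        }));
    }
    std::vector<Person> result;
    for (auto& s : slices) {  // merge the slices in input order
        std::vector<Person> part = s.get();
        result.insert(result.end(), part.begin(), part.end());
    }
    std::stable_sort(result.begin(), result.end(),
                     [](const Person& a, const Person& b) { return a.year < b.year; });
    return result;
}
```

The declarative query on the slide keeps all of this partitioning and merging out of the developer's code.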
    25. Parallel LINQ
    26. What Next?
         • Download VS 2010 CTP
             • Remember to set the clock back
         • Or download Parallel Extensions June CTP for VS2008
         • Experiment with runtime and API
             • Team is working on Visual Studio 2010 beta
             • Very open to feedback
             • Join in the discussion forums
             • http://blogs.msdn.com/pfxteam/
    27. Parallel Computing Resources
         • Downloads, Binaries, Code, Forums, Blogs, Videos, Screencasts, Podcasts, Articles, Samples
             • http://msdn.com/concurrency
             • http://blogs.msdn.com/pfxteam/