Architecting Solutions for the Manycore Future


This talk focuses solution architects on thinking about parallelism when designing applications and solutions, specifically Threads vs. Tasks using the Task Parallel Library (TPL), LINQ vs. PLINQ, and object-oriented vs. functional programming techniques. It also compares programming languages, how they differ when dealing with manycore programming, and the different advantages of these languages. Demonstrations include C#, VB, and F# features for functional programming, LINQ, and the TPL. A demonstration of the Concurrency Visualizer in Visual Studio 2010 is also included.

  • Speaker note: ENUF = Elements Needed Up Front (see slide 81)


  • 1. Architecting Solutions for the Manycore Future
    Talbott Crowell
  • 2. This talk will focus solution architects toward thinking about parallelism when designing applications and solutions
    Threads vs. Tasks using TPL
    LINQ vs. PLINQ
    Object Oriented vs. Functional Programming
    This talk will also compare programming languages, how languages differ when dealing with manycore programming, and the different advantages of these languages.
  • 3. Patrick Gelsinger, Intel VP
    February 2001, San Francisco, CA
    2001 IEEE International Solid-State Circuits Conference (ISSCC)
    If scaling continued at the present pace, by 2005 high-speed processors would have the power density of a nuclear reactor, by 2010 a rocket nozzle, and by 2015 the surface of the sun.
    Intel's stock dropped 8% the next day
    “Business as usual will not work in the future.”
  • 4. The Power Wall: CPU Clock Speed
    Single core
    From Katherine Yelick’s “Multicore: Fallout of a Hardware Revolution”
  • 5. In 1965, Gordon Moore predicted exponential growth in the number of transistors per chip, based on the trend from 1959 to 1965
    Clock frequencies continued to increase exponentially until they hit the power wall in 2004 at around 3 to 4 GHz
    1971, Intel 4004 (first single-chip CPU) – 740 kHz
    1978, Intel 8086 (origin of x86) – 4.77 MHz
    1985, Intel 80386DX – 16 MHz
    1993, Pentium P5 – 66 MHz
    1998, Pentium II – 450 MHz
    2001, Pentium III (Tualatin) – 1.4 GHz
    2004, Pentium 4F – 3.6 GHz
    2008, Core i7 (Extreme) – 3.3 GHz
    Intel is now doubling cores along with other improvements to continue to scale
    Effect of the Power Wall
    This trend continues even today
    The Power Wall
    Enter Manycore
  • 6. Manycore, What is it?
    Manycore, Why should I care?
    Manycore, What do we do about it?
    Task Parallel Library (Reactive Extensions and .NET 4)
    Languages, paradigms, and language extensions
    F#, functional programming, LINQ, PLINQ
    Visual Studio 2010 Tools for Concurrency
    Agenda: Manycore Future
  • 7. What is Manycore?
  • 8. Single core: 1 processor on a chip die (1 socket)
    Many past consumer and server CPUs (and some current CPUs for lightweight, low-power devices)
    Including CPUs that support hyperthreading, though this is a grey area
    Multicore: 2 to 8 core processors per chip/socket
    AMD Athlon 64 X2 (first dual-core desktop CPU, released in 2005)
    Intel Core Duo, 2006 (32-bit, dual core, for laptops only)
    Core Solo was a dual-core chip with one core disabled
    Intel Core 2 (not a core count, but a brand for the 64-bit architecture)
    Core 2 Solo (1 core)
    Core 2 Duo (2 cores)
    Core 2 Quad (4 cores)
    Manycore: more than 8 cores per chip
    Currently prototypes and R&D
    Manycore, What is it?
  • 9. High-end Servers 2001-2004
    IBM Servers 2001 - IBM POWER4 PowerPC for AS/400 and RS/6000 “world's first non-embedded dual-core processor”
    Sun Servers 2004 - UltraSPARC IV – “first multicore SPARC processor”
    Desktops/Laptops 2005-2006
    AMD Athlon 64 X2 (Manchester) May 2005 “first dual-core desktop CPU”
    Intel Core Duo, Jan 2006
    Intel Pentium (Allendale) dual core Jan 2007
    Windows Servers 2006
    Intel Xeon (Paxville) dual core Dec 2005
    AMD Opteron (Denmark) dual core March 2006
    Intel Itanium 2 (Montecito) dual core July 2006
    Sony Playstation 3 – 2006
    9 core Cell Processor (only 8 operational) - Cell architecture jointly developed by Sony, Toshiba, and IBM
    Multicore trends from servers to gaming consoles
  • 10. Power Mac G5 - Mid 2003
    2 x 1 core (single core) IBM PowerPC 970
    Mac Pro - Mid 2006
    2 x 2 core (dual core) Intel Xeon (Woodcrest)
    Mac Pro - Early 2008
    2 x 4 core (quad core) Intel Xeon (Harpertown)
    In 5 years, the number of cores doubled twice on Apple’s high-end graphics workstation
    From 2 to 4 to 8
    Macintosh multicore trend
  • 11. The chip is just designed for research efforts at the moment, according to an Intel spokesperson.
    "There are no product plans for this chip. We will never sell it so there won't be a price for it," the Intel spokesperson noted in an e-mail. "We will give about a hundred or more to industry partners like Microsoft and academia to help us research software development and learn on a real piece of hardware, [of] which nothing of its kind exists today."
    Microsoft said it had already put SCC into its development pipeline so it could exploit it in the future.
    48 Core Single-chip Cloud Computer (SCC)
  • 12. Why should I care?
    (about Manycore)
  • 13. Hardware is changing
    Programming needs to change to take advantage of new hardware
    Concurrent Programming
    Paradigm Shift
    Designing applications
    Developing applications
    Manycore, Why should I care?
  • 14. “The computer industry is once again at a crossroads. Hardware concurrency, in the form of new manycore processors, together with growing software complexity, will require that the technology industry fundamentally rethink both the architecture of modern computers and the resulting software development paradigms.”
    Craig Mundie, Chief Research and Strategy Officer, Microsoft Corporation, June 2008
    First paragraph of the foreword of Joe Duffy’s preeminent tome “Concurrent Programming on Windows”
    Concurrent Programming
  • 15. Excerpt from Mark Reinhold’s Blog post: November 24, 2009
    The free lunch is over.
    Multicore processors are not just coming—they’re here.
    Leveraging multiple cores requires writing scalable parallel programs, which is incredibly hard.
    Tools such as fork/join frameworks based on work-stealing algorithms make the task easier, but it still takes a fair bit of expertise and tuning.
    Bulk-data APIs such as parallel arrays allow computations to be expressed in terms of higher-level, SQL-like operations (e.g., filter, map, and reduce) which can be mapped automatically onto the fork-join paradigm.
    Working with parallel arrays in Java, unfortunately, requires lots of boilerplate code to solve even simple problems.
    Closures can eliminate that boilerplate.
    “It’s time to add them to Java.”
    “There’s not a moment to lose!”
  • 16. Herb Sutter 2005
    Programs are not doubling in speed every couple of years for free anymore
    We need to start writing code to take advantage of many cores
    Currently painful and problematic to take advantage of many cores because of shared memory, locking, and other imperative programming techniques
    “The Free Lunch Is Over”
  • 17. Is this just hype?
    Another Y2K scare?
    CPUs are changing
    Programmers will learn to exploit new architectures
    Will you be one of them?
    Wait and see?
    You could just wait and let the tools catch up so you don’t have to think about it. Will that strategy work?
    Should you be concerned?
  • 18. Tools and frameworks alone will not solve the manycore problem
    Imperative programming by definition has limitations scaling in a parallel way
    Imperative programming (C, C++, VB, Java, C#)
    Requires locks and synchronization code to handle shared memory read/write transactions
    Not trivial
    Difficult to debug
    Tools and frameworks may help, but really taking advantage of them will require a different approach to the problem (a different paradigm)
    The Core Problem
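    As a concrete illustration of the locking burden described above, here is a minimal F# sketch (not from the slides; F# is one of the talk's demo languages): two tasks updating a shared counter, correct only because every update is wrapped in a lock.

    open System.Threading.Tasks

    let mutable counter = 0
    let gate = obj ()

    // Each task increments the shared counter 100,000 times.
    // Remove the lock and the final count becomes unpredictable (a race).
    let increment () =
        for _ in 1 .. 100000 do
            lock gate (fun () -> counter <- counter + 1)

    let t1 = Task.Factory.StartNew(fun () -> increment ())
    let t2 = Task.Factory.StartNew(fun () -> increment ())
    Task.WaitAll(t1, t2)
    printfn "%d" counter  // 200000, but only because of the lock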
  • 19. Some frameworks are designed to be single threaded, such as ASP.NET
    Best practices for ASP.NET applications recommend avoiding spawning new threads
    ASP.NET and IIS handle the multithreading and multiprocessing to take advantage of the many processors (and now many cores) on Web Servers and Application Servers
    Will this best practice remain true?
    Even when server CPUs have hundreds or thousands of cores?
    Will it affect all programmers?
  • 20. What do we do about it?
    (How do we prepare for Manycore)
  • 21. Identify where the dependencies are
    Identify where you can parallelize
    Understand the tools, techniques, and approaches for solving the pieces
    Put them together to understand overall performance
    POC – Proof of Concept
    Test, test, test
    Performance goals up front
    Understand Problem Domain
  • 22. Frameworks
    Task Parallel Library (TPL)
    Reactive Extensions for .NET 3.5 (Rx)
    Used to be called Parallel Extensions or PFx
    Baked into .NET 4
    Programming paradigms, languages, and language extensions
    Functional programming
    LINQ and PLINQ
    Visual Studio 2010 Tools for Concurrency
    Manycore, What do we do about it?
  • 23. Parallelism vs. Concurrency
    Task vs. Data Parallelism
    Parallel Programming Concepts
  • 24. Concurrency or Concurrent computing
    Many independent requests
    Web Server, works on multi-threaded single core CPU
    Separate processes that may be executed in parallel
    More general than parallelism
    Parallelism or Parallel computing
    Processes are executed in parallel simultaneously
    Only possible with multiple processors or multiple cores
    Yuan Lin compares it to black-and-white vs. color photography: one is not a superset of the other
    Parallelism vs. Concurrency
  • 25. Task Parallelism (aka function parallelism and control parallelism)
    Distributing execution processes (threads/functions/tasks) across different parallel computing nodes (cores)
    Data Parallelism (aka loop-level parallelism)
    Distributing data across different parallel computing nodes (cores)
    Executing same command over every element in a data structure
    Task vs. Data Parallelism
    See MSDN for .NET 4, Parallel Programming, Data/Task Parallelism
  • 26. Task Parallel Library
  • 27. Parallel Programming in the .NET Framework 4 Beta 2 - TPL
  • 28. Reference System.Threading
    Use Visual Studio 2010 or .NET 4
    For Visual Studio 2008
    Download unsupported version for .NET 3.5 SP1 from Reactive Extensions for .NET (Rx)
    Create a “Task”
    How to use the TPL
    FileStream fs = new FileStream(fileName, FileMode.CreateNew);
    var task = Task.Factory.FromAsync(fs.BeginWrite, fs.EndWrite, bytes, 0, bytes.Length, null);
  • 29. Use Task class
    Task Parallelism with the TPL
    // Create a task and supply a user delegate
    // by using a lambda expression.
    var taskA = new Task(() =>
        Console.WriteLine("Hello from taskA."));
    // Start the task.
    taskA.Start();
    // Output a message from the calling thread.
    Console.WriteLine("Hello from the calling thread.");
  • 30. Task<TResult>
    Getting return value from a Task
    Task<double>[] taskArray = new Task<double>[]
    {
        Task<double>.Factory.StartNew(() => DoComputation1()),
        // May be written more conveniently like this:
        Task.Factory.StartNew(() => DoComputation2()),
        Task.Factory.StartNew(() => DoComputation3())
    };
    double[] results = new double[taskArray.Length];
    for (int i = 0; i < taskArray.Length; i++)
        results[i] = taskArray[i].Result;
  • 31. A Task resembles a new thread or ThreadPool work item, but at a higher level of abstraction
    Tasks provide two primary benefits over Threads:
    More efficient and scalable use of system resources
    More programmatic control than is possible with a thread or work item
    Tasks vs. Threads
  • 32. Behind the scenes, tasks are queued to the ThreadPool
    The ThreadPool is now enhanced with algorithms (like hill-climbing) that determine and adjust the number of threads to maximize throughput.
    Tasks are relatively lightweight
    You can create many of them to enable fine-grained parallelism.
    To complement this, widely-known work-stealing algorithms are employed to provide load-balancing.
    Tasks and the framework built around them provide a rich set of APIs that support waiting, cancellation, continuations, robust exception handling, detailed status, custom scheduling, and more.
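    To make those APIs concrete, a small sketch (illustrative, not from the slides; shown in F#, one of the talk's languages): create a task, attach a continuation, and wait on it.

    open System.Threading.Tasks

    // Start a task that computes a value.
    let t = Task.Factory.StartNew(fun () -> 21 * 2)

    // Attach a continuation that runs when the task completes.
    let follow =
        t.ContinueWith(fun (prev: Task<int>) ->
            printfn "Result: %d" prev.Result)

    // Wait for the continuation to finish.
    follow.Wait()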
  • 33. Instead of:
    Data Parallelism with the TPL
    for (int i = 0; i < matARows; i++) {
        for (int j = 0; j < matBCols; j++) {
            // ... work on element [i, j] ...
        }
    }
    use:
    Parallel.For(0, matARows, i => {
        for (int j = 0; j < matBCols; j++) {
            // ... work on element [i, j] ...
        }
    }); // Parallel.For
  • 34. Use Tasks not Threads
    Use Parallel.For in Data Parallelism scenarios
    Use Async Workflows from F#, covered later
    Use PLINQ, covered later
    TPL Summary
  • 35. Functional Programming
  • 36. 1930s: lambda calculus (the roots)
    1956: IPL (Information Processing Language) “the first functional language”
    1958: LISP “a functional flavored language”
    1962: APL (A Programming Language)
    1973: ML (Meta Language)
    1983: SML (Standard ML)
    1987: Caml (Categorical Abstract Machine Language) and Haskell
    1996: OCaml (Objective Caml)
    2005: F# introduced to public by Microsoft Research
    2010: F# is “productized” in the form of Visual Studio 2010
    Functional programming has been around a long time (over 50 years)
  • 37. Most functional languages encourage programmers to avoid side effects
    Haskell (a “pure” functional language) restricts side effects with a static type system
    A side effect
    Modifies some state
    Has observable interaction with calling functions
    Has observable interaction with the outside world
    Example: a function or method with no return value
    Functional programming is safe
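    A quick F# illustration of the distinction (not from the slides):

    // Pure: the result depends only on the input; no outside state is touched.
    let square x = x * x

    // Impure: calling this mutates state observable outside the function.
    let mutable total = 0
    let addToTotal x = total <- total + x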
  • 38. Language Evolution (Simon Peyton Jones)
    C#, VB, Java, and C are imperative programming languages. Very useful, but they can change the state of the world at any time, creating side effects.
    Nirvana! Useful and Safe
    Haskell is Very Safe, but not very useful. Used heavily in research and academia, but rarely in business.
  • 39. When a function changes the state of the program
    Write to a file (that may be read later)
    Write to the screen
    Changing values of variables in memory (global variables or object state)
    Side Effect
  • 40. Compare SQL to your favorite imperative programming language
    If you write a statement to store and query your data, you don’t need to specify how the system stores the data at a low level
    Example: table partitioning
    LINQ is an example of bringing functional programming to C# and VB through language extensions
    Functional Programming
  • 41. Use lots of processes
    Avoid side effects
    Avoid sequential bottlenecks
    Write “small messages, big computations” code
    Efficient Multicore Programming
    Source: Joe Armstrong’s “Programming Erlang, Software for a Concurrent World”
    Section 20.1 “How to Make Programs Run Efficiently on a Multicore CPU”
  • 42. F#
  • 43. Functional language developed by Microsoft Research
    By Don Syme and his team, who productized Generics
    Based on OCaml (influenced by C# and Haskell)
    2002: F# language design started
    2005 January: F# 1.0.1 released to the public
    Not a product. Integration with VS2003
    Works in .NET 1.0 through .NET 2.0 beta, Mono
    2005 November: F# 1.1.5 with VS 2005 RTM support
    2009 October: VS2010 Beta 2, CTP for VS2008 & Non-Windows users
    2010: F# is “productized” and baked into VS 2010
    What is F#
  • 44. Multi-Paradigm
    Functional Programming
    Imperative Programming
    Object Oriented Programming
    Language Oriented Programming
    F# is not just Functional
  • 45. Parallel Computing and PDC09
    [Stack diagram of Visual Studio 2010 / .NET 4 parallel computing: managed languages (Visual F#) and managed libraries (Task Parallel Library, Parallel LINQ, data structures) over the managed concurrency runtime (task scheduler, resource manager); native libraries (Parallel Pattern Library, data structures) over the native concurrency runtime (task scheduler, resource manager); tooling in Visual Studio 2010 (debugger windows, concurrency profiler, race detection); all on the operating system (Windows 7 / Server 2008 R2, UMS threads, HPC Server), plus research/incubation work.]
  • 46. Functional programming has been around a long time
    Not new
    Long history
    Functional programming is safe
    A concern as we head toward manycore and cloud computing
    Functional programming is on the rise
    Why another language?
  • 47. Parallel Programming in the .NET Framework 4 Beta 2 - PLINQ
  • 48. “F# is, technically speaking, neutral with respect to concurrency - it allows the programmer to exploit the many different techniques for concurrency and distribution supported by the .NET platform”
    F# FAQ:
    Functional programming is a primary technique for minimizing/isolating mutable state
    Asynchronous workflows let you write parallel programs in a “natural and compositional style”
    F# and Multi-Core Programming
  • 49. Interactive Scripting
    Good for prototyping
    Succinct = Less code
    Type Inference
    Strongly typed, strict (no dynamic typing)
    Automatic generalization (generics for free)
    Few type annotations
    First-class functions (currying, lazy evaluation)
    Pattern matching
    Key Characteristics of F#
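    A few illustrative one-liners (not from the slides) for these characteristics:

    // Type inference: no annotations, yet strongly typed (int -> int -> int).
    let add x y = x + y

    // Currying and partial application: functions are first-class values.
    let add5 = add 5

    // Pattern matching.
    let describe n =
        match n with
        | 0 -> "zero"
        | n when n < 0 -> "negative"
        | _ -> "positive"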
  • 50. Concurrent Programming with F#
  • 51. Luke Hoban at PDC 2009, F# Program Manager
  • 52. Demo – Imperative sumOfSquares
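    The demo code itself is not in the transcript; a minimal sketch of an imperative sumOfSquares in F# might look like this, with a mutable accumulator and an explicit loop:

    let sumOfSquares n =
        let mutable sum = 0
        for i in 1 .. n do
            sum <- sum + i * i
        sum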
  • 53. Difficult to turn existing sequential code into parallel code
    Must modify large portions of code to use threads explicitly
    Using shared state and locks is difficult
    Careful to avoid race conditions and deadlocks
    Two Problems Parallelizing Imperative Code
  • 54. Demo – Recursive sumOfSquares
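    Again a plausible sketch (the demo code is not in the transcript), this time recursive, so no mutable state:

    let rec sumOfSquares n =
        if n = 0 then 0
        else n * n + sumOfSquares (n - 1)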
  • 55. Declarative programming style
    Easier to introduce parallelism into existing code
    Immutability by default
    Can’t introduce race conditions
    Easier to write lock-free code
    Functional Programming
  • 56. Demo – Functional sumOfSquares
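    A plausible functional version (sketch, not the original demo code): a declarative pipeline over a sequence.

    let sumOfSquares n =
        seq { 1 .. n }
        |> Seq.map (fun x -> x * x)
        |> Seq.sum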
  • 57. From Seq to PSeq
    Matthew Podwysocki’s Blog
    Adding Parallel Extensions to F# for VS2010 Beta 2
    Talbott Crowell’s Developer Blog
    Parallel Extensions to F#
  • 58. Demo – Parallel sumOfSquares
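    A plausible parallel version (sketch, not the original demo code), assuming the PSeq module from the parallel extensions referenced above: the pipeline parallelizes by swapping Seq for PSeq.

    // PSeq is from the F# parallel extensions (Podwysocki's port / F# PowerPack);
    // the module's namespace here is an assumption.
    open Microsoft.FSharp.Collections

    let sumOfSquares n =
        seq { 1 .. n }
        |> PSeq.map (fun x -> x * x)
        |> PSeq.sum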
  • 59. Asynchronous Workflows
    Task Based Programming using TPL
    Reactive Extensions
    “The Reactive Extensions can be used from any .NET language. In F#, .NET events are first-class values that implement the IObservable<out T> interface. In addition, F# provides a basic set of functions for composing observable collections and F# developers can leverage Rx to get a richer set of operators for composing events and other observable collections.” — S. Somasegar, Senior Vice President, Developer Division
    F# Parallel Programming Options
  • 60. Problem
    Resize a ton of images
    Demo of Image Processor
    open System.IO
    open System.Drawing

    // ResizeImage and DestFileName are helpers defined elsewhere in the demo.
    let files = Directory.GetFiles(@"C:\images\original")
    for file in files do
        use image = Image.FromFile(file)
        use smallImage = ResizeImage(image)
        let destFileName = DestFileName("s1", file)
        smallImage.Save(destFileName)
  • 61. Asynchronous Workflows
    let FetchAsync (file: string) =
        async {
            use stream = File.OpenRead(file)
            let! bytes = stream.AsyncRead(int stream.Length)
            use memstream = new MemoryStream(bytes.Length)
            memstream.Write(bytes, 0, bytes.Length)
            memstream.Position <- 0L  // rewind before reading the image back
            use image = Image.FromStream(memstream)
            use smallImage = ResizeImage(image)
            let destFileName = DestFileName("s2", file)
            smallImage.Save(destFileName)
        }

    let tasks = [ for file in files -> FetchAsync(file) ]
    let parallelTasks = Async.Parallel tasks
    Async.RunSynchronously parallelTasks |> ignore
  • 62. Tomas Petricek: Using Asynchronous Workflows
  • 63. LINQ
    Language-Integrated Query
  • 64. LINQ: declaratively specify what you want done, not how you want it done
    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source
                   where Compute(num) > 0
                   select num;

    The imperative equivalent:

    var source = Enumerable.Range(1, 10000);
    var evenNums = new List<int>();
    foreach (var num in source)
        if (Compute(num) > 0)
            evenNums.Add(num);
  • 65. What will happen if I put a counter in Compute(num)?
    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source
                   where Compute(num) > 0
                   select num;

    private static int Compute(int num) {
        if (num % 2 == 0) return 1;
        return 0;
    }
  • 66. PLINQ
    (Parallel LINQ)
  • 67. Parallel Programming in the .NET Framework 4 Beta 2 - PLINQ
  • 68. LINQ: declaratively specify what you want done, not how you want it done
    Declaratively specify “As Parallel”
    Under the hood, the framework will implement “the how” using TPL and threads.
    PLINQ = Parallel LINQ
    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source
                   where Compute(num) > 0
                   select num;

    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source.AsParallel()
                   where Compute(num) > 0
                   select num;
  • 69. System.Linq.ParallelEnumerable
    The entry point for PLINQ. Specifies that the rest of the query should be parallelized, if it is possible.
  • 70. Visual Studio 2010
    Tools for Concurrency
  • 71. Stephen Toub at PDC 2009, Senior Program Manager on the Parallel Computing Platform
  • 72. Views enable you to see how your multi-threaded application interacts with
    Operating System
    Other processes on the host computer
    Provides graphical, tabular and textual data
    Shows the temporal relationships between
    the threads in your program
    and the system as a whole
    Concurrency Visualizer in Visual Studio 2010
  • 73. Performance bottlenecks
    CPU underutilization
    Thread contention
    Thread migration
    Synchronization delays
    Areas of overlapped I/O
    and other info…
    Use Concurrency Visualizer to Locate
  • 74. Concurrency Visualizer
    High level of contention during async execution
  • 75. CPU View
    Threads View (Parallel Performance)
    Cores View
  • 76. CPU View
    Async uses more of the CPU(s)/cores
    Sync uses 1 CPU/core
  • 77. Threads View
  • 78. Full test
    Close up of Sync
    Close up of Async
    Core View
  • 79. Tomas Petricek - F# Webcast (III.) - Using Asynchronous Workflows
    Luke Hoban - F# for Parallel and Asynchronous Programming
    More info on Asynchronous Workflows
  • 80. The Landscape of Parallel Computing Research: A View from Berkeley 2.0 by David Patterson
    Parallel Dwarfs
    More Research
  • 81. “The architect as we know him today is a product of the Renaissance.” (1)
    “But the medieval architect was a master craftsman (usually a mason or a carpenter by trade), one who could build as well as design, or at least ‘one trained in that craft even if he had ceased to ply his axe and chisel’(2).” (1)
    “Not only is he hands on, like the agile architect, but we also learn from Arnold that the great Gothic cathedrals of Europe were built, not with BDUF, but with ENUF” (3)
    (1). Dana Arnold, Reading Architectural History, 2002
    (2). D. Knoop & G. P. Jones, The Medieval Mason, 1933
    (3). Architects: Back to the future?, Ian Cooper 2008
    The Architect
  • 82. visit us at
    Thank you. Questions?Architecting Solutions for the Manycore Future
    Talbott Crowell
    Twitter: @Talbott and @fsug