Architecting Solutions for the Manycore Future Talbott Crowell ThirdM
This talk steers solution architects toward thinking about parallelism when designing applications and solutions: Threads vs. Tasks using TPL, LINQ vs. PLINQ, and Object Oriented vs. Functional Programming. It also compares programming languages: how they differ when dealing with manycore programming, and the advantages each one offers.   Abstract manycore
Patrick Gelsinger, Intel VP  February 2001, San Francisco, CA 2001 IEEE International Solid-State Circuits Conference (ISSCC)  If scaling continues at present pace, by 2005, high speed processors would have power density of nuclear reactor, by 2010, a rocket nozzle, and by 2015, surface of sun. Intel stock dropped 8% on the next day “Business as usual will not work in the future.”
The Power Wall: CPU Clock Speed Manycore -> Multicore -> Single core -> From Katherine Yelick’s “Multicore: Fallout of a Hardware Revolution”
The Power Wall
In 1965, Gordon Moore predicted exponential growth in the number of transistors per chip, based on the trend from 1959 to 1965. This trend continues even today.
Clock frequencies continued to increase exponentially until they hit the power wall in 2004 at around 3 to 4 GHz:
  1971, Intel 4004 (first single-chip CPU) – 740 kHz
  1978, Intel 8086 (origin of x86) – 4.77 MHz
  1985, Intel 80386DX – 16 MHz
  1993, Pentium P5 – 66 MHz
  1998, Pentium II – 450 MHz
  2001, Pentium III (Tualatin) – 1.4 GHz
  2004, Pentium 4F – 3.6 GHz
  2008, Core i7 (Extreme) – 3.3 GHz
Effect of the Power Wall: Intel is now doubling cores, along with other improvements, to continue to scale.
Enter Manycore
Manycore, What is it? Manycore, Why should I care? Manycore, What do we do about it? Frameworks Task Parallel Library (Reactive Extensions and .NET 4) Languages, paradigms, and language extensions F#, functional programming, LINQ, PLINQ Tools Visual Studio 2010 Tools for Concurrency Agenda: Manycore Future
What is Manycore?
Manycore, What is it?
Single core: 1 processor on a chip die (1 socket)
  Many past consumer and server CPUs (and some current CPUs for lightweight, low-power devices)
  Including CPUs that support hyperthreading, but this is a grey area
Multicore: 2 to 8 cores per chip/socket
  AMD Athlon 64 X2 (first dual-core desktop CPU, released in 2005)
  Intel Core Duo, 2006 (32-bit, dual core, for laptops only)
    Core Solo was a dual-core chip with one core disabled
  Intel Core 2 (the name does not mean multicore; it is a brand for the 64-bit architecture)
    Core 2 Solo (1 core), Core 2 Duo (2 cores), Core 2 Quad (4 cores)
Manycore: more than 8 cores per chip
  Currently prototypes and R&D
Multicore trends from servers to gaming consoles
High-end Servers 2001-2004
  IBM Servers 2001 - IBM POWER4 PowerPC for AS/400 and RS/6000, “world's first non-embedded dual-core processor”
  Sun Servers 2004 - UltraSPARC IV, “first multicore SPARC processor”
Desktops/Laptops 2005-2006
  AMD Athlon 64 X2 (Manchester), May 2005, “first dual-core desktop CPU”
  Intel Core Duo, Jan 2006
  Intel Pentium (Allendale) dual core, Jan 2007
Windows Servers 2006
  Intel Xeon (Paxville) dual core, Dec 2005
  AMD Opteron (Denmark) dual core, March 2006
  Intel Itanium 2 (Montecito) dual core, July 2006
Sony PlayStation 3 – 2006
  9 core Cell Processor (only 8 operational) - Cell architecture jointly developed by Sony, Toshiba, and IBM
Power Mac G5 - Mid 2003 2 x 1 core (single core) IBM PowerPC 970 Mac Pro - Mid 2006 2 x 2 core (dual core) Intel Xeon (Woodcrest) Mac Pro - Early 2008 2 x 4 core (quad core) Intel Xeon (Harpertown) In 5 years number of cores doubled twice on Apple’s high end graphics workstation From 2 to 4 to 8 Macintosh multicore trend
The chip is just designed for research efforts at the moment, according to an Intel spokesperson. "There are no product plans for this chip. We will never sell it so there won't be a price for it," the Intel spokesperson noted in an e-mail. "We will give about a hundred or more to industry partners like Microsoft and academia to help us research software development and learn on a real piece of hardware, [of] which nothing of its kind exists today."  http://redmondmag.com/articles/2009/12/04/intel-unveils-48-core-cloud-computer-chip.aspx Microsoft said it had already put SCC into its development pipeline so it could exploit it in the future.  http://news.bbc.co.uk/2/hi/technology/8392392.stm 48 Core Single-chip Cloud Computer (SCC)
Why should I care? (about Manycore)
Hardware is changing Programming needs to change to take advantage of new hardware Concurrent Programming Paradigm Shift  Designing applications Developing applications Manycore, Why should I care?
“The computer industry is once again at a crossroads.  Hardware concurrency, in the form of new manycore processors, together with growing software complexity, will require that the technology industry fundamentally rethink both the architecture of modern computers and the resulting software development paradigms.” Craig Mundie, Chief Research and Strategy Officer, Microsoft Corporation, June 2008. First paragraph of the Foreword of Joe Duffy’s preeminent tome “Concurrent Programming on Windows” Concurrent Programming
Excerpt from Mark Reinhold’s Blog post: November 24, 2009 The free lunch is over.  Multicore processors are not just coming—they’re here.  Leveraging multiple cores requires writing scalable parallel programs, which is incredibly hard.  Tools such as fork/join frameworks based on work-stealing algorithms make the task easier, but it still takes a fair bit of expertise and tuning.  Bulk-data APIs such as parallel arrays allow computations to be expressed in terms of higher-level, SQL-like operations (e.g., filter, map, and reduce) which can be mapped automatically onto the fork-join paradigm.  Working with parallel arrays in Java, unfortunately, requires lots of boilerplate code to solve even simple problems.  Closures can eliminate that boilerplate.  “It’s time to add them to Java.” http://blogs.sun.com/mr/entry/closures “There’s not a moment to lose!”
Herb Sutter 2005 Programs are not doubling in speed every couple of years for free anymore We need to start writing code to take advantage of many cores Currently painful and problematic to take advantage of many cores because of shared memory, locking, and other imperative programming techniques “The Free Lunch Is Over”
Is this just hype? Another Y2K scare? Fact: CPU’s are changing Programmers will learn to exploit new architectures Will you be one of them? Wait and see? You could just wait and let the tools catch up so you don’t have to think about it.  Will that strategy work? Should you be concerned?
Just tools or frameworks will not solve the manycore problem alone Imperative programming by definition has limitations scaling in a parallel way Imperative programming (C, C++, VB, Java, C#) Requires locks and synchronization code to handle shared memory read/write transactions  Not trivial Difficult to debug Tools and frameworks may help, but will require different approach to the problem (a different paradigm)  to really take advantage of the tools The Core Problem
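The locking problem described above can be made concrete with a minimal C# sketch. The class and method names here are illustrative assumptions, not code from the talk. It contrasts an unsynchronized shared counter, which can lose updates, with a locked one, which is correct but funnels every thread through a single critical section.

```csharp
using System;
using System.Threading.Tasks;

public static class SharedCounterDemo
{
    // Unsynchronized: counter++ is a read-modify-write sequence, so
    // concurrent increments can overwrite each other and lose updates.
    public static int CountUnsafe(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, i => { counter++; });
        return counter;
    }

    // Synchronized: the lock makes each increment atomic, at the cost of
    // serializing all threads through one critical section.
    public static int CountWithLock(int iterations)
    {
        int counter = 0;
        object gate = new object();
        Parallel.For(0, iterations, i =>
        {
            lock (gate) { counter++; }
        });
        return counter;
    }

    public static void Main()
    {
        const int n = 1000000;
        // May print less than n on a multicore machine.
        Console.WriteLine("unsafe: " + CountUnsafe(n));
        // Always prints n.
        Console.WriteLine("locked: " + CountWithLock(n));
    }
}
```

The locked version is correct, but every iteration now contends for the same lock, which is the scaling limitation the slide attributes to shared-memory imperative code.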
Some frameworks are designed to be single threaded, such as ASP.NET Best practices for ASP.NET applications recommend avoiding spawning new threads ASP.NET and IIS handle the multithreading and multiprocessing to take advantage of the many processors (and now many cores) on Web Servers and Application Servers Will this best practice remain true? Even when server CPU’s have hundreds or thousands of cores? Will it affect all programmers?
What do we do about it? (How do we prepare for Manycore)
Identify where the dependencies are Identify where you can parallelize Understand the tools, techniques, and approaches for solving the pieces Put them together to understand overall performance POC – Proof of Concept Test, test, test Performance goals up front Understand Problem Domain
Frameworks Task Parallel Library (TPL) Reactive Extensions for .NET 3.5 (Rx) Used to be called Parallel Extensions or PFx Baked into .NET 4 Programming paradigms, languages, and language extensions Functional programming F# LINQ and PLINQ Tools Visual Studio 2010 Tools for Concurrency Manycore, What do we do about it?
Parallelism vs. Concurrency Task vs. Data Parallelism Parallel Programming Concepts
Concurrency or Concurrent computing Many independent requests Web Server, works on multi-threaded single core CPU Separate processes that may be executed in parallel More general than parallelism Parallelism or Parallel computing Processes are executed in parallel simultaneously Only possible with multiple processors or multiple cores Yuan Lin: compares to black and white photography vs. color, one is not a superset of the other http://www.touchdreams.net/blog/2008/12/21/more-on-concurrency-vs-parallelism/ Parallelism vs. Concurrency
Task Parallelism (aka function parallelism and control parallelism) Distributing execution processes (threads/functions/tasks) across different parallel computing nodes (cores) http://msdn.microsoft.com/en-us/library/dd537609(VS.100).aspx Data Parallelism (aka loop-level parallelism) Distributing data across different parallel computing nodes (cores) Executing same command over every element in a data structure http://msdn.microsoft.com/en-us/library/dd537608(VS.100).aspx Task vs. Data Parallelism See MSDN for .NET 4, Parallel Programming, Data/Task Parallelism
Task Parallel Library
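As a rough illustration of the distinction, the following C# sketch (hypothetical class and method names, assuming .NET 4's `System.Threading.Tasks`) runs two different operations side by side for task parallelism, and one operation over every element for data parallelism.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelismKinds
{
    // Task parallelism: two different, independent operations run
    // concurrently. Each lambda writes only its own variable, so no lock
    // is needed. Parallel.Invoke returns once both have finished.
    public static int[] SumAndMax(int[] data)
    {
        int sum = 0;
        int max = int.MinValue;
        Parallel.Invoke(
            () => { foreach (int x in data) sum += x; },
            () => { foreach (int x in data) if (x > max) max = x; });
        return new[] { sum, max };
    }

    // Data parallelism: the same operation is applied to every element.
    // Each iteration writes a distinct slot of the result array.
    public static int[] SquareAll(int[] data)
    {
        int[] result = new int[data.Length];
        Parallel.For(0, data.Length, i => { result[i] = data[i] * data[i]; });
        return result;
    }

    public static void Main()
    {
        int[] data = { 3, 1, 4, 1, 5 };
        int[] sumAndMax = SumAndMax(data);
        Console.WriteLine("sum=" + sumAndMax[0] + " max=" + sumAndMax[1]);
        Console.WriteLine(string.Join(",", SquareAll(data)));
    }
}
```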
Parallel Programming in the .NET Framework 4 Beta 2 - TPL
How to use the TPL
Reference System.Threading
Use Visual Studio 2010 or .NET 4
For Visual Studio 2008: download the unsupported version for .NET 3.5 SP1 from Reactive Extensions for .NET (Rx) http://msdn.microsoft.com/en-us/devlabs/ee794896.aspx
Create a “Task”
    FileStream fs = new FileStream(fileName, FileMode.CreateNew);
    var task = Task.Factory.FromAsync(fs.BeginWrite, fs.EndWrite, bytes, 0, bytes.Length, null);
Task Parallelism with the TPL
Use Task class
    // Create a task and supply a user delegate
    // by using a lambda expression.
    var taskA = new Task(() => Console.WriteLine("Hello from taskA."));
    // Start the task.
    taskA.Start();
    // Output a message from the calling thread.
    Console.WriteLine("Hello from the calling thread.");
Getting return value from a Task
Task<TResult>
    Task<double>[] taskArray = new Task<double>[]
    {
        Task<double>.Factory.StartNew(() => DoComputation1()),
        // May be written more conveniently like this:
        Task.Factory.StartNew(() => DoComputation2()),
        Task.Factory.StartNew(() => DoComputation3())
    };
    double[] results = new double[taskArray.Length];
    for (int i = 0; i < taskArray.Length; i++)
        results[i] = taskArray[i].Result;
Task resembles new thread or ThreadPool work item, but higher level of abstraction Tasks provide two primary benefits over Threads:  More efficient and scalable use of system resources More programmatic control than is possible with a thread or work item Tasks vs. Threads
Behind the scenes, tasks are queued to the ThreadPool. The ThreadPool is now enhanced with algorithms (like hill-climbing) that determine and adjust to the number of threads that maximizes throughput.  Tasks are relatively lightweight: you can create many of them to enable fine-grained parallelism.  To complement this, widely-known work-stealing algorithms are employed to provide load-balancing. Tasks and the framework built around them provide a rich set of APIs that support waiting, cancellation, continuations, robust exception handling, detailed status, custom scheduling, and more. Tasks
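Two of the APIs mentioned above, continuations and exception handling, can be sketched briefly in C#. The class and method names are illustrative assumptions; the `Task` members used (`ContinueWith`, `Wait`, `Result`) and `AggregateException` are part of .NET 4.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskFeatures
{
    // Continuation: runs after the antecedent task completes and
    // consumes its result.
    public static int SquareThenDouble(int input)
    {
        Task<int> square = Task.Factory.StartNew(() => input * input);
        Task<int> doubled = square.ContinueWith(t => t.Result * 2);
        return doubled.Result; // blocks until the continuation finishes
    }

    // Exception handling: an exception thrown inside a task surfaces as
    // an AggregateException when the task is waited on.
    public static string RunFaultyTask()
    {
        Action work = () => { throw new InvalidOperationException("boom"); };
        Task faulty = Task.Factory.StartNew(work);
        try
        {
            faulty.Wait();
            return "no error";
        }
        catch (AggregateException ex)
        {
            return ex.InnerException.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SquareThenDouble(3)); // 18
        Console.WriteLine(RunFaultyTask());     // boom
    }
}
```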
Data Parallelism with the TPL
Instead of:
    for (int i = 0; i < matARows; i++) {
        for (int j = 0; j < matBCols; j++) {
            ...
        }
    }
Use:
    Parallel.For(0, matARows, i => {
        for (int j = 0; j < matBCols; j++) {
            ...
        }
    }); // Parallel.For
Use Tasks not Threads Use Parallel.For in Data Parallelism scenarios Or… Use asynchronous workflows from F#, covered later Use PLINQ, covered later TPL Summary
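The elided loop body can be filled in as a complete matrix multiplication. This is a minimal sketch assuming rectangular `double[,]` matrices; the variable names follow the slide, while the class and method names are illustrative. Each outer iteration writes only its own row of the result, so no locking is needed.

```csharp
using System;
using System.Threading.Tasks;

public static class MatrixMultiply
{
    // Multiplies a (matARows x matACols) by b (matACols x matBCols),
    // parallelizing over the rows of the result.
    public static double[,] Multiply(double[,] a, double[,] b)
    {
        int matARows = a.GetLength(0);
        int matACols = a.GetLength(1);
        int matBCols = b.GetLength(1);
        double[,] result = new double[matARows, matBCols];

        Parallel.For(0, matARows, i =>
        {
            for (int j = 0; j < matBCols; j++)
            {
                double temp = 0;
                for (int k = 0; k < matACols; k++)
                    temp += a[i, k] * b[k, j];
                result[i, j] = temp;
            }
        });
        return result;
    }

    public static void Main()
    {
        double[,] a = { { 1, 2 }, { 3, 4 } };
        double[,] b = { { 5, 6 }, { 7, 8 } };
        double[,] c = Multiply(a, b);
        Console.WriteLine(c[0, 0] + " " + c[0, 1]); // 19 22
        Console.WriteLine(c[1, 0] + " " + c[1, 1]); // 43 50
    }
}
```

Only the outer loop is parallelized here. For small matrices the scheduling overhead can outweigh the gain, so it is worth measuring before committing to it.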
Functional Programming
1930’s: lambda calculus (roots) 1956: IPL (Information Processing Language) “the first functional language” 1958: LISP “a functional flavored language” 1962: APL (A Programming Language) 1973: ML (Meta Language) 1983: SML (Standard ML) 1987: Caml (Categorical Abstract Machine Language ) and Haskell 1996: OCaml (Objective Caml) 2005: F# introduced to public by Microsoft Research 2010: F# is “productized” in the form of Visual Studio 2010 Functional programming has been around a long time (over 50 years)
Most functional languages encourage programmers to avoid side effects Haskell (a “pure” functional language) restricts side effects with a static type system A side effect Modifies some state Has observable interaction with calling functions  Has observable interaction with the outside world Example: a function or method with no return value Functional programming is safe
Language Evolution (Simon Peyton Jones) C#, VB, Java, C are imperative programming languages.  Very useful, but they can change the state of the world at any time, creating side effects. Nirvana! Useful and Safe F# Haskell is Very Safe, but not very useful.  Used heavily in research and academia, but rarely in business. http://channel9.msdn.com/posts/Charles/Simon-Peyton-Jones-Towards-a-Programming-Language-Nirvana/
When a function changes the state of the program Write to a file (that may be read later) Write to the screen Changing values of variables in memory (global variables or object state) Side Effect
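A small C# sketch of the difference, with illustrative names that are not from the talk: the first method has a side effect because it changes shared state, so calling it twice with the same argument gives different answers. The second is pure and can safely be called from any number of threads.

```csharp
using System;

public static class SideEffects
{
    private static int total = 0;

    // Side effect: modifies static state, so the result depends on
    // every earlier call, not just on the argument.
    public static int AddToTotal(int x)
    {
        total += x;
        return total;
    }

    // Pure: the result depends only on the arguments.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    public static void Main()
    {
        Console.WriteLine(Add(2, 3));    // 5
        Console.WriteLine(Add(2, 3));    // 5 again
        Console.WriteLine(AddToTotal(5)); // 5
        Console.WriteLine(AddToTotal(5)); // 10: same input, different output
    }
}
```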
Compare SQL to your favorite imperative programming language If you write a statement to store and query your data, you don’t need to specify how the system will need to store the data at a low level Example: table partitioning LINQ is an example of bringing functional programming to C# and VB through language extensions Functional Programming
Use lots of processes Avoid side effects Avoid sequential bottlenecks Write “small messages, big computations” code Efficient Multicore Programming Source: Joe Armstrong’s “Programming Erlang, Software for a Concurrent World” Section 20.1 “How to Make Programs Run Efficiently on a Multicore CPU”
F#
What is F#
Functional language developed by Microsoft Research
  By Don Syme and his team, who productized Generics
  Based on OCaml (influenced by C# and Haskell)
History
  2002: F# language design started
  2005 January: F# 1.0.1 released to the public
    Not a product.  Integration with VS2003
    Works in .NET 1.0 through .NET 2.0 beta, Mono
  2005 November: F# 1.1.5 with VS 2005 RTM support
  2009 October: VS2010 Beta 2, CTP for VS2008 & Non-Windows users
  2010: F# is “productized” and baked into VS 2010
Multi-Paradigm Functional Programming Imperative Programming Object Oriented Programming Language Oriented Programming F# is not just Functional
Parallel Computing and PDC09 [diagram of the parallel computing stack; key: Research / Incubation vs. Visual Studio 2010 / .NET 4. Labels shown: Tools, Managed Languages, Axum, Visual F#, Visual Studio 2010, Parallel Debugger, Windows, Native Libraries, Managed Libraries, DryadLINQ, Async Agents Library, Parallel Pattern Library, Profiler, Concurrency Analysis, Parallel LINQ, Rx, Task Parallel Library, Data Structures, Microsoft Research, Native Concurrency Runtime, Task Scheduler, Race Detection, Managed Concurrency Runtime, Resource Manager, ThreadPool, Fuzzing, Operating System, Threads, UMS Threads, HPC Server, Windows 7 / Server 2008 R2]
Functional programming has been around a long time Not new Long history Functional programming is safe A concern as we head toward manycore and cloud computing Functional programming is on the rise Why another language?
Parallel Programming in the .NET Framework 4 Beta 2 - PLINQ
“F# is, technically speaking, neutral with respect to concurrency - it allows the programmer to exploit the many different techniques for concurrency and distribution supported by the .NET platform”  F# FAQ:  http://bit.ly/FSharpFAQ Functional programming is a primary technique for minimizing/isolating mutable state Asynchronous workflows make writing parallel programs in a “natural and compositional style” F# and Multi-Core Programming
Interactive Scripting Good for prototyping Succinct = Less code Type Inference Strongly typed, strict (no dynamic typing) Automatic generalization (generics for free) Few type annotations 1st class functions (currying, lazy evaluations) Pattern matching Key Characteristics of F#
Concurrent Programming with F#
Luke Hoban at PDC 2009, F# Program Manager http://microsoftpdc.com/Sessions/FT20
Demo – Imperative sumOfSquares
Difficult to turn existing sequential code into parallel code Must modify large portions of code to use threads explicitly Using shared state and locks is difficult Careful to avoid race conditions and deadlocks Two Problems Parallelizing Imperative Code http://www.manning.com/petricek/petricek_meapch1.pdf
Demo – Recursive sumOfSquares
Declarative programming style	 Easier to introduce parallelism into existing code Immutability by default Can’t introduce race conditions Easier to write lock-free code Functional Programming
Demo – Functional sumOfSquares
From Seq to PSeq Matthew Podwysocki’s Blog http://weblogs.asp.net/podwysocki/archive/2009/02/23/adding-parallel-extensions-to-f.aspx Adding Parallel Extensions to F# for VS2010 Beta 2 Talbott Crowell’s Developer Blog http://talbottc.spaces.live.com/blog/cns!A6E0DA836D488CA6!396.entry Parallel Extensions to F#
Demo – Parallel sumOfSquares
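The demo code is not captured in this transcript. As a rough C# analogue (the talk's demos were in F#; every name below is an assumption), the three styles of sumOfSquares might look like the following: an imperative loop with a mutable accumulator, a declarative LINQ pipeline, and the same pipeline parallelized with `AsParallel()`.

```csharp
using System;
using System.Linq;

public static class SumOfSquares
{
    // Imperative: a loop mutating an accumulator.
    public static long Imperative(int[] nums)
    {
        long acc = 0;
        foreach (int n in nums)
            acc += (long)n * n;
        return acc;
    }

    // Functional: a declarative pipeline with no mutable state.
    public static long Functional(int[] nums)
    {
        return nums.Select(n => (long)n * n).Sum();
    }

    // Parallel: because the pipeline has no shared mutable state,
    // adding AsParallel() is the only change required.
    public static long ParallelFunctional(int[] nums)
    {
        return nums.AsParallel().Select(n => (long)n * n).Sum();
    }

    public static void Main()
    {
        int[] nums = Enumerable.Range(1, 1000).ToArray();
        Console.WriteLine(Imperative(nums));         // 333833500
        Console.WriteLine(Functional(nums));         // 333833500
        Console.WriteLine(ParallelFunctional(nums)); // 333833500
    }
}
```

Turning the imperative version into parallel code would mean restructuring the loop and protecting `acc`. The functional version changes by one method call, which mirrors the step from Seq to PSeq in F#.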
F# Parallel Programming Options
Asynchronous Workflows
Control.MailboxProcessor
Task Based Programming using TPL
Reactive Extensions
  “The Reactive Extensions can be used from any .NET language.  In F#, .NET events are first-class values that implement the IObservable<out T> interface.  In addition, F# provides a basic set of functions for composing observable collections and F# developers can leverage Rx to get a richer set of operators for composing events and other observable collections.” S. Somasegar, Senior Vice President, Developer Division   http://blogs.msdn.com/somasegar/archive/2009/11/18/reactive-extensions-for-net-rx.aspx
Demo of Image Processor
Problem: resize a ton of images
    let files = Directory.GetFiles(@"C:magesriginal")
    for file in files do
        use image = Image.FromFile(file)
        use smallImage = ResizeImage(image)
        let destFileName = DestFileName("s1", file)
        smallImage.Save(destFileName)
Asynchronous Workflows
    let FetchAsync(file:string) =
        async {
            use stream = File.OpenRead(file)
            let! bytes = stream.AsyncRead(int stream.Length)
            use memstream = new MemoryStream(bytes.Length)
            memstream.Write(bytes, 0, bytes.Length)
            use image = Image.FromStream(memstream)
            use smallImage = ResizeImage(image)
            let destFileName = DestFileName("s2", file)
            smallImage.Save(destFileName)
        }
    let tasks = [for file in files -> FetchAsync(file)]
    let parallelTasks = Async.Parallel tasks
    Async.RunSynchronously parallelTasks
Tomas Petricek, Using Asynchronous Workflows http://tomasp.net/blog/fsharp-webcast-async.aspx
LINQ Language-Integrated Query
LINQ
LINQ: declaratively specify what you want done, not how you want it done
    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source
                   where Compute(num) > 0
                   select num;
Versus:
    var source = Enumerable.Range(1, 10000);
    var evenNums = new List<int>();
    foreach (var num in source)
        if (Compute(num) > 0)
            evenNums.Add(num);
What happens if I put a counter in Compute(num)?
    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source
                   where Compute(num) > 0
                   select num;
    private static int Compute(int num)
    {
        counter++;
        if (num % 2 == 0) return 1;
        return 0;
    }
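One likely answer to the question above is that nothing happens until the query is enumerated, because LINQ queries execute lazily. The sketch below wraps the slide's code in a runnable class to show this; the class name and the `Run` method are illustrative assumptions.

```csharp
using System;
using System.Linq;

public static class DeferredExecution
{
    private static int counter = 0;

    private static int Compute(int num)
    {
        counter++;
        if (num % 2 == 0) return 1;
        return 0;
    }

    // Returns the counter before enumeration, the number of matches,
    // and the counter after the first and second enumerations.
    public static int[] Run()
    {
        counter = 0;
        var source = Enumerable.Range(1, 10000);
        var evenNums = from num in source
                       where Compute(num) > 0
                       select num;

        int before = counter;         // 0: defining the query runs nothing
        int evens = evenNums.Count(); // 5000: enumerating calls Compute 10000 times
        int afterFirst = counter;     // 10000
        evenNums.Count();             // enumerating again re-runs the whole query
        int afterSecond = counter;    // 20000
        return new[] { before, evens, afterFirst, afterSecond };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // 0,5000,10000,20000
    }
}
```

The counter is itself a side effect, so the observed value depends on when and how often the query is enumerated. That is harmless in sequential code and becomes a data race once the query runs in parallel.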
PLINQ (Parallel LINQ)
Parallel Programming in the .NET Framework 4 Beta 2 - PLINQ
PLINQ = Parallel LINQ
LINQ: declaratively specify what you want done, not how you want it done
    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source
                   where Compute(num) > 0
                   select num;
PLINQ: declaratively specify “As Parallel”. Under the hood, the framework will implement “the how” using TPL and threads.
    var source = Enumerable.Range(1, 10000);
    var evenNums = from num in source.AsParallel()
                   where Compute(num) > 0
                   select num;
System.Linq.ParallelEnumerable AsParallel() The entry point for PLINQ. Specifies that the rest of the query should be parallelized, if it is possible.
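Once `AsParallel()` is added, `Compute` may run on several threads at once, so a plain `counter++` inside it becomes a race. A minimal sketch of one way to keep the counter correct, assuming `Interlocked.Increment` is acceptable for the workload (the class and member names are illustrative); `AsOrdered()` is included because PLINQ does not preserve source order by default.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class PlinqCounter
{
    private static int counter = 0;

    public static int Calls { get { return counter; } }

    private static int Compute(int num)
    {
        // Atomic increment: safe when called from multiple threads.
        Interlocked.Increment(ref counter);
        return num % 2 == 0 ? 1 : 0;
    }

    public static int[] EvenNumbers(int count)
    {
        var source = Enumerable.Range(1, count);
        var evenNums = from num in source.AsParallel().AsOrdered()
                       where Compute(num) > 0
                       select num;
        return evenNums.ToArray(); // ToArray forces the query to execute
    }

    public static void Main()
    {
        int[] evens = EvenNumbers(10000);
        Console.WriteLine(evens.Length); // 5000
        Console.WriteLine(Calls);        // 10000
    }
}
```

A query whose functions have no side effects at all needs neither the interlocked counter nor any other synchronization, which is the argument the talk makes for the functional style.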
Visual Studio 2010 Tools for Concurrency
Stephen Toub at PDC 2009, Senior Program Manager on the Parallel Computing Platform  http://microsoftpdc.com/Sessions/P09-09
Views enable you to see how your multi-threaded application interacts with  Itself Hardware Operating System Other processes on the host computer Provides graphical, tabular and textual data Shows the temporal relationships between  the threads in your program the system as a whole Concurrency Visualizer in Visual Studio 2010
Performance bottlenecks CPU underutilization Thread contention Thread migration Synchronization delays Areas of overlapped I/O and other info… Use Concurrency Visualizer to Locate
Concurrency Visualizer High level of Contentions during Async
CPU View Threads View (Parallel Performance) Cores View Views
CPU View
Sync uses 1 CPU/core
Full test Close up of Sync Close up of Async Core View
Tomas Petricek - F# Webcast (III.) - Using Asynchronous Workflows http://tomasp.net/blog/fsharp-webcast-async.aspx Luke Hoban - F# for Parallel and Asynchronous Programming http://microsoftpdc.com/Sessions/FT20 More info on Asynchronous Workflows
The Landscape of Parallel Computing Research: A View from Berkeley 2.0 by David Patterson http://science.officeisp.net/ManycoreComputingWorkshop07/Presentations/David%20Patterson.pdf Parallel Dwarfs http://paralleldwarfs.codeplex.com/ More Research
The Architect
“The architect as we know him today is a product of the Renaissance.” (1)
“But the medieval architect was a master craftsman (usually a mason or a carpenter by trade), one who could build as well as design, or at least ‘one trained in that craft even if he had ceased to ply his axe and chisel’ (2).” (1)
“Not only is he hands on, like the agile architect, but we also learn from Arnold that the great Gothic cathedrals of Europe were built, not with BDUF, but with ENUF”
(1). Dana Arnold, Reading Architectural History, 2002
(2). D. Knoop & G. P. Jones, The Medieval Mason, 1933
(3). Architects: Back to the future?, Ian Cooper 2008 http://codebetter.com/blogs/ian_cooper/archive/2008/01/02/architects-back-to-the-future.aspx
Visit us at http://fsug.org Thank you. Questions? Architecting Solutions for the Manycore Future, Talbott Crowell, ThirdM.com http://talbottc.spaces.live.com Twitter: @Talbott and @fsug

More Related Content

What's hot

Deep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with TheanoDeep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with Theano
Vincenzo Lomonaco
 
Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel
Intel® Software
 
GPU Ecosystem
GPU EcosystemGPU Ecosystem
GPU Ecosystem
Ofer Rosenberg
 
A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...ChangWoo Min
 
Newbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeNewbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universe
Ofer Rosenberg
 
Intel python 2017
Intel python 2017Intel python 2017
Intel python 2017
DESMOND YUEN
 
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU Continuum
Ofer Rosenberg
 
Matrix Multiplication with Ateji PX for Java
Matrix Multiplication with Ateji PX for JavaMatrix Multiplication with Ateji PX for Java
Matrix Multiplication with Ateji PX for Java
Patrick Viry
 
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
Edge AI and Vision Alliance
 
"Efficient Implementation of Convolutional Neural Networks using OpenCL on FP...
"Efficient Implementation of Convolutional Neural Networks using OpenCL on FP..."Efficient Implementation of Convolutional Neural Networks using OpenCL on FP...
"Efficient Implementation of Convolutional Neural Networks using OpenCL on FP...
Edge AI and Vision Alliance
 
Android and Deep Learning
Android and Deep LearningAndroid and Deep Learning
Android and Deep Learning
Oswald Campesato
 

What's hot (11)

Deep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with TheanoDeep Learning libraries and first experiments with Theano
Deep Learning libraries and first experiments with Theano
 
Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel
 
GPU Ecosystem
GPU EcosystemGPU Ecosystem
GPU Ecosystem
 
A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...A Survey on in-a-box parallel computing and its implications on system softwa...
A Survey on in-a-box parallel computing and its implications on system softwa...
 
Newbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universeNewbie’s guide to_the_gpgpu_universe
Newbie’s guide to_the_gpgpu_universe
 
Intel python 2017
Intel python 2017Intel python 2017
Intel python 2017
 
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU Continuum
 
Matrix Multiplication with Ateji PX for Java
Matrix Multiplication with Ateji PX for JavaMatrix Multiplication with Ateji PX for Java
Matrix Multiplication with Ateji PX for Java
 
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
 
"Efficient Implementation of Convolutional Neural Networks using OpenCL on FP...
"Efficient Implementation of Convolutional Neural Networks using OpenCL on FP..."Efficient Implementation of Convolutional Neural Networks using OpenCL on FP...
"Efficient Implementation of Convolutional Neural Networks using OpenCL on FP...
 
Android and Deep Learning
Android and Deep LearningAndroid and Deep Learning
Android and Deep Learning
 

Similar to Architecting Solutions for the Manycore Future

Software and the Concurrency Revolution : Notes
Software and the Concurrency Revolution : NotesSoftware and the Concurrency Revolution : Notes
Software and the Concurrency Revolution : Notes
Subhajit Sahu
 
Japan's post K Computer
Japan's post K ComputerJapan's post K Computer
Japan's post K Computer
inside-BigData.com
 
Parallelism Processor Design
Parallelism Processor DesignParallelism Processor Design
Parallelism Processor DesignSri Prasanna
 
End of a trend
End of a trendEnd of a trend
End of a trend
mml2000
 
Parallel universe-issue-29
Parallel universe-issue-29Parallel universe-issue-29
Parallel universe-issue-29
DESMOND YUEN
 
Why you should use the Yocto Project
Why you should use the Yocto ProjectWhy you should use the Yocto Project
Why you should use the Yocto Project
rossburton
 
Infrastructure student
Infrastructure studentInfrastructure student
Infrastructure student
John Scrugham
 
Thinking in parallel ab tuladev
Thinking in parallel ab tuladevThinking in parallel ab tuladev
Thinking in parallel ab tuladev
Pavel Tsukanov
 
O futuro do .NET : O que eu preciso saber
O futuro do .NET : O que eu preciso saberO futuro do .NET : O que eu preciso saber
O futuro do .NET : O que eu preciso saber
Danilo Bordini
 
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
enriquealbabaena6868
 
NWU and HPC
NWU and HPCNWU and HPC
NWU and HPC
Wilhelm van Belkum
 
Webinaron muticoreprocessors
Webinaron muticoreprocessorsWebinaron muticoreprocessors
Webinaron muticoreprocessors
Nagasuri Bala Venkateswarlu
 
01 intro-bps-2011
01 intro-bps-201101 intro-bps-2011
01 intro-bps-2011
mistercteam
 
LAS16-108: JerryScript and other scripting languages for IoT
LAS16-108: JerryScript and other scripting languages for IoTLAS16-108: JerryScript and other scripting languages for IoT
LAS16-108: JerryScript and other scripting languages for IoT
Linaro
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel Product
Tyrone Systems
 

Similar to Architecting Solutions for the Manycore Future (20)

Os Lamothe
Os LamotheOs Lamothe
Os Lamothe
 
Software and the Concurrency Revolution : Notes
Software and the Concurrency Revolution : NotesSoftware and the Concurrency Revolution : Notes
Software and the Concurrency Revolution : Notes
 
Japan's post K Computer
Japan's post K ComputerJapan's post K Computer
Japan's post K Computer
 
Parallelism Processor Design
Parallelism Processor DesignParallelism Processor Design
Parallelism Processor Design
 
LPC4300_two_cores
LPC4300_two_coresLPC4300_two_cores
LPC4300_two_cores
 
Embedded systems
Embedded systemsEmbedded systems
Embedded systems
 
End of a trend
End of a trendEnd of a trend
End of a trend
 
Parallel universe-issue-29
Parallel universe-issue-29Parallel universe-issue-29
Parallel universe-issue-29
 
Why you should use the Yocto Project
Why you should use the Yocto ProjectWhy you should use the Yocto Project
Why you should use the Yocto Project
 
Infrastructure student
Infrastructure studentInfrastructure student
Infrastructure student
 
Thinking in parallel ab tuladev
Thinking in parallel ab tuladevThinking in parallel ab tuladev
Thinking in parallel ab tuladev
 
Embedded systemppt2343
Embedded systemppt2343Embedded systemppt2343
Embedded systemppt2343
 
O futuro do .NET : O que eu preciso saber
O futuro do .NET : O que eu preciso saberO futuro do .NET : O que eu preciso saber
O futuro do .NET : O que eu preciso saber
 
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
1.1. SOC AND MULTICORE ARCHITECTURES FOR EMBEDDED SYSTEMS (2).pdf
 
Clustering
ClusteringClustering
Clustering
 
NWU and HPC
NWU and HPCNWU and HPC
NWU and HPC
 
Webinaron muticoreprocessors
Webinaron muticoreprocessorsWebinaron muticoreprocessors
Webinaron muticoreprocessors
 
01 intro-bps-2011
01 intro-bps-201101 intro-bps-2011
01 intro-bps-2011
 
LAS16-108: JerryScript and other scripting languages for IoT
LAS16-108: JerryScript and other scripting languages for IoTLAS16-108: JerryScript and other scripting languages for IoT
LAS16-108: JerryScript and other scripting languages for IoT
 
oneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel ProductoneAPI: Industry Initiative & Intel Product
oneAPI: Industry Initiative & Intel Product
 

Architecting Solutions for the Manycore Future

  • 1. Architecting Solutions for the Manycore Future Talbott Crowell ThirdM
  • 2. This talk will focus solution architects toward thinking about parallelism when designing applications and solutions Threads vs. Tasks using TPL LINQ vs. PLINQ Object Oriented vs. Functional Programming This talk will also compare programming languages, how languages differ when dealing with manycore programming, and the different advantages to these languages. Abstract manycore
  • 3. Patrick Gelsinger, Intel VP February 2001, San Francisco, CA 2001 IEEE International Solid-State Circuits Conference (ISSCC) If scaling continues at present pace, by 2005, high speed processors would have power density of nuclear reactor, by 2010, a rocket nozzle, and by 2015, surface of sun. Intel stock dropped 8% on the next day “Business as usual will not work in the future.”
  • 4. The Power Wall: CPU Clock Speed Manycore -> Multicore -> Single core -> From Katherine Yelick’s “Multicore: Fallout of a Hardware Revolution”
  • 5. In 1965, Gordon Moore predicted exponential growth in number of transistors per chip based on the trend from 1959 to 1965 Clock frequencies continued to increase exponentially until they hit the power wall in 2004 at around 3 to 4 GHz 1971, Intel 4004 (first single-chip CPU) – 740 kHz 1978, Intel 8086 (origin of x86) – 4.77 MHz 1985, Intel 80386DX – 16 MHz 1993, Pentium P5 – 66 MHz 1998, Pentium II – 450 MHz 2001, Pentium III (Tualatin) – 1.4 GHz 2004, Pentium 4F – 3.6 GHz 2008, Core i7 (Extreme) – 3.3 GHz Intel is now doubling cores along with other improvements to continue to scale Effect of the Power Wall This trend continues even today The Power Wall Enter Manycore
  • 6. Manycore, What is it? Manycore, Why should I care? Manycore, What do we do about it? Frameworks Task Parallel Library (Reactive Extensions and .NET 4) Languages, paradigms, and language extensions F#, functional programming, LINQ, PLINQ Tools Visual Studio 2010 Tools for Concurrency Agenda: Manycore Future
  • 8. Single core: 1 processor on a chip die (1 socket) Many past consumer and server CPU’s (some current CPU’s for lightweight low power devices) Including CPU’s that support hyperthreading, but this is a grey area Multicore: 2 to 8 core processors per chip/socket AMD Athlon 64 X2 (first dual-core desktop CPU released in 2005) Intel Core Duo, 2006 (32 bit, dual core, for laptops only) Core Solo was a dual core chip with one core disabled Intel Core 2 (not multicore, instead a brand for 64 bit arch) Core 2 Solo (1 core) Core 2 Duo (2 cores) Core 2 Quad (4 cores) Manycore: more than 8 cores per chip Currently prototypes and R&D Manycore, What is it?
  • 9. High-end Servers 2001-2004 IBM Servers 2001 - IBM POWER4 PowerPC for AS/400 and RS/6000 “world's first non-embedded dual-core processor” Sun Servers 2004 - UltraSPARC IV – “first multicore SPARC processor” Desktops/Laptops 2005-2006 AMD Athlon 64 X2 (Manchester) May 2005 “first dual-core desktop CPU” Intel Core Duo, Jan 2006 Intel Pentium (Allendale) dual core Jan 2007 Windows Servers 2006 Intel Xeon (Paxville) dual core Dec 2005 AMD Opteron (Denmark) dual core March 2006 Intel Itanium 2 (Montecito) dual core July 2006 Sony PlayStation 3 – 2006 9 core Cell Processor (only 8 operational) - Cell architecture jointly developed by Sony, Toshiba, and IBM Multicore trends from servers to gaming consoles
  • 10. Power Mac G5 - Mid 2003 2 x 1 core (single core) IBM PowerPC 970 Mac Pro - Mid 2006 2 x 2 core (dual core) Intel Xeon (Woodcrest) Mac Pro - Early 2008 2 x 4 core (quad core) Intel Xeon (Harpertown) In 5 years number of cores doubled twice on Apple’s high end graphics workstation From 2 to 4 to 8 Macintosh multicore trend
  • 11. The chip is just designed for research efforts at the moment, according to an Intel spokesperson. "There are no product plans for this chip. We will never sell it so there won't be a price for it," the Intel spokesperson noted in an e-mail. "We will give about a hundred or more to industry partners like Microsoft and academia to help us research software development and learn on a real piece of hardware, [of] which nothing of its kind exists today." http://redmondmag.com/articles/2009/12/04/intel-unveils-48-core-cloud-computer-chip.aspx Microsoft said it had already put SCC into its development pipeline so it could exploit it in the future. http://news.bbc.co.uk/2/hi/technology/8392392.stm 48 Core Single-chip Cloud Computer (SCC)
  • 12. Why should I care? (about Manycore)
  • 13. Hardware is changing Programming needs to change to take advantage of new hardware Concurrent Programming Paradigm Shift Designing applications Developing applications Manycore, Why should I care?
  • 14. “The computer industry is once again at a crossroads. Hardware concurrency, in the form of new manycore processors, together with growing software complexity, will require that the technology industry fundamentally rethink both the architecture of modern computers and the resulting software development paradigms.” Craig Mundie, Chief Research and Strategy Officer, Microsoft Corporation, June 2008 First paragraph of the Foreword of Joe Duffy’s preeminent tome “Concurrent Programming on Windows” Concurrent Programming
  • 15. Excerpt from Mark Reinhold’s Blog post: November 24, 2009 The free lunch is over. Multicore processors are not just coming—they’re here. Leveraging multiple cores requires writing scalable parallel programs, which is incredibly hard. Tools such as fork/join frameworks based on work-stealing algorithms make the task easier, but it still takes a fair bit of expertise and tuning. Bulk-data APIs such as parallel arrays allow computations to be expressed in terms of higher-level, SQL-like operations (e.g., filter, map, and reduce) which can be mapped automatically onto the fork-join paradigm. Working with parallel arrays in Java, unfortunately, requires lots of boilerplate code to solve even simple problems. Closures can eliminate that boilerplate. “It’s time to add them to Java.” http://blogs.sun.com/mr/entry/closures “There’s not a moment to lose!”
  • 16. Herb Sutter 2005 Programs are not doubling in speed every couple of years for free anymore We need to start writing code to take advantage of many cores Currently painful and problematic to take advantage of many cores because of shared memory, locking, and other imperative programming techniques “The Free Lunch Is Over”
  • 17. Is this just hype? Another Y2K scare? Fact: CPU’s are changing Programmers will learn to exploit new architectures Will you be one of them? Wait and see? You could just wait and let the tools catch up so you don’t have to think about it. Will that strategy work? Should you be concerned?
  • 18. Just tools or frameworks will not solve the manycore problem alone Imperative programming by definition has limitations scaling in a parallel way Imperative programming (C, C++, VB, Java, C#) Requires locks and synchronization code to handle shared memory read/write transactions Not trivial Difficult to debug Tools and frameworks may help, but will require different approach to the problem (a different paradigm) to really take advantage of the tools The Core Problem
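The locking cost described on this slide can be sketched in C#. This is only an illustration under stated assumptions: the class and method names (`SharedStateSketch`, `SumWithLock`, `SumWithLocalState`) are hypothetical and not from the talk. The first method mutates one shared variable and so needs a lock on every iteration; the second keeps a private subtotal per worker and touches shared state once per worker.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical names, used only for illustration.
public static class SharedStateSketch
{
    // Imperative style: every iteration writes to one shared variable,
    // so each write has to be guarded by a lock.
    public static long SumWithLock(int[] values)
    {
        long total = 0;
        object gate = new object();
        Parallel.For(0, values.Length, i =>
        {
            lock (gate) { total += values[i]; }
        });
        return total;
    }

    // Each worker accumulates a private subtotal; the shared total
    // is updated only once per worker, with an atomic add.
    public static long SumWithLocalState(int[] values)
    {
        long total = 0;
        Parallel.For(0, values.Length,
            () => 0L,                                          // initial per-worker subtotal
            (i, loopState, subtotal) => subtotal + values[i],  // no shared writes here
            subtotal => Interlocked.Add(ref total, subtotal)); // one shared write per worker
        return total;
    }
}
```

Both methods return the same sum; the second simply has far fewer points where threads can contend, which is the kind of change in approach the slide argues for.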
  • 19. Some frameworks are designed to be single threaded, such as ASP.NET Best practices for ASP.NET applications recommend avoiding spawning new threads ASP.NET and IIS handle the multithreading and multiprocessing to take advantage of the many processors (and now many cores) on Web Servers and Application Servers Will this best practice remain true? Even when server CPU’s have hundreds or thousands of cores? Will it affect all programmers?
  • 20. What do we do about it? (How do we prepare for Manycore)
  • 21. Identify where the dependencies are Identify where you can parallelize Understand the tools, techniques, and approaches for solving the pieces Put them together to understand overall performance POC – Proof of Concept Test, test, test Performance goals up front Understand Problem Domain
  • 22. Frameworks Task Parallel Library (TPL) Reactive Extensions for .NET 3.5 (Rx) Used to be called Parallel Extensions or PFx Baked into .NET 4 Programming paradigms, languages, and language extensions Functional programming F# LINQ and PLINQ Tools Visual Studio 2010 Tools for Concurrency Manycore, What do we do about it?
  • 23. Parallelism vs. Concurrency Task vs. Data Parallelism Parallel Programming Concepts
  • 24. Concurrency or Concurrent computing Many independent requests Web Server, works on multi-threaded single core CPU Separate processes that may be executed in parallel More general than parallelism Parallelism or Parallel computing Processes are executed in parallel simultaneously Only possible with multiple processors or multiple cores Yuan Lin: compares to black and white photography vs. color, one is not a superset of the other http://www.touchdreams.net/blog/2008/12/21/more-on-concurrency-vs-parallelism/ Parallelism vs. Concurrency
  • 25. Task Parallelism (aka function parallelism and control parallelism) Distributing execution processes (threads/functions/tasks) across different parallel computing nodes (cores) http://msdn.microsoft.com/en-us/library/dd537609(VS.100).aspx Data Parallelism (aka loop-level parallelism) Distributing data across different parallel computing nodes (cores) Executing same command over every element in a data structure http://msdn.microsoft.com/en-us/library/dd537608(VS.100).aspx Task vs. Data Parallelism See MSDN for .NET 4, Parallel Programming, Data/Task Parallelism
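A minimal C# sketch of the two kinds of parallelism, assuming .NET 4's `System.Threading.Tasks`. The class and method names are hypothetical. `MinAndMax` runs two different operations side by side (task parallelism), while `SquareAll` applies the same operation to every element (data parallelism).

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical names, used only for illustration.
public static class ParallelismKinds
{
    // Task parallelism: two *different* operations may run at the same time.
    // Each delegate writes only to its own variable, so no lock is needed.
    public static int[] MinAndMax(int[] data)
    {
        int min = 0, max = 0;
        Parallel.Invoke(
            () => { min = data[0]; foreach (int x in data) if (x < min) min = x; },
            () => { max = data[0]; foreach (int x in data) if (x > max) max = x; });
        return new[] { min, max };
    }

    // Data parallelism: the *same* operation is applied to every element.
    // Each iteration writes to a distinct slot of the result array.
    public static int[] SquareAll(int[] data)
    {
        var result = new int[data.Length];
        Parallel.For(0, data.Length, i => result[i] = data[i] * data[i]);
        return result;
    }
}
```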
  • 27. Parallel Programming in the .NET Framework 4 Beta 2 - TPL
  • 28. Reference System.Threading Use Visual Studio 2010 or .NET 4 For Visual Studio 2008 Download unsupported version for .NET 3.5 SP1 from Reactive Extensions for .NET (Rx) http://msdn.microsoft.com/en-us/devlabs/ee794896.aspx Create a “Task” How to use the TPL FileStream fs = new FileStream(fileName, FileMode.CreateNew); var task = Task.Factory.FromAsync(fs.BeginWrite, fs.EndWrite, bytes, 0, bytes.Length, null);
  • 29. Use Task class Task Parallelism with the TPL // Create a task and supply a user delegate // by using a lambda expression. var taskA = new Task(() => Console.WriteLine("Hello from taskA.")); // Start the task. taskA.Start(); // Output a message from the calling thread. Console.WriteLine("Hello from the calling thread.");
  • 30. Task<TResult> Getting return value from a Task Task<double>[] taskArray = new Task<double>[] { Task<double>.Factory.StartNew(() => DoComputation1()), // May be written more conveniently like this: Task.Factory.StartNew(() => DoComputation2()), Task.Factory.StartNew(() => DoComputation3()) }; double[] results = new double[taskArray.Length]; for (int i = 0; i < taskArray.Length; i++) results[i] = taskArray[i].Result;
  • 31. Task resembles new thread or ThreadPool work item, but higher level of abstraction Tasks provide two primary benefits over Threads: More efficient and scalable use of system resources More programmatic control than is possible with a thread or work item Tasks vs. Threads
  • 32. Behind the scenes, tasks are queued to the ThreadPool ThreadPool now enhanced with algorithms (like hill-climbing) that determine and adjust to the number of threads that maximizes throughput. Tasks are relatively lightweight You can create many of them to enable fine-grained parallelism. To complement this, widely-known work-stealing algorithms are employed to provide load-balancing. Tasks and the framework built around them provide a rich set of APIs that support waiting, cancellation, continuations, robust exception handling, detailed status, custom scheduling, and more. Tasks
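Two of the Task APIs listed on this slide, continuations and exception handling, can be sketched as follows. The class and method names are hypothetical; the calls themselves (`Task.Factory.StartNew`, `ContinueWith`, `Result`, `AggregateException`) are part of the .NET 4 TPL.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical names, used only for illustration.
public static class TaskApiSketch
{
    // Continuation: the second task starts when the first one completes.
    public static string ComputeThenFormat(int n)
    {
        Task<int> square = Task.Factory.StartNew(() => n * n);
        Task<string> formatted = square.ContinueWith(t => "result=" + t.Result);
        return formatted.Result; // blocks until both steps have finished
    }

    // Exception handling: an exception thrown inside a task is rethrown,
    // wrapped in an AggregateException, when the result is requested.
    public static string ObserveFailure()
    {
        Task<int> failing = Task.Factory.StartNew<int>(() =>
        {
            throw new InvalidOperationException("boom");
        });
        try
        {
            return failing.Result.ToString();
        }
        catch (AggregateException ex)
        {
            return ex.InnerException.GetType().Name;
        }
    }
}
```

With a raw thread, an unhandled exception would normally take down the process; here the caller decides where to observe and handle it.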
  • 33. Data Parallelism with the TPL Instead of: for (int i = 0; i < matARows; i++) { for (int j = 0; j < matBCols; j++) { ... } } Use: Parallel.For(0, matARows, i => { for (int j = 0; j < matBCols; j++) { ... } }); // Parallel.For
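The loop bodies on this slide are elided, so the following is one possible way to fill them in, assuming the loops perform matrix multiplication as the `matARows` and `matBCols` names suggest. `MatrixSketch` and `Multiply` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical names, used only for illustration.
public static class MatrixSketch
{
    public static double[,] Multiply(double[,] a, double[,] b)
    {
        int aRows = a.GetLength(0);
        int aCols = a.GetLength(1);
        int bCols = b.GetLength(1);
        var result = new double[aRows, bCols];

        // Only the outer loop is parallelized. Each iteration writes to a
        // different row of the result, so iterations do not interfere.
        Parallel.For(0, aRows, i =>
        {
            for (int j = 0; j < bCols; j++)
            {
                double sum = 0;
                for (int k = 0; k < aCols; k++)
                {
                    sum += a[i, k] * b[k, j];
                }
                result[i, j] = sum;
            }
        });
        return result;
    }
}
```

Parallelizing only the outer loop keeps the units of work coarse; parallelizing the inner loops as well would usually add scheduling overhead without adding useful parallelism.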
  • 34. Use Tasks not Threads Use Parallel.For in Data Parallelism scenarios Or… Use Async Workflows from F#, covered later Use PLINQ, covered later TPL Summary
  • 36. 1930’s: lambda calculus (roots) 1956: IPL (Information Processing Language) “the first functional language” 1958: LISP “a functional flavored language” 1962: APL (A Programming Language) 1973: ML (Meta Language) 1983: SML (Standard ML) 1987: Caml (Categorical Abstract Machine Language ) and Haskell 1996: OCaml (Objective Caml) 2005: F# introduced to public by Microsoft Research 2010: F# is “productized” in the form of Visual Studio 2010 Functional programming has been around a long time (over 50 years)
  • 37. Most functional languages encourage programmers to avoid side effects Haskell (a “pure” functional language) restricts side effects with a static type system A side effect Modifies some state Has observable interaction with calling functions Has observable interaction with the outside world Example: a function or method with no return value Functional programming is safe
  • 38. Language Evolution (Simon Peyton Jones) C#, VB, Java, C are imperative programming languages. Very useful but can change the state of the world at any time creating side effects. Nirvana! Useful and Safe F# Haskell is Very Safe, but not very useful. Used heavily in research and academia, but rarely in business. http://channel9.msdn.com/posts/Charles/Simon-Peyton-Jones-Towards-a-Programming-Language-Nirvana/
  • 39. When a function changes the state of the program Write to a file (that may be read later) Write to the screen Changing values of variables in memory (global variables or object state) Side Effect
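The difference between a function with a side effect and one without can be shown in a few lines of C#. The names below are hypothetical. `NextId` changes a variable in memory on every call, which is the kind of state change this slide describes; `Add` depends only on its arguments.

```csharp
// Hypothetical names, used only for illustration.
public static class SideEffectSketch
{
    private static int callCount = 0; // shared mutable state

    // Has a side effect: it modifies state outside the function, so the
    // result depends on call history and concurrent callers can race.
    public static int NextId()
    {
        callCount = callCount + 1;
        return callCount;
    }

    // No side effect: the result depends only on the arguments, so it can
    // be called from any number of threads without synchronization.
    public static int Add(int x, int y)
    {
        return x + y;
    }
}
```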
  • 40. Compare SQL to your favorite imperative programming language If you write a statement to store and query your data, you don’t need to specify how the system will need to store the data at a low level Example: table partitioning LINQ is an example of bringing functional programming to C# and VB through language extensions Functional Programming
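As a small illustration of the declarative style LINQ brings to C#, the sketch below computes the same result twice: once with an explicit loop and once with a query expression. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical names, used only for illustration.
public static class LinqSketch
{
    // Imperative: spells out how to iterate, test, and accumulate.
    public static List<int> EvenSquaresLoop(IEnumerable<int> source)
    {
        var result = new List<int>();
        foreach (int n in source)
        {
            if (n % 2 == 0) result.Add(n * n);
        }
        return result;
    }

    // Declarative: states what is wanted and leaves the iteration
    // strategy to the LINQ provider.
    public static List<int> EvenSquaresQuery(IEnumerable<int> source)
    {
        return (from n in source
                where n % 2 == 0
                select n * n).ToList();
    }
}
```

Because the query does not say how to iterate, the same expression can later be handed to a different execution strategy, much as a SQL statement leaves storage details to the database.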
  • 41. Use lots of processes Avoid side effects Avoid sequential bottlenecks Write “small messages, big computations” code Efficient Multicore Programming Source: Joe Armstrong’s “Programming Erlang, Software for a Concurrent World” Section 20.1 “How to Make Programs Run Efficiently on a Multicore CPU”
  • 42. F#
  • 43. Functional language developed by Microsoft Research By Don Syme and his team, who productized Generics Based on OCaml (influenced by C# and Haskell) History 2002: F# language design started 2005 January: F# 1.0.1 releases to public Not a product. Integration with VS2003 Works in .NET 1.0 through .NET 2.0 beta, Mono 2005 November: F# 1.1.5 with VS 2005 RTM support 2009 October: VS2010 Beta 2, CTP for VS2008 & Non-Windows users 2010: F# is “productized” and baked into VS 2010 What is F#
  • 44. F# is not just Functional. Multi-paradigm: functional programming, imperative programming, object oriented programming, language oriented programming.
  • 45. Parallel Computing and PDC09 (diagram labels): Tools; Managed Languages; Axum; Visual F#; Visual Studio 2010; Parallel Debugger Windows; Native Libraries; Managed Libraries; DryadLINQ; Async Agents Library; Parallel Pattern Library; Profiler Concurrency Analysis; Parallel LINQ; Rx; Task Parallel Library; Data Structures; Data Structures; Microsoft Research; Native Concurrency Runtime; Task Scheduler; Race Detection; Managed Concurrency Runtime; Resource Manager; ThreadPool; Fuzzing; Operating System; Threads; UMS Threads; HPC Server; Windows 7 / Server 2008 R2. Key: Research / Incubation; Visual Studio 2010 / .NET 4.
  • 46. Why another language? Functional programming has been around a long time (not new; long history). Functional programming is safe (a concern as we head toward manycore and cloud computing). Functional programming is on the rise.
  • 47. Parallel Programming in the .NET Framework 4 Beta 2 - PLINQ
  • 48. F# and Multi-Core Programming: “F# is, technically speaking, neutral with respect to concurrency - it allows the programmer to exploit the many different techniques for concurrency and distribution supported by the .NET platform” (F# FAQ: http://bit.ly/FSharpFAQ). Functional programming is a primary technique for minimizing/isolating mutable state. Asynchronous workflows allow parallel programs to be written in a “natural and compositional style”.
  • 49. Key Characteristics of F#: Interactive scripting (good for prototyping). Succinct = less code. Type inference: strongly typed, strict (no dynamic typing); automatic generalization (generics for free); few type annotations. First-class functions (currying, lazy evaluation). Pattern matching.
  • 51. Luke Hoban at PDC 2009, F# Program Manager http://microsoftpdc.com/Sessions/FT20
  • 52. Demo – Imperative sumOfSquares
  • 53. Two Problems Parallelizing Imperative Code: (1) It is difficult to turn existing sequential code into parallel code: you must modify large portions of code to use threads explicitly. (2) Using shared state and locks is difficult: you must be careful to avoid race conditions and deadlocks. http://www.manning.com/petricek/petricek_meapch1.pdf
  • 54. Demo – Recursive sumOfSquares
  • 55. Functional Programming: Declarative programming style: easier to introduce parallelism into existing code. Immutability by default: can’t introduce race conditions; easier to write lock-free code.
  • 56. Demo – Functional sumOfSquares
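
The demo code for the imperative, recursive, and functional `sumOfSquares` variants is not part of this transcript. The following is a minimal sketch of what three such versions might look like in F#; the function names and the use of an `int list` are assumptions made for illustration.

```fsharp
// Imperative style: a mutable accumulator updated inside a loop.
let sumOfSquaresImperative (nums: int list) : int =
    let mutable acc = 0
    for n in nums do
        acc <- acc + n * n
    acc

// Recursive style: no mutation; pattern matching splits the list into head and tail.
let rec sumOfSquaresRecursive (nums: int list) : int =
    match nums with
    | [] -> 0
    | head :: tail -> head * head + sumOfSquaresRecursive tail

// Functional style: a pipeline of library functions says what to compute, not how.
let sumOfSquaresFunctional (nums: int list) : int =
    nums |> List.map (fun n -> n * n) |> List.sum
```

All three return the same value for the same input. Only the last one leaves the iteration strategy entirely to the library, which is the property that makes swapping in a parallel implementation straightforward.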
  • 57. Parallel Extensions to F#: From Seq to PSeq, Matthew Podwysocki’s Blog http://weblogs.asp.net/podwysocki/archive/2009/02/23/adding-parallel-extensions-to-f.aspx ; Adding Parallel Extensions to F# for VS2010 Beta 2, Talbott Crowell’s Developer Blog http://talbottc.spaces.live.com/blog/cns!A6E0DA836D488CA6!396.entry
  • 58. Demo – Parallel sumOfSquares
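
The parallel demo in the talk moves from `Seq` to `PSeq`, which lives in a separate library and is not shown in this transcript. As a stand-in, the sketch below uses `Array.Parallel.map` from the F# core library to illustrate the same idea: the pipeline keeps its shape and only the mapping function is swapped for a parallel one. The function names are hypothetical.

```fsharp
// Sequential version: map then sum over an array.
let sumOfSquaresSequential (nums: int[]) : int64 =
    nums |> Array.map (fun n -> int64 n * int64 n) |> Array.sum

// Parallel version: Array.Parallel.map (a substitute for PSeq here)
// may spread the squaring work across cores; the sum stays sequential.
let sumOfSquaresParallel (nums: int[]) : int64 =
    nums |> Array.Parallel.map (fun n -> int64 n * int64 n) |> Array.sum
```

Because the squaring function has no side effects, both versions should return the same result regardless of how the work is scheduled. Whether the parallel version is actually faster depends on the input size and the cost of the per-element work.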
  • 59. F# Parallel Programming Options: Asynchronous Workflows; Control.MailboxProcessor; Task Based Programming using TPL; Reactive Extensions. “The Reactive Extensions can be used from any .NET language. In F#, .NET events are first-class values that implement the IObservable<out T> interface. In addition, F# provides a basic set of functions for composing observable collections and F# developers can leverage Rx to get a richer set of operators for composing events and other observable collections.” S. Somasegar, Senior Vice President, Developer Division http://blogs.msdn.com/somasegar/archive/2009/11/18/reactive-extensions-for-net-rx.aspx
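
Of the options listed on this slide, `MailboxProcessor` is the one least covered by the rest of the deck. The sketch below is an assumed example, not taken from the talk: an agent owns a running total and processes one message at a time, so callers on many threads can update it without locks. The `CounterMessage` type and `counterAgent` name are hypothetical.

```fsharp
// Messages the agent understands (hypothetical protocol for this sketch).
type CounterMessage =
    | Add of int
    | Get of AsyncReplyChannel<int>

// The agent keeps its state in the argument of a recursive async loop.
let counterAgent =
    MailboxProcessor<CounterMessage>.Start(fun inbox ->
        let rec loop total =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Add n -> return! loop (total + n)
                | Get reply ->
                    reply.Reply total
                    return! loop total
            }
        loop 0)

// Post from several threads; the agent serializes the updates.
[| 1 .. 100 |] |> Array.Parallel.iter (fun n -> counterAgent.Post(Add n))

// Ask for the total after all posts have been queued.
let total = counterAgent.PostAndReply(fun channel -> Get channel)
```

The design choice here is that no state is shared: the total exists only inside the agent's loop, and other code interacts with it only through messages.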
  • 60. Problem: resize a ton of images. Demo of Image Processor:
      let files = Directory.GetFiles(@"C:magesriginal")
      for file in files do
          use image = Image.FromFile(file)
          use smallImage = ResizeImage(image)
          let destFileName = DestFileName("s1", file)
          smallImage.Save(destFileName)
  • 61. Asynchronous Workflows
      let FetchAsync(file:string) =
          async {
              use stream = File.OpenRead(file)
              let! bytes = stream.AsyncRead(int stream.Length)
              use memstream = new MemoryStream(bytes.Length)
              memstream.Write(bytes, 0, bytes.Length)
              use image = Image.FromStream(memstream)
              use smallImage = ResizeImage(image)
              let destFileName = DestFileName("s2", file)
              smallImage.Save(destFileName)
          }
      let tasks = [for file in files -> FetchAsync(file)]
      let parallelTasks = Async.Parallel tasks
      Async.RunSynchronously parallelTasks
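
The slide's code depends on helpers (`ResizeImage`, `DestFileName`) that are not shown in the transcript. The following self-contained sketch illustrates the same pattern: build a list of `async` computations, combine them with `Async.Parallel`, and run them with `Async.RunSynchronously`. Here `squareAsync` is a hypothetical stand-in for the image work, and `Async.Sleep` stands in for non-blocking I/O.

```fsharp
// Hypothetical unit of work: waits without blocking a thread, then returns a value.
let squareAsync (n: int) : Async<int> =
    async {
        do! Async.Sleep 10      // placeholder for an asynchronous read
        return n * n
    }

// Describe all the work first, then run it in parallel and wait for every result.
let results : int[] =
    [ 1 .. 5 ]
    |> List.map squareAsync
    |> Async.Parallel
    |> Async.RunSynchronously
```

`Async.Parallel` returns the results as an array in the same order as the input computations, so the caller does not need to track which task finished first.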
  • 62. Tomas Petricek, Using Asynchronous Workflows http://tomasp.net/blog/fsharp-webcast-async.aspx
  • 64. LINQ: declaratively specify what you want done, not how you want it done
      var source = Enumerable.Range(1, 10000);
      var evenNums = from num in source where Compute(num) > 0 select num;
    Versus:
      var source = Enumerable.Range(1, 10000);
      var evenNums = new List<int>();
      foreach (var num in source)
          if (Compute(num) > 0)
              evenNums.Add(num);
  • 65. If I put a counter in Compute(num), what will happen?
      var source = Enumerable.Range(1, 10000);
      var evenNums = from num in source where Compute(num) > 0 select num;
      private static int Compute(int num)
      {
          counter++;
          if (num % 2 == 0) return 1;
          return 0;
      }
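
The question on this slide becomes more pressing once the query runs in parallel. The sketch below is an F# translation written for illustration, not the talk's code: `compute` increments one counter without synchronization and one under a lock, then the query is run through `AsParallel()`. The names `unsafeCounter`, `safeCounter`, and `gate` are hypothetical.

```fsharp
open System.Linq

let unsafeCounter = ref 0   // updated with no synchronization
let safeCounter = ref 0     // updated only while holding `gate`
let gate = obj ()

let compute (num: int) : int =
    // Racy read-modify-write: concurrent calls can lose increments.
    unsafeCounter.Value <- unsafeCounter.Value + 1
    // Serialized update: one thread at a time.
    lock gate (fun () -> safeCounter.Value <- safeCounter.Value + 1)
    if num % 2 = 0 then 1 else 0

let evenNums =
    Enumerable.Range(1, 10000).AsParallel().Where(fun num -> compute num > 0).ToArray()

printfn "evens=%d safe=%d unsafe=%d" evenNums.Length safeCounter.Value unsafeCounter.Value
```

The locked counter should end at 10000, one increment per element. The unsynchronized counter may end lower on a multicore machine, and the value can differ between runs. The query result itself is unaffected, which is why a side effect like this can go unnoticed.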
  • 67. Parallel Programming in the .NET Framework 4 Beta 2 - PLINQ
  • 68. PLINQ = Parallel LINQ
    LINQ: declaratively specify what you want done, not how you want it done
      var source = Enumerable.Range(1, 10000);
      var evenNums = from num in source where Compute(num) > 0 select num;
    PLINQ: declaratively specify “As Parallel”. Under the hood, the framework will implement “the how” using TPL and threads.
      var source = Enumerable.Range(1, 10000);
      var evenNums = from num in source.AsParallel() where Compute(num) > 0 select num;
  • 69. System.Linq.ParallelEnumerable.AsParallel(): the entry point for PLINQ. Specifies that the rest of the query should be parallelized, if it is possible.
  • 70. Visual Studio 2010 Tools for Concurrency
  • 71. Stephen Toub at PDC 2009, Senior Program Manager on the Parallel Computing Platform http://microsoftpdc.com/Sessions/P09-09
  • 72. Concurrency Visualizer in Visual Studio 2010: Views enable you to see how your multi-threaded application interacts with itself, the hardware, the operating system, and other processes on the host computer. Provides graphical, tabular and textual data. Shows the temporal relationships between the threads in your program and the system as a whole.
  • 73. Use Concurrency Visualizer to Locate: performance bottlenecks; CPU underutilization; thread contention; thread migration; synchronization delays; areas of overlapped I/O; and other info…
  • 74. Concurrency Visualizer High level of Contentions during Async
  • 75. Views: CPU View; Threads View (Parallel Performance); Cores View
  • 78. Core View: Full test; Close up of Sync; Close up of Async
  • 79. More info on Asynchronous Workflows: Tomas Petricek - F# Webcast (III.) - Using Asynchronous Workflows http://tomasp.net/blog/fsharp-webcast-async.aspx ; Luke Hoban - F# for Parallel and Asynchronous Programming http://microsoftpdc.com/Sessions/FT20
  • 80. More Research: The Landscape of Parallel Computing Research: A View from Berkeley 2.0, by David Patterson http://science.officeisp.net/ManycoreComputingWorkshop07/Presentations/David%20Patterson.pdf ; Parallel Dwarfs http://paralleldwarfs.codeplex.com/
  • 81. The Architect: “The architect as we know him today is a product of the Renaissance.” (1) “But the medieval architect was a master craftsman (usually a mason or a carpenter by trade), one who could build as well as design, or at least ‘one trained in that craft even if he had ceased to ply his axe and chisel’ (2).” (1) “Not only is he hands on, like the agile architect, but we also learn from Arnold that the great Gothic cathedrals of Europe were built, not with BDUF, but with ENUF.” References: (1) Dana Arnold, Reading Architectural History, 2002. (2) D. Knoop & G. P. Jones, The Medieval Mason, 1933. (3) Ian Cooper, Architects: Back to the future?, 2008 http://codebetter.com/blogs/ian_cooper/archive/2008/01/02/architects-back-to-the-future.aspx
  • 82. Visit us at http://fsug.org Thank you. Questions? Architecting Solutions for the Manycore Future, Talbott Crowell, ThirdM.com http://talbottc.spaces.live.com Twitter: @Talbott and @fsug

Editor's Notes

  1. ENUF = Elements Needed Up Front