2. History of parallelization
Definition: a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can be divided into smaller ones, which are then solved concurrently ("in parallel").
Developers have tried to improve performance by parallelizing problems, even before true multicore systems existed.
How is this different from multithreading? Multithreading is a type of parallelism.
3. Real-life parallelization
Consider that we have some eggs to boil (data to process).
Before the early 2000s we only had 1 pot (core) and more eggs than we could boil at once. We could still boil more than 1 egg at a time in parallel, since more than 1 egg fits in a pot.
After the early 2000s we had 2 pots, and thus could boil twice as many eggs at once.
4. Pots vs Boil time
Given: We have 10 eggs to boil and each egg requires 8 minutes in order to be ready to eat. Each pot holds up to 5 eggs.

Number of pots    Boil time
1                 16 min
2                 8 min
3                 8 min
4                 8 min

Something interesting occurs: adding more than 2 pots does nothing to decrease the overall time.
5. Time efficiency Time to boil 100 eggs
6. What does this mean?
Once every batch of eggs has its own pot, it really doesn't matter how many more cores we use. Past that point, this problem simply will not speed up by adding cores.
Our equations are pretty simple:
Pots Needed = Number of Eggs / Eggs Per Pot
20 = 100 / 5
Time = Ceiling(Pots Needed / Pots) x Boiling Time
160 = Ceiling(20 / 1) x 8
In computer terms:
ExecutionTime = Ceiling(Number of Threads / Cores) x ThreadExecutionTime
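The equations above can be sketched in a few lines of C#. This is only an illustration: the class and method names (EggMath, BoilTime) are made up for this example, and it additionally rounds Pots Needed up so that a partly filled pot still counts as a pot, which the slide's formula does not state.

```csharp
using System;

public static class EggMath
{
    // Hypothetical helper, not from the original deck.
    // eggs        = amount of data to process
    // eggsPerPot  = how much work fits in one thread
    // pots        = number of cores
    // boilMinutes = execution time of one thread (assumed constant)
    public static int BoilTime(int eggs, int eggsPerPot, int pots, int boilMinutes)
    {
        // Pots Needed = Number of Eggs / Eggs Per Pot (rounded up: assumption)
        int potsNeeded = (eggs + eggsPerPot - 1) / eggsPerPot;

        // Time = Ceiling(Pots Needed / Pots) x Boiling Time
        int rounds = (potsNeeded + pots - 1) / pots;
        return rounds * boilMinutes;
    }

    public static void Main()
    {
        // Same numbers as the "Pots vs Boil time" slide: 10 eggs, 5 per pot, 8 min.
        foreach (int pots in new[] { 1, 2, 3, 4 })
        {
            Console.WriteLine("{0} pot(s): {1} min", pots, BoilTime(10, 5, pots, 8));
        }
    }
}
```

With 10 eggs the time stops improving at 2 pots, because only 2 pot-loads exist; the extra pots have nothing to boil.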
7. Caveat
In the egg example we assume:
Thread execution time is constant (this never happens in practice).
Each core executes one thread at a time and does not continue with the next thread in the queue until the current one is finished. That is, a quad-core processor can execute 4 threads at once and gives us the same result as boiling the eggs with 4 pots.
8. Short attention span LINQ
Warning: I only really know basic LINQ (I'm slowly integrating it into the Real Feeds Project where I can use it).
LINQ = Language Integrated Query. Something something query, gotcha.
It looks like SQL in reverse (we know SQL, right?).
In layman's terms: LINQ works against collections of data (really any data that has an enumerator) to get all or a subset of the data.
9. Simple LINQ
    var ages = new List<int>() { 25, 21, 18, 65 };
    var agesInOrder = from age in ages
                      orderby age ascending
                      select age;
11. Ex 2
Take a collection of egg boil times.
Iterate over the collection and look at 5 items at a time.
Find the longest cooking time among the eggs in the current batch.
Simulate the boiling time with Thread.Sleep.
Stop looking for eggs when there are fewer than 5 eggs in the current batch.
12. Ex 2 (cont)
First run: ~1600ms. 2nd run: ~1200ms. Why?
Put 5 eggs in the pot.
After 4 min, remove the 2nd from last egg.
After 8 min, remove the remaining eggs.
Add the next batch, which contains 1 egg.
After 4 min, remove the egg from the pot.
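The steps of Ex 2 could look roughly like the sketch below. It is not the deck's Boil() code, which is not in this transcript. The names (EggBatches, BoilInBatches), the six boil times, and the scale of 100 ms per simulated minute are all assumptions, chosen so the walk-through above applies: the first batch of 5 is held up by its 8-minute eggs, and the second batch holds a single 4-minute egg.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class EggBatches
{
    // Hypothetical helper: boils eggs one pot-load at a time and returns
    // the total simulated minutes. Each batch takes as long as its slowest egg.
    public static int BoilInBatches(int[] boilMinutes, int potSize, int msPerMinute)
    {
        int totalMinutes = 0;
        for (int start = 0; start < boilMinutes.Length; start += potSize)
        {
            // Look at up to potSize eggs at a time.
            int[] batch = boilMinutes.Skip(start).Take(potSize).ToArray();

            // The batch is done when its longest-cooking egg is done.
            int longest = batch.Max();
            Thread.Sleep(longest * msPerMinute); // simulate the boiling
            totalMinutes += longest;

            // A batch smaller than the pot means there are no eggs left.
            if (batch.Length < potSize) break;
        }
        return totalMinutes;
    }

    public static void Main()
    {
        // Assumed data: 5 eggs in the first batch, 1 egg in the second.
        int[] eggs = { 8, 8, 8, 4, 8, 4 };

        var watch = Stopwatch.StartNew();
        int minutes = BoilInBatches(eggs, 5, 100);
        watch.Stop();

        Console.WriteLine("Simulated minutes: {0}, elapsed: {1} ms",
                          minutes, watch.ElapsedMilliseconds);
    }
}
```

With these assumed times the batches take 8 + 4 = 12 simulated minutes. A version that charged a flat 8 minutes per batch would take 16, which is one possible reading of the 1600ms vs 1200ms difference.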
13. LINQ to the rescue
Any for / foreach loop can potentially be converted into LINQ.
Compare Boil() code v1 & v2.
Note: optimized (1600ms vs 1200ms) and nicer to read.
14. Parallelize It!
We have 2 options:
Parallel extensions
Parallel LINQ
15. Parallel Extensions
Introduced in .NET 4.0.
Has 2 important methods that we'll focus on:
Parallel.For, e.g. Parallel.For(0, eggs.Length, i => {});
Parallel.ForEach
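A minimal sketch of both methods, continuing the egg analogy. The class and method names (ParallelEggs, BoilAll, CountBoiled) and the sample boil times are invented for illustration; only Parallel.For, Parallel.ForEach, Thread.Sleep and Interlocked.Increment are real framework calls.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelEggs
{
    // Parallel.For: index-based. Each iteration writes only to its own
    // array slot, so no locking is needed.
    public static int[] BoilAll(int[] boilMinutes, int msPerMinute)
    {
        var done = new int[boilMinutes.Length];
        Parallel.For(0, boilMinutes.Length, i =>
        {
            Thread.Sleep(boilMinutes[i] * msPerMinute); // simulate boiling
            done[i] = boilMinutes[i];
        });
        return done;
    }

    // Parallel.ForEach: item-based. The counter is shared between threads,
    // so it is updated with Interlocked to avoid a race.
    public static int CountBoiled(int[] boilMinutes, int msPerMinute)
    {
        int count = 0;
        Parallel.ForEach(boilMinutes, minutes =>
        {
            Thread.Sleep(minutes * msPerMinute); // simulate boiling
            Interlocked.Increment(ref count);
        });
        return count;
    }

    public static void Main()
    {
        int[] eggs = { 8, 4, 8, 4 }; // assumed sample data
        Console.WriteLine("Boiled {0} eggs", BoilAll(eggs, 10).Length);
        Console.WriteLine("Counted {0} eggs", CountBoiled(eggs, 10));
    }
}
```

Iterations may run in any order and on any thread, so the loop body should avoid shared state or guard it, as the counter does here.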
16. Parallel LINQ
Say we have a list of web requests that we need to make.
Each call takes a certain amount of time and we want to parallelize them.
In previous examples we've relied on an index, but what if we can't?
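One way this might look with Parallel LINQ is sketched below. No real network calls are made: FakeRequest is a made-up stand-in that sleeps and returns the URL length, and the URLs and the names ParallelRequests and FetchAll are placeholders. AsParallel, AsOrdered, Select and ToArray are the actual PLINQ operators.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class ParallelRequests
{
    // Hypothetical stand-in for a web call: waits, then returns a value.
    public static int FakeRequest(string url, int delayMs)
    {
        Thread.Sleep(delayMs); // simulate network latency
        return url.Length;
    }

    // No index needed: PLINQ partitions the collection itself.
    // AsOrdered keeps the results in the same order as the input.
    public static int[] FetchAll(string[] urls, int delayMs)
    {
        return urls.AsParallel()
                   .AsOrdered()
                   .Select(url => FakeRequest(url, delayMs))
                   .ToArray();
    }

    public static void Main()
    {
        // Placeholder URLs, for illustration only.
        string[] urls =
        {
            "http://example.com/a",
            "http://example.com/bb",
            "http://example.com/ccc"
        };

        foreach (int result in FetchAll(urls, 100))
        {
            Console.WriteLine(result);
        }
    }
}
```

Dropping AsOrdered lets PLINQ return results in whatever order they complete, which can be cheaper when the order does not matter.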