Parallel programming patterns - Олександр Павлишак

Parallelism and concurrency are the direction in which programming technology is heading now and will, without doubt, keep heading in the future. Even entry-level computers ship with multicore processors, which opens the door to fast, efficient applications with responsive, "live" interfaces. So even developers who have never dealt with concurrent code will find themselves programming with parallelism in mind more and more often. The talk covers parallel programming patterns, approaches to asynchronous programming, and the direction individual modern technologies are taking in the area of parallelism and asynchrony. Attendees will learn the main ways of organizing parallel computation in desktop, web, and server applications, means of achieving a responsive GUI, and techniques for solving the problems that arise in concurrent programming.

Published in: Education, Technology


  1. Parallel Programming Patterns. Audience: developers. Олександр Павлишак, 2011. pavlyshak@gmail.com
  2. Contents - The trend - Basic terms - Managing state - Parallelism - Tools
  3. Yesterday
  4. Today
  5. Tomorrow
  6. What is happening? - CPU clock-speed growth has slowed - Due to physical limits - The free lunch is over - Software no longer gets faster all by itself
  7. Current trends - Manycore, multicore - GPGPU, GPU acceleration, heterogeneous computing - Distributed computing, HPC
  8. Key concepts - Concurrency: many interleaved threads of control - Parallelism: same result, but faster - Concurrency != Parallelism: it is not always necessary to care about concurrency while implementing parallelism - Multithreading - Asynchrony
  9. Workloads - CPU-bound: number crunching - I/O-bound: network, disk
  10. State - Shared: accessible by more than one thread; sharing is transitive - Private: used by a single thread only
  11. Task-based program: Application → Tasks (CPU, I/O) → Runtime (queuing, scheduling) → Processors (threads, processes)
  12. Managing state
  13. Isolation - Avoiding shared state - Each thread works on its own copy of the state - Examples: process isolation, intraprocess isolation, isolation by convention
  14. Immutability - Multiple readers are not a problem! - All functions are pure - Requires immutable collections - The functional way: Haskell, F#, Lisp
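The immutability idea can be sketched in C# (the Point type and its members are illustrative, not from the slides): because no thread can ever mutate an instance, any number of threads may read it in parallel with no synchronization at all.

```csharp
using System;
using System.Linq;

// An immutable point: all properties are get-only, and "changes"
// produce new instances instead of mutating the old one.
public sealed class Point
{
    public double X { get; }
    public double Y { get; }

    public Point(double x, double y) { X = x; Y = y; }

    // A pure function: it touches no shared mutable state,
    // so it is safe to call concurrently from any thread.
    public Point Translate(double dx, double dy) => new Point(X + dx, Y + dy);
}

public static class Program
{
    public static void Main()
    {
        var origin = new Point(1, 2);

        // Many threads read 'origin' in parallel; none can change it.
        var shifted = Enumerable.Range(0, 1000)
            .AsParallel()
            .Select(i => origin.Translate(i, i))
            .ToArray();

        Console.WriteLine(shifted.Length); // 1000
        Console.WriteLine(origin.X);       // still 1
    }
}
```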
  15. Synchronization - The only thing that remains for dealing with shared mutable state - Kinds: data synchronization, control synchronization
  16. Data synchronization - Why? To avoid race conditions and data corruption - How? Mutual exclusion - Data remains consistent - Critical regions: locks, monitors, critical sections, spin locks - Code-centered, rather than associated with the data
  17. Critical region: both threads wrap access to the shared data in lock (locker) { data.Operation(); }. While Thread 1 is inside the region, Thread 2 blocks at lock (locker) and enters only after Thread 1 releases the lock.
  18. Control synchronization - To coordinate control flow: exchange data, orchestrate threads - Waiting, notifications: spin waiting, events - Alternative: continuations
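A minimal sketch of control synchronization in .NET (variable names are illustrative): one thread blocks on an event until another thread publishes a result and signals that it is ready.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    public static void Main()
    {
        var dataReady = new ManualResetEventSlim(false);
        int result = 0;

        // Producer: computes a value, then notifies the waiter.
        var producer = Task.Run(() =>
        {
            result = 42;     // publish the data...
            dataReady.Set(); // ...then signal "ready"
        });

        // Consumer: blocks here until the event is signaled.
        // Set/Wait also order the memory accesses, so reading
        // 'result' after Wait() is safe.
        dataReady.Wait();
        Console.WriteLine(result); // 42

        producer.Wait();
    }
}
```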
  19. Three ways to manage state - Isolation: simple, loosely coupled, highly scalable, right data structures, locality - Immutability: avoids synchronization - Synchronization: complex, runtime overheads, contention - Prefer them in that order
  20. Parallelism
  21. Approaches to decomposing work - Data parallelism - Task parallelism - Message-based parallelism
  22. Data parallelism. How? - Data is divided up among hardware processors - The same operation is performed on the elements - Optionally, a final aggregation
  23. Data parallelism. When? - Large amounts of data - The processing operation is costly - Or both
  24. Data parallelism. Why? - To achieve speedup - For example, with GPU acceleration: hours instead of days!
  25. Data parallelism. Embarrassingly parallel problems: parallelizable loops, image processing. Non-embarrassingly parallel problems: parallel QuickSort
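The parallel QuickSort mentioned above is non-embarrassingly parallel: the subproblems only appear after each partition step. A simplified fork/join sketch (the cutoff value and helper names are illustrative; a production version would also limit task depth):

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelSort
{
    // Below this size, plain recursion is cheaper than forking tasks.
    const int Cutoff = 1024;

    public static void QuickSort(int[] a, int lo, int hi)
    {
        if (lo >= hi) return;
        int p = Partition(a, lo, hi);
        if (hi - lo < Cutoff)
        {
            QuickSort(a, lo, p - 1);
            QuickSort(a, p + 1, hi);
        }
        else
        {
            // Fork: sort both halves in parallel; join: wait for both.
            Parallel.Invoke(
                () => QuickSort(a, lo, p - 1),
                () => QuickSort(a, p + 1, hi));
        }
    }

    // Lomuto partition: returns the final index of the pivot.
    static int Partition(int[] a, int lo, int hi)
    {
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) { (a[i], a[j]) = (a[j], a[i]); i++; }
        (a[i], a[hi]) = (a[hi], a[i]);
        return i;
    }

    public static void Main()
    {
        var rnd = new Random(1);
        var data = new int[100_000];
        for (int i = 0; i < data.Length; i++) data[i] = rnd.Next();

        QuickSort(data, 0, data.Length - 1);

        for (int i = 1; i < data.Length; i++)
            if (data[i - 1] > data[i]) { Console.WriteLine("not sorted"); return; }
        Console.WriteLine("sorted");
    }
}
```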
  26. Data parallelism [diagram: the data set is split into chunks, one processed by Thread 1, the other by Thread 2]
  27. Data parallelism. Structured parallelism - Well-defined begin and end points - Examples: CoBegin, ForAll
  28. CoBegin:
      var firstDataset = new DataItem[1000];
      var secondDataset = new DataItem[1000];
      var thirdDataset = new DataItem[1000];
      Parallel.Invoke(
          () => Process(firstDataset),
          () => Process(secondDataset),
          () => Process(thirdDataset));
  29. Parallel For:
      var items = new DataItem[1000 * 1000];
      // ...
      Parallel.For(0, items.Length, i => { Process(items[i]); });
  30. Parallel ForEach:
      var tickers = GetNasdaqTickersStream();
      Parallel.ForEach(tickers, ticker => { Process(ticker); });
  31. Striped partitioning [diagram: the array is divided into alternating stripes assigned to Thread 1 and Thread 2]
  32. Iterating complex data structures:
      var tree = new TreeNode();
      // ...
      Parallel.ForEach(TraversePreOrder(tree), node => { Process(node); });
  33. Iterating complex data [diagram: tree nodes distributed between Thread 1 and Thread 2]
  34. Declarative parallelism:
      var items = new DataItem[1000 * 1000];
      // ...
      var validItems = from item in items.AsParallel()
                       let processedItem = Process(item)
                       where processedItem.Property > 42
                       select Convert(processedItem);
      foreach (var item in validItems)
      {
          // ...
      }
  35. Data parallelism. Challenges - Partitioning - Scheduling - Ordering - Merging - Aggregation - Concurrency hazards: data races, contention
  36. Task parallelism. How? - Programs are already functionally partitioned: statements, methods, etc. - Run independent pieces in parallel - Control synchronization - State isolation
  37. Task parallelism. Why? - To achieve speedup
  38. Task parallelism. Kinds - Structured: clear begin and end points - Unstructured: often demands explicit synchronization
  39. Fork/join - Fork: launch tasks asynchronously - Join: wait until they complete - CoBegin, ForAll - Recursive decomposition
  40. Fork/join [diagram: sequential code forks into Task 1, Task 2, Task 3, then joins back into sequential code]
  41. Fork/join:
      Parallel.Invoke(
          () => LoadDataFromFile(),
          () => SavePreviousDataToDB(),
          () => RenewOtherDataFromWebService());
  42. Fork/join:
      Task loadData = Task.Factory.StartNew(() => { /* ... */ });
      Task saveAnotherDataToDB = Task.Factory.StartNew(() => { /* ... */ });
      // ...
      Task.WaitAll(loadData, saveAnotherDataToDB);
      // ...
  43. Fork/join:
      void Walk(TreeNode node)
      {
          if (node == null) return; // guard: Left/Right may be null
          var tasks = new[]
          {
              Task.Factory.StartNew(() => Process(node.Value)),
              Task.Factory.StartNew(() => Walk(node.Left)),
              Task.Factory.StartNew(() => Walk(node.Right))
          };
          Task.WaitAll(tasks);
      }
  44. Fork/join, recursive [diagram: Walk forks at the root into left and right subtrees, each of which forks again for its children]
  45. Dataflow parallelism: Futures
      Task<DataItem[]> loadDataFuture = Task.Factory.StartNew(() =>
      {
          // ...
          return LoadDataFromFile();
      });
      var dataIdentifier = SavePreviousDataToDB();
      RenewOtherDataFromWebService(dataIdentifier);
      // ...
      DisplayDataToUser(loadDataFuture.Result);
  46. Dataflow parallelism: Futures [diagram: a future runs alongside sequential code, which later consumes its result]
  47. Dataflow parallelism: Futures [diagram: several futures overlap with sequential stages, each joined where its result is needed]
  48. Continuations [diagram: each task starts when its predecessor completes, interleaved with sequential code]
  49. Continuations:
      var loadData = Task.Factory.StartNew(() => { return LoadDataFromFile(); });
      var writeToDB = loadData.ContinueWith(dataItems => { WriteToDatabase(dataItems.Result); });
      var reportToUser = writeToDB.ContinueWith(t => { /* ... */ });
      reportToUser.Wait();
  50. Producer/consumer pipeline [diagram: reading → parsing → storing; a queue of lines feeds the parser, a queue of parsed lines feeds the DB]
  51. Producer/consumer pipeline [diagram: lines → parsed lines → DB]
  52. Producer/consumer:
      var lines = new BlockingCollection<string>();
      Task.Factory.StartNew(() =>
      {
          foreach (var line in File.ReadLines(...))
              lines.Add(line);
          lines.CompleteAdding();
      });
  53. Producer/consumer:
      var dataItems = new BlockingCollection<DataItem>();
      Task.Factory.StartNew(() =>
      {
          foreach (var line in lines.GetConsumingEnumerable())
              dataItems.Add(Parse(line));
          dataItems.CompleteAdding();
      });
  54. Producer/consumer:
      var dbTask = Task.Factory.StartNew(() =>
      {
          foreach (var item in dataItems.GetConsumingEnumerable())
              WriteToDatabase(item);
      });
      dbTask.Wait();
  55. Task parallelism. Challenges - Scheduling - Cancellation - Exception handling - Concurrency hazards: deadlocks, livelocks, priority inversions, etc.
  56. Message-based parallelism - Code that accesses shared state looks exactly like code that accesses local state; the language makes no distinction, unfortunately - Idea: encapsulate shared-state changes into messages - Async events - Actors, agents
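A minimal agent-style sketch (illustrative; the slides name actors and agents but show no code): all changes to the agent's state travel through its mailbox as messages, so the state itself needs no locking.

```csharp
using System;
using System.Threading.Tasks;
using System.Collections.Concurrent;

// An "agent" that owns a counter. Only the agent's own loop
// touches the counter; callers just post messages to the mailbox.
public sealed class CounterAgent
{
    private readonly BlockingCollection<int> mailbox = new BlockingCollection<int>();
    private readonly Task loop;
    private long total; // private state: single consumer, no lock needed

    public CounterAgent()
    {
        loop = Task.Run(() =>
        {
            foreach (var amount in mailbox.GetConsumingEnumerable())
                total += amount;
        });
    }

    public void Post(int amount) => mailbox.Add(amount);

    public long Complete()
    {
        mailbox.CompleteAdding(); // no more messages
        loop.Wait();              // drain the mailbox
        return total;
    }
}

public static class Program
{
    public static void Main()
    {
        var agent = new CounterAgent();

        // Many producers, one consumer: shared-state changes are
        // encapsulated in messages, as the slide suggests.
        Parallel.For(0, 1000, i => agent.Post(1));

        Console.WriteLine(agent.Complete()); // 1000
    }
}
```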
  57. Tools
  58. Concurrent data structures - Concurrent queues, stacks, sets, lists - Blocking collections - Work-stealing queues - Lock-free data structures - Immutable data structures
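One of these structures in action (a small illustrative sketch): ConcurrentDictionary lets many threads update shared counts without any explicit locks.

```csharp
using System;
using System.Threading.Tasks;
using System.Collections.Concurrent;

public static class Program
{
    public static void Main()
    {
        var counts = new ConcurrentDictionary<string, int>();
        var words = new[] { "fork", "join", "fork", "task", "fork" };

        // Many threads update the dictionary concurrently;
        // AddOrUpdate performs an atomic read-modify-write per key.
        Parallel.ForEach(words, word =>
            counts.AddOrUpdate(word, 1, (_, old) => old + 1));

        Console.WriteLine(counts["fork"]); // 3
        Console.WriteLine(counts["task"]); // 1
    }
}
```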
  59. Synchronization primitives - Critical sections - Monitors - Auto- and manual-reset events - Countdown events - Mutexes - Semaphores - Timers - Reader-writer locks - Barriers
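One of the listed primitives in action (an illustrative sketch): a Barrier makes a group of participants wait for each other at the end of every phase.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    public static void Main()
    {
        const int workers = 4;
        int phase = 0;

        // The post-phase action runs once, after all participants arrive.
        var barrier = new Barrier(workers, b => phase++);

        // LongRunning gives each worker its own thread, so all four
        // really can reach the barrier (a thread-pool Parallel.For could
        // deadlock here if it ran the iterations on fewer threads).
        var tasks = Enumerable.Range(0, workers)
            .Select(i => Task.Factory.StartNew(() =>
            {
                // phase 1 work ...
                barrier.SignalAndWait(); // nobody proceeds until all 4 arrive
                // phase 2 work ...
                barrier.SignalAndWait();
            }, TaskCreationOptions.LongRunning))
            .ToArray();

        Task.WaitAll(tasks);
        Console.WriteLine(phase); // 2
    }
}
```

The deadlock note in the comment is the design point: barriers assume every participant is actually running, which is why dedicated threads are used instead of pool iterations.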
  60. Thread-local state: a way to achieve isolation
      var parser = new ThreadLocal<Parser>(() => CreateParser());
      Parallel.ForEach(items, item => parser.Value.Parse(item));
  61. Thread pools:
      ThreadPool.QueueUserWorkItem(_ =>
      {
          // do some work
      });
  62. Async:
      Task.Factory.StartNew(() =>
      {
          // ...
          return LoadDataFromFile();
      })
      .ContinueWith(dataItems => { WriteToDatabase(dataItems.Result); })
      .ContinueWith(t => { /* ... */ });
  63. Async:
      var dataItems = await LoadDataFromFileAsync();
      textBox.Text = dataItems.Count.ToString();
      await WriteToDatabaseAsync(dataItems);
      // continue work
  64. Technologies - TPL, PLINQ, C# async, TPL Dataflow - PPL, Intel TBB, OpenMP - CUDA, OpenCL, C++ AMP - Actors, STM - Many others
  65. Summary - Programming for many CPUs - Concurrency != parallelism - CPU-bound vs. I/O-bound tasks - Private vs. shared state
  66. Summary - Managing state: - Isolation - Immutability - Synchronization - Data: mutual exclusion - Control: notifications
  67. Summary - Parallelism: - Data parallelism: scalable - Task parallelism: less scalable - Message-based parallelism
  68. Summary - Data parallelism: - CoBegin - Parallel ForAll - Parallel ForEach - Parallel ForEach over complex data structures - Declarative data parallelism - Challenges: partitioning, scheduling, ordering, merging, aggregation, concurrency hazards
  69. Summary - Task parallelism: structured, unstructured - Fork/Join - CoBegin - Recursive decomposition - Futures - Continuations - Producer/consumer (pipelines) - Challenges: scheduling, cancellation, exceptions, concurrency hazards
  70. Summary - Tools - Compilers, libraries - Concurrent data structures - Synchronization primitives - Thread-local state - Thread pools - Async invocations - ...
  71. Q/A
