
Parallel programming 2

Published in: Technology

  1. Types of Parallel Computers
     Two principal approaches:
     • Shared memory multiprocessor
     • Distributed memory multicomputer
     ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson, 2010. Aug 26, 2010
  2. Shared Memory Multiprocessor
  3. Conventional Computer
     Consists of a processor executing a program stored in a (main) memory. Each main memory location is located by its address. Addresses start at 0 and extend to 2^b - 1 when there are b bits (binary digits) in the address.
     [Figure: a processor connected to main memory; instructions go to the processor, data goes to or from the processor.]
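The address range on the slide above can be checked with a few lines of C. This is only a sketch: `max_address` is a hypothetical helper name, not something from the course material, and it assumes the address fits in 64 bits.

```c
#include <stdint.h>

/* Highest address reachable with b address bits: 2^b - 1.
   Hypothetical helper for illustration; results are capped at 64 bits. */
uint64_t max_address(unsigned b)
{
    if (b >= 64)
        return UINT64_MAX;          /* 2^64 - 1; a 64-bit shift by 64 is undefined */
    return (UINT64_C(1) << b) - 1;  /* 2^b - 1 */
}
```

For example, `max_address(16)` gives 65535, so a 16-bit address covers locations 0 through 65535.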
  4. Shared Memory Multiprocessor System
     Natural way to extend the single processor model: have multiple processors connected to multiple memory modules, such that each processor can access any memory module.
     [Figure: processors connected through processor-memory interconnections to memory modules that form one address space.]
  5. Simplistic view of a small shared memory multiprocessor
     Examples:
     • Dual Pentiums
     • Quad Pentiums
     [Figure: processors and shared memory connected by a bus.]
  6. Real computer systems have cache memory between the main memory and the processors: Level 1 (L1) cache and Level 2 (L2) cache.
     Example: Quad Shared Memory Multiprocessor
     [Figure: four processors, each with its own L1 cache, L2 cache, and bus interface, on a processor/memory bus with a memory controller and shared memory.]
  7. “Recent” innovation
     • Dual-core and multi-core processors
     • Two or more independent processors in one package
     • Actually an old idea, but not put into wide practice until recently.
     • Since L1 cache is usually inside the package and L2 cache outside the package, dual-/multi-core processors usually share the L2 cache.
  8. Single quad core shared memory multiprocessor
     [Figure: a single chip holding four processors, each with its own L1 cache, and an L2 cache; a memory controller and shared memory.]
  9. Examples
     • Intel:
       – Core Duo processors: two processors in one package sharing a common L2 cache. 2005-2006
       – Intel Core 2 family dual cores, with quad core from Nov 2006 onwards
       – Core i7 processors replacing the Core 2 family: quad core, Nov 2008
       – Intel Teraflops Research Chip (Polaris), a 3.16 GHz, 80-core processor prototype.
     • Xbox 360 game console: triple core PowerPC microprocessor.
     • PlayStation 3 Cell processor: 9 core design.
     References and more information: Wikipedia
  10. Multiple quad-core multiprocessors (example: coit-grid05.uncc.edu)
      [Figure: eight processors, each with its own L1 cache, with L2 cache and a possible L3 cache, a memory controller, and shared memory.]
  11. Programming Shared Memory Multiprocessors
      Several possible ways:
      1. Thread libraries: the programmer decomposes the program into individual parallel sequences (threads), each able to access shared variables declared outside the threads. Example: Pthreads
      2. Higher level library functions and preprocessor compiler directives to declare shared variables and specify parallelism. Uses threads. Example: OpenMP, an industry standard. Consists of library functions, compiler directives, and environment variables; needs an OpenMP compiler.
  12. 3. Use a modified sequential programming language, with added syntax to declare shared variables and specify parallelism. Example: UPC (Unified Parallel C); needs a UPC compiler.
      4. Use a specially designed parallel programming language, with syntax to express parallelism. The compiler automatically creates executable code for each processor (not now common).
      5. Use a regular sequential programming language such as C and ask a parallelizing compiler to convert it into parallel executable code. Also not now common.
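As an illustration of approach 2 in the list above, here is a minimal OpenMP sketch in C. The function name `parallel_sum` and the summing task are assumptions chosen for the example; they do not come from the slides. The array is shared by all threads, and the `reduction` clause gives each thread a private partial sum that is combined when the loop ends.

```c
/* Sum an array with an OpenMP work-sharing loop.
   Hypothetical example function, not taken from the slides. */
long parallel_sum(const long *a, long n)
{
    long sum = 0;

    /* `a` is shared among the threads; `sum` is a reduction variable,
       so each thread adds into its own copy and the copies are
       combined at the end of the loop. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += a[i];

    return sum;
}
```

With GCC this would typically be built with `-fopenmp`. Without that flag the directive is ignored and the loop runs sequentially with the same result, which shows how the directive approach keeps the sequential program intact.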
  13. Message-Passing Multicomputer
      Complete computers connected through an interconnection network.
      [Figure: computers, each with a processor and local memory, exchanging messages through an interconnection network.]
  14. Interconnection Networks
      Many explored in the 1970s and 1980s
      • Limited and exhaustive interconnections
      • 2- and 3-dimensional meshes
      • Hypercube
      • Using switches:
        – Crossbar
        – Trees
        – Multistage interconnection networks
  15. Networked Computers as a Computing Platform
      • A network of computers became a very attractive alternative to expensive supercomputers and parallel computer systems for high-performance computing in the early 1990s.
      • Several early projects. Notable:
        – Berkeley NOW (network of workstations) project.
        – NASA Beowulf project.
  16. Key advantages:
      • Very high performance workstations and PCs readily available at low cost.
      • The latest processors can easily be incorporated into the system as they become available.
      • Existing software can be used or modified.
  17. Beowulf Clusters*
      • A group of interconnected “commodity” computers achieving high performance with low cost.
      • Typically using commodity interconnects (high speed Ethernet) and Linux OS.
      * Beowulf comes from the name given by the NASA Goddard Space Flight Center cluster project.
  18. Cluster Interconnects
      • Originally fast Ethernet on low cost clusters
      • Gigabit Ethernet: easy upgrade path
      More Specialized/Higher Performance
      • Myrinet: 2.4 Gbits/sec; disadvantage: single vendor
      • cLan
      • SCI (Scalable Coherent Interface)
      • QNet
      • InfiniBand: may be important, as InfiniBand interfaces may be integrated on next generation PCs
  19. Dedicated cluster with a master node and compute nodes
      [Figure: a user on an external network, a master node with an Ethernet interface, and a switch on a local network linking the compute nodes (computers) of the dedicated cluster.]
  20. Software Tools for Clusters
      • Based upon the message passing programming model
      • User-level libraries provided for explicitly specifying messages to be sent between executing processes on each computer.
      • Use with regular programming languages (C, C++, ...).
      • Can be quite difficult to program correctly, as we shall see.
  21. Next step
      • Learn the message passing programming model, some MPI routines, write a message-passing program and test on the cluster.
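To make the message passing model on the last slides concrete before any MPI routines are introduced, the sketch below uses plain POSIX `fork` and `pipe` on one machine as a stand-in for a library such as MPI. That substitution is an assumption made for illustration, and the names `remote_sum`, `send_all`, and `recv_all` are hypothetical. A master process sends an array to a worker process as explicit messages; the worker sums it and sends the result back.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write exactly len bytes, looping over short writes. Returns 0 on success. */
static int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t k = write(fd, p, len);
        if (k <= 0)
            return -1;
        p += k;
        len -= (size_t)k;
    }
    return 0;
}

/* Read exactly len bytes, looping over short reads. Returns 0 on success. */
static int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t k = read(fd, p, len);
        if (k <= 0)
            return -1;
        p += k;
        len -= (size_t)k;
    }
    return 0;
}

/* Hypothetical demo: the master sends n integers to a worker process,
   which replies with their sum. Returns 0 on success and stores the
   sum in *out. */
int remote_sum(const int *data, int n, long *out)
{
    int to_worker[2], to_master[2];
    if (pipe(to_worker) != 0 || pipe(to_master) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                     /* worker process */
        close(to_worker[1]);
        close(to_master[0]);
        int count, x;
        long sum = 0;
        /* The worker uses only what arrives in messages, as it would
           have to on a separate computer. */
        if (recv_all(to_worker[0], &count, sizeof count) != 0)
            _exit(1);
        for (int i = 0; i < count; i++) {
            if (recv_all(to_worker[0], &x, sizeof x) != 0)
                _exit(1);
            sum += x;
        }
        if (send_all(to_master[1], &sum, sizeof sum) != 0)
            _exit(1);
        _exit(0);
    }

    /* master process */
    close(to_worker[0]);
    close(to_master[1]);
    int ok = send_all(to_worker[1], &n, sizeof n) == 0
          && send_all(to_worker[1], data, (size_t)n * sizeof *data) == 0
          && recv_all(to_master[0], out, sizeof *out) == 0;
    close(to_worker[1]);
    close(to_master[0]);
    waitpid(pid, NULL, 0);              /* reap the worker */
    return ok ? 0 : -1;
}
```

Even in this small case the sender and receiver must agree on the order, size, and meaning of every message, which hints at why the slides call message-passing programs difficult to get right.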