Parallel programming 2

Transcript

  • 1. 1b.1 Types of Parallel Computers
    Two principal approaches:
    • Shared memory multiprocessor
    • Distributed memory multicomputer
    ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson, 2010. Aug 26, 2010
  • 2. 1b.2 Shared Memory Multiprocessor
  • 3. 1b.3 Conventional Computer
    Consists of a processor executing a program stored in a (main) memory.
    Each main memory location is identified by its address. Addresses start at 0 and extend to 2^b - 1 when there are b bits (binary digits) in the address.
    [Figure: main memory connected to the processor; instructions flow to the processor, data flows to or from the processor.]
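    As a worked example of the address range (not on the slide): with b = 32 address bits, addresses run from 0 to 2^32 - 1 = 4,294,967,295, i.e. about 4 billion byte-addressable locations. A minimal C snippet, with the value of b an arbitrary illustrative choice, computes the same bound:

        #include <stdio.h>

        int main(void) {
            unsigned b = 32;                          /* number of address bits */
            unsigned long long top = (1ULL << b) - 1; /* highest address: 2^b - 1 */
            printf("b = %u: addresses 0 .. %llu\n", b, top);
            return 0;
        }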
  • 4. 1b.4 Shared Memory Multiprocessor System
    Natural way to extend the single processor model - have multiple processors connected to multiple memory modules, such that each processor can access any memory module.
    [Figure: processors connected through processor-memory interconnections to memory modules, forming one address space.]
  • 5. 1b.5 Simplistic view of a small shared memory multiprocessor
    Examples:
    • Dual Pentiums
    • Quad Pentiums
    [Figure: processors connected to shared memory over a bus.]
  • 6. 1b.6 Real computer systems have cache memory between the main memory and the processors: Level 1 (L1) cache and Level 2 (L2) cache.
    Example: Quad Shared Memory Multiprocessor
    [Figure: four processors, each with an L1 cache, L2 cache, and bus interface, connected over a processor/memory bus to a memory controller and shared memory.]
  • 7. 1b.7 “Recent” innovation
    • Dual-core and multi-core processors
    • Two or more independent processors in one package
    • Actually an old idea, but not put into wide practice until recently.
    • Since the L1 cache is usually inside the package and the L2 cache outside the package, dual-/multi-core processors usually share the L2 cache.
  • 8. 1b.8 Single quad-core shared memory multiprocessor
    [Figure: one chip containing four processors, each with its own L1 cache; the cores share an L2 cache, and a memory controller connects the chip to the shared memory.]
  • 9. 1b.9 Examples
    • Intel:
      – Core Duo processors -- two processors in one package sharing a common L2 cache. 2005-2006
      – Intel Core 2 family dual cores, with quad core from Nov 2006 onwards
      – Core i7 processors replacing the Core 2 family -- quad core from Nov 2008
      – Intel Teraflops Research Chip (Polaris), a 3.16 GHz, 80-core processor prototype
    • Xbox 360 game console -- triple-core PowerPC microprocessor
    • PlayStation 3 Cell processor -- 9-core design
    References and more information -- Wikipedia
  • 10. 1b.10 Multiple quad-core multiprocessors (example: coit-grid05.uncc.edu)
    [Figure: two quad-core processors, each core with its own L1 cache and each chip with an L2 cache (and possible L3 cache), connected through a memory controller to shared memory.]
  • 11. 1b.11 Programming Shared Memory Multiprocessors
    Several possible ways:
    1. Thread libraries - the programmer decomposes the program into individual parallel sequences (threads), each being able to access shared variables declared outside the threads.
       Example: Pthreads (see the sketch after this list)
    2. Higher-level library functions and preprocessor compiler directives to declare shared variables and specify parallelism. Uses threads.
       Example: OpenMP - industry standard. Consists of library functions, compiler directives, and environment variables - needs an OpenMP compiler.
  • 12. 1b.12
    3. Use a modified sequential programming language -- added syntax to declare shared variables and specify parallelism.
       Example: UPC (Unified Parallel C) - needs a UPC compiler.
    4. Use a specially designed parallel programming language -- with syntax to express parallelism. The compiler automatically creates executable code for each processor (not now common).
    5. Use a regular sequential programming language such as C and ask a parallelizing compiler to convert it into parallel executable code. Also not now common.
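    A minimal sketch (not from the slides) of approach 1 in C with Pthreads: the array and the per-thread partial sums are shared variables declared outside the threads, as slide 11 describes. The array size, thread count, and the function name sum_part are illustrative assumptions; the OpenMP equivalent of approach 2 appears in the closing comment.

        #include <pthread.h>
        #include <stdio.h>

        #define N        1000000
        #define NTHREADS 4

        /* Shared variables: declared outside the threads, so every thread
           can access them (approach 1 on slide 11). */
        static double a[N];
        static double partial[NTHREADS];

        static void *sum_part(void *arg) {
            long id = (long)arg;                 /* thread index 0..NTHREADS-1 */
            long lo = id * (N / NTHREADS);
            long hi = (id == NTHREADS - 1) ? N : lo + N / NTHREADS;
            double s = 0.0;
            for (long i = lo; i < hi; i++)
                s += a[i];                       /* read the shared array */
            partial[id] = s;                     /* each thread writes only its own slot */
            return NULL;
        }

        int main(void) {
            for (long i = 0; i < N; i++)
                a[i] = 1.0;

            pthread_t t[NTHREADS];
            for (long id = 0; id < NTHREADS; id++)
                pthread_create(&t[id], NULL, sum_part, (void *)id);

            double total = 0.0;
            for (long id = 0; id < NTHREADS; id++) {
                pthread_join(t[id], NULL);       /* wait, then combine partial sums */
                total += partial[id];
            }
            printf("sum = %f\n", total);

            /* Approach 2 (OpenMP) would express the same computation with a
               directive instead of explicit thread management:
                   #pragma omp parallel for reduction(+:total)
                   for (long i = 0; i < N; i++) total += a[i];
            */
            return 0;
        }

    On typical systems this builds with gcc -pthread; an OpenMP version would instead need a flag such as -fopenmp, matching the slide's note that OpenMP needs compiler support.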
  • 13. 1b.13 Message-Passing Multicomputer
    Complete computers connected through an interconnection network:
    [Figure: computers, each with a processor and local memory, exchanging messages across an interconnection network.]
  • 14. 1b.14 Interconnection Networks
    Many explored in the 1970s and 1980s:
    • Limited and exhaustive interconnections
    • 2- and 3-dimensional meshes
    • Hypercube (see the neighbor-enumeration sketch after this list)
    • Using switches:
      – Crossbar
      – Trees
      – Multistage interconnection networks
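    For the hypercube entry above, a small illustrative sketch (the dimension 3 is an arbitrary assumption): in a d-dimensional hypercube each of the 2^d nodes is linked to the d nodes whose binary labels differ from its own in exactly one bit, so flipping each label bit in turn enumerates a node's neighbors.

        #include <stdio.h>

        int main(void) {
            int d = 3;                  /* assumed hypercube dimension */
            int nodes = 1 << d;         /* 2^d nodes in total */

            for (int node = 0; node < nodes; node++) {
                printf("node %d neighbors:", node);
                /* Flipping each of the d label bits yields the d neighbors. */
                for (int bit = 0; bit < d; bit++)
                    printf(" %d", node ^ (1 << bit));
                printf("\n");
            }
            return 0;
        }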
  • 15. 1b.15 Networked Computers as a Computing Platform
    • A network of computers became a very attractive alternative to expensive supercomputers and parallel computer systems for high-performance computing in the early 1990s.
    • Several early projects. Notable:
      – Berkeley NOW (network of workstations) project
      – NASA Beowulf project
  • 16. 1b.16 Key advantages:
    • Very high performance workstations and PCs readily available at low cost.
    • The latest processors can easily be incorporated into the system as they become available.
    • Existing software can be used or modified.
  • 17. 1b.17 Beowulf Clusters*
    • A group of interconnected “commodity” computers achieving high performance with low cost.
    • Typically using commodity interconnects - high-speed Ethernet - and the Linux OS.
    * Beowulf comes from the name given to the NASA Goddard Space Flight Center cluster project.
  • 18. 1b.18 Cluster Interconnects
    • Originally fast Ethernet on low-cost clusters
    • Gigabit Ethernet - easy upgrade path
    More specialized/higher performance:
    • Myrinet - 2.4 Gbits/sec - disadvantage: single vendor
    • cLAN
    • SCI (Scalable Coherent Interface)
    • QsNet
    • InfiniBand - may be important as InfiniBand interfaces may be integrated on next-generation PCs
  • 19. 1b.19 Dedicated cluster with a master node and compute nodes
    [Figure: the user reaches the master node over an external network; the master node connects through an Ethernet interface and a switch to the compute nodes on a dedicated local network.]
  • 20. 1b.20 Software Tools for Clusters
    • Based upon the message-passing programming model.
    • User-level libraries provided for explicitly specifying messages to be sent between executing processes on each computer (see the sketch below).
    • Used with regular programming languages (C, C++, ...).
    • Can be quite difficult to program correctly, as we shall see.
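    As a hedged preview of such a user-level library - MPI, which the next slide names - here is a minimal C sketch of the message-passing style. Only standard MPI calls are used; the payload values and the message tag 0 are arbitrary illustrative choices.

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char *argv[]) {
            int rank, nprocs;

            MPI_Init(&argc, &argv);                  /* start the MPI runtime */
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id */
            MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* total number of processes */

            if (rank == 0) {
                /* Master: explicitly send one integer to every other process. */
                for (int dest = 1; dest < nprocs; dest++) {
                    int msg = 100 + dest;            /* illustrative payload */
                    MPI_Send(&msg, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
                }
            } else {
                /* Compute node: receive the integer from the master. */
                int msg;
                MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("process %d of %d received %d\n", rank, nprocs, msg);
            }

            MPI_Finalize();
            return 0;
        }

    On typical installations this compiles with mpicc and runs with, e.g., mpirun -np 4 ./a.out.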
  • 21. 1b.21 Next step
    Learn the message-passing programming model and some MPI routines; write a message-passing program and test it on the cluster.
