1b.1
Types of Parallel Computers
Two principal approaches:
• Shared memory multiprocessor
• Distributed memory multicomputer
ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson, 2010. Aug 26, 2010
1b.2
Shared Memory
Multiprocessor
1b.3
Conventional Computer
Consists of a processor executing a program stored in a
(main) memory:
Each main memory location is identified by its address.
Addresses start at 0 and extend to 2^b - 1 when there are
b bits (binary digits) in the address.
[Figure: a processor connected to main memory; instructions flow to the processor and data flows to or from the processor.]
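As a quick check of the address arithmetic on this slide, the following minimal Python sketch computes the lowest and highest address for a given number of address bits. The function name `address_range` is an illustrative assumption and does not come from the course material.

```python
def address_range(b: int) -> tuple:
    """Return (lowest, highest) address for a b-bit address."""
    if b < 1:
        raise ValueError("an address needs at least one bit")
    # b bits can encode 2**b distinct values, numbered from 0.
    return 0, 2**b - 1


if __name__ == "__main__":
    for bits in (16, 32):
        low, high = address_range(bits)
        print(f"{bits}-bit address: {low} .. {high} ({high + 1} locations)")
```

For example, a 16-bit address covers locations 0 through 65535.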
1b.4
Shared Memory Multiprocessor System
Natural way to extend single processor model - have multiple
processors connected to multiple memory modules, such that
each processor can access any memory module:
[Figure: processors connected through processor-memory interconnections to memory modules that form one address space.]
1b.5
Simplistic view of a small shared memory
multiprocessor
Examples:
• Dual Pentiums
• Quad Pentiums
[Figure: processors connected to shared memory by a bus.]
1b.6
Real computer systems have cache memory between the main
memory and the processors: Level 1 (L1) cache and Level 2 (L2) cache.
Example Quad Shared Memory Multiprocessor
[Figure: four processors, each with its own L1 cache, L2 cache, and bus interface, connected by a processor/memory bus to a memory controller and the shared memory.]
1b.7
“Recent” innovation
• Dual-core and multi-core processors
• Two or more independent processors in one
package
• Actually an old idea but not put into wide practice
until recently.
• Since L1 cache is usually inside package and L2
cache outside package, dual-/multi-core processors
usually share L2 cache.
1b.8
Single quad core shared memory
multiprocessor
[Figure: four processors on one chip, each with its own L1 cache, sharing an L2 cache, with a memory controller and shared memory.]
1b.9
Examples
• Intel:
– Core Duo processors -- two processors in one package
sharing a common L2 cache. 2005-2006
– Intel Core 2 family dual cores, with quad core from Nov
2006 onwards
– Core i7 processors replacing Core 2 family - quad core,
Nov 2008
– Intel Teraflops Research Chip (Polaris), a 3.16 GHz,
80-core processor prototype.
• Xbox 360 game console -- triple-core PowerPC
microprocessor.
• PlayStation 3 Cell processor -- 9-core design.
References and more information -- Wikipedia
1b.10
Multiple quad-core multiprocessors
(example coit-grid05.uncc.edu)
[Figure: eight processors, each with its own L1 cache, with L2 cache and a possible L3 cache, a memory controller, and shared memory.]
1b.11
Programming Shared Memory
Multiprocessors
Several possible ways
1. Thread libraries - programmer decomposes program into
individual parallel sequences (threads), each being able
to access shared variables declared outside threads.
Example: Pthreads
2. Higher level library functions and preprocessor compiler
directives to declare shared variables and specify
parallelism. Uses threads.
Example OpenMP - industry standard. Consists of
library functions, compiler directives, and environment
variables - needs OpenMP compiler
1b.12
3. Use a modified sequential programming language -- added
syntax to declare shared variables and specify parallelism.
Example UPC (Unified Parallel C) - needs a UPC
compiler.
4. Use a specially designed parallel programming language --
with syntax to express parallelism. Compiler automatically
creates executable code for each processor (not common
now).
5. Use a regular sequential programming language such as C
and ask a parallelizing compiler to convert it into parallel
executable code. Also not common now.
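The thread-library approach (item 1) can be sketched as follows. This is a minimal illustration that uses Python's standard `threading` module in place of Pthreads or OpenMP; the names `total`, `add_slice`, and `parallel_sum` are assumptions made for the example. The shared variable is declared outside the threads, and a lock guards updates to it.

```python
import threading

# Shared variables are declared outside the threads, as in approach 1.
total = 0
total_lock = threading.Lock()


def add_slice(values):
    """Sum one slice locally, then add the result to the shared total."""
    global total
    partial = sum(values)
    with total_lock:  # protect the shared variable from concurrent updates
        total += partial


def parallel_sum(data, num_threads=4):
    """Decompose the sum of `data` across `num_threads` threads."""
    global total
    total = 0
    chunk = (len(data) + num_threads - 1) // num_threads  # ceiling division
    threads = [
        threading.Thread(target=add_slice, args=(data[i * chunk:(i + 1) * chunk],))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every thread has updated the shared total
    return total
```

In standard CPython builds the global interpreter lock usually prevents threads from executing Python bytecode at the same time, so this sketch shows the programming model (decomposition, shared variables, locking) rather than a speedup.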
1b.13
Message-Passing Multicomputer
Complete computers connected through an
interconnection network:
[Figure: computers, each with a processor and local memory, exchanging messages through an interconnection network.]
1b.14
Interconnection Networks
Many explored in the 1970s and 1980s
• Limited and exhaustive interconnections
• 2- and 3-dimensional meshes
• Hypercube
• Using Switches:
– Crossbar
– Trees
– Multistage interconnection networks
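To make two of the listed topologies concrete, the sketch below computes the direct neighbors of a node in a hypercube and in a 2-dimensional mesh. It assumes the usual labeling conventions: hypercube nodes are numbered in binary and linked when their labels differ in exactly one bit, and the mesh has no wraparound links. The function names are illustrative only.

```python
def hypercube_neighbors(node: int, dimensions: int) -> list:
    """Neighbors of `node` in a hypercube with 2**dimensions nodes.

    Two nodes are linked when their binary labels differ in exactly one bit.
    """
    if not 0 <= node < 2**dimensions:
        raise ValueError("node label out of range")
    # Flipping bit k of the label gives the neighbor along dimension k.
    return [node ^ (1 << k) for k in range(dimensions)]


def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> list:
    """Neighbors of (row, col) in a 2-dimensional mesh without wraparound."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    # Keep only positions that fall inside the mesh.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

In a 3-dimensional hypercube every node has exactly three links, whereas a corner node of a mesh has only two.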
1b.15
Networked Computers as a
Computing Platform
• A network of computers became a very attractive
alternative to expensive supercomputers and
parallel computer systems for high-performance
computing in the early 1990s.
• Several early projects. Notable:
– Berkeley NOW (network of workstations)
project.
– NASA Beowulf project.
1b.16
Key advantages:
• Very high performance workstations and PCs
readily available at low cost.
• The latest processors can easily be
incorporated into the system as they become
available.
• Existing software can be used or modified.
1b.17
Beowulf Clusters*
• A group of interconnected “commodity”
computers achieving high performance with
low cost.
• Typically using commodity interconnects -
high speed Ethernet, and Linux OS.
* The name Beowulf comes from the NASA Goddard
Space Flight Center cluster project.
1b.18
Cluster Interconnects
• Originally fast Ethernet on low cost clusters
• Gigabit Ethernet - easy upgrade path
More Specialized/Higher Performance
• Myrinet - 2.4 Gbits/sec - disadvantage: single vendor
• cLan
• SCI (Scalable Coherent Interface)
• QNet
• InfiniBand - may be important as InfiniBand
interfaces may be integrated on next generation PCs
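A rough way to compare these interconnects is the ideal time to move one message at the nominal link rate. The sketch below assumes the standard nominal rates of 100 Mbits/sec for Fast Ethernet and 1 Gbit/sec for Gigabit Ethernet, uses the 2.4 Gbits/sec figure quoted above for Myrinet, and ignores latency and protocol overhead, so real transfers will be slower. The names `LINK_RATES` and `ideal_transfer_time` are illustrative.

```python
# Nominal link rates in bits per second.
LINK_RATES = {
    "Fast Ethernet": 100e6,
    "Gigabit Ethernet": 1e9,
    "Myrinet": 2.4e9,  # figure quoted in the list above
}


def ideal_transfer_time(message_bytes: int, bits_per_second: float) -> float:
    """Seconds to move a message at the nominal rate, ignoring latency and overhead."""
    return message_bytes * 8 / bits_per_second


if __name__ == "__main__":
    for name, rate in LINK_RATES.items():
        ms = ideal_transfer_time(1_000_000, rate) * 1e3
        print(f"{name}: {ms:.2f} ms for a 1 MB message")
```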
1b.19
Dedicated cluster with a master node
and compute nodes
[Figure: a dedicated cluster; the user reaches the master node over an external network, and the master node connects through an Ethernet interface and a switch on a local network to the compute nodes (computers).]
1b.20
Software Tools for Clusters
• Based upon message passing programming model
• User-level libraries provided for explicitly specifying
messages to be sent between executing processes on
each computer.
• Use with regular programming languages (C, C++, ...).
• Can be quite difficult to program correctly as we shall
see.
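The message-passing model can be previewed with a small simulation. This sketch is not MPI: it assumes Python threads standing in for separate computers and `queue.Queue` objects standing in for the network, and the names `worker` and `master` are hypothetical. What it keeps from the model is that each worker holds only its own data and that all cooperation happens through explicit sends and receives.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """A simulated compute node: private data, communicates only by messages."""
    numbers = inbox.get()             # receive work from the master
    outbox.put((rank, sum(numbers)))  # send the partial result back


def master(data, num_workers=3):
    """Send one slice to each worker, then collect the partial sums."""
    to_workers = [queue.Queue() for _ in range(num_workers)]
    to_master = queue.Queue()
    nodes = [
        threading.Thread(target=worker, args=(rank, to_workers[rank], to_master))
        for rank in range(num_workers)
    ]
    for node in nodes:
        node.start()
    chunk = (len(data) + num_workers - 1) // num_workers  # ceiling division
    for rank in range(num_workers):
        # Explicit send: each worker gets only its own slice.
        to_workers[rank].put(data[rank * chunk:(rank + 1) * chunk])
    # Explicit receives: one (rank, partial sum) message per worker.
    partials = dict(to_master.get() for _ in range(num_workers))
    for node in nodes:
        node.join()
    return sum(partials.values())
```

Even in this small case the programmer must match every send with a receive; a worker waiting on a message that is never sent would block forever, which hints at why such programs are hard to get right.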
Next step
• Learn the message passing
programming model, some MPI
routines, write a message-passing
program and test on the cluster.
1b.21
