1
Introduction
Parallel and Distributed Computing
Lecture 1-2 / 18
High Performance Computing most
generally refers to the practice of
aggregating computing power in a
way that delivers much higher
performance than one could get out
of a typical desktop computer or
workstation in order to solve large
problems in science, engineering, or
business.
2
Topics
 Introduction - today’s lecture
 System Architectures (Single Instruction - Single Data, Single Instruction -
Multiple Data, Multiple Instruction - Multiple Data, Shared Memory, Distributed
Memory, Cluster, Multiple Instruction - Single Data)
 Performance Analysis of parallel calculations (speedup, efficiency, algorithm
execution time…); the basic definitions appear below
 Parallel numerical methods (Principles of Parallel Algorithm Design, Analytical
Modeling of Parallel Programs, Matrix Operations, Matrix-Vector Operations,
Graph Algorithms…)
 Software (Programming Using the Message-Passing Interface, OpenMP, CUDA,
CORBA…)
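Since the performance-analysis topic above recurs throughout the course, the two standard quantities are worth stating right away (a later lecture treats them in detail):

S_p = T_1 / T_p    speedup on p processors, where T_1 is the serial execution time and T_p the parallel execution time
E_p = S_p / p      efficiency, the fraction of ideal (linear) speedup actually achieved

For example, a program that takes 100 s serially and 25 s on 8 processors has speedup S_8 = 4 and efficiency E_8 = 0.5.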
3
What will you learn today?
 Why Use Parallel Computing?
 Motivation for parallelism (Moore’s law)
 What is traditional programming view?
 What is parallel computing?
 What is distributed computing?
 Concepts and terminology
von Neumann Computer Architecture
4
What is traditional programming view?
 Von Neumann View
- Program Counter + Registers = Thread/process
- Sequential change of Machine state
 Comprised of four main components:
 Memory
 Control Unit
 Arithmetic Logic Unit
 Input/Output
5
Von Neumann Architecture
 Read/write, random access memory is used to store both program
instructions and data
 Program instructions tell the computer to do something
 Data is simply information to be used by the program
 Control unit fetches instructions/data from memory, decodes the
instructions and then sequentially coordinates operations to accomplish
the programmed task.
 Arithmetic Logic Unit performs basic arithmetic and logical operations.
 Input/Output is the interface to the human operator.
6
All of the algorithms we’ve seen so far are
sequential:
• They have one “thread” of execution
• One step follows another in sequence
• One processor is all that is needed to run
the algorithm
Traditional (sequential) Processing View
7
The Computational Power Argument
Moore's law states [1965]:
2X transistors/chip every 1.5 or 2 years
Microprocessors have become smaller,
denser, and more powerful.
Gordon Moore is a co-founder of Intel.
What problems are there ?
With the increased use of computers in every sphere of
human activity, computer scientists are faced with two
crucial issues today:
 Processing has to be done faster than ever before
 Larger or complex computation problems need to
be solved
8
What problems are there ?
 Increasing the number of transistors as per Moore’s
Law isn’t a solution, as it also increases the power
consumption.
 Power consumption causes processor
overheating problems…
The perfect solution is PARALLELISM - in
hardware as well as software.
9
What is PARALLELISM ?
PARALLELISM is a form of computation in
which many instructions are carried out
simultaneously, operating on the principle that
large problems can often be divided into smaller
ones, which are then solved concurrently (in
parallel).
10
12
Why Use PARALLELISM ?
Save time and/or money
Example: parallel clusters can be built from cheap, commodity components.
Solve larger problems
Example: many problems are so large and/or complex that it is impractical or
impossible to solve them on a single computer, especially given limited
computer memory.
Provide concurrency
Example: multiple computing resources can do many things simultaneously.
Use of non-local resources
Limits to serial computing: available memory and performance
We can run…
 Larger problems
 Faster
 More cases
Parallel programming view
 Parallel computing is a form of computation in which many
calculations are carried out simultaneously.
 In the simplest sense, it is the simultaneous use of multiple
compute resources to solve a computational problem:
 1. To be run using multiple CPUs
 2. A problem is broken into discrete parts that can be solved
concurrently
 3. Each part is further broken down into a series of instructions
 4. Instructions from each part execute simultaneously on
different CPUs (see the sketch below)
13
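As a minimal illustration of steps 1 to 4 above, here is a hedged C/OpenMP sketch (the array, its size and the initialisation are invented for the example): the problem of summing an array is broken into chunks whose iterations execute simultaneously on different CPU cores.

#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N];          /* the "problem": data to be summed */
    double sum = 0.0;

    for (int i = 0; i < N; i++)  /* serial set-up of some data */
        a[i] = 0.5 * i;

    /* The loop is split into discrete parts; each thread executes its part
       simultaneously, and the partial sums are combined at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f using up to %d threads\n", sum, omp_get_max_threads());
    return 0;
}

Compile with something like gcc -fopenmp (assuming a compiler with OpenMP support); without the pragma, the same code runs sequentially on one core.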
Parallel / Distributed ( Cluster) / Grid
Computing
 Parallel computing: use of multiple computers or processors working
together on a common task.
 Each processor works on its section of the problem.
 Processors can exchange information
 Distributed (cluster) computing is where several different computers
(processing elements) work separately and are connected by a network.
 Distributed computers are highly scalable.
 Grid Computing makes use of computers communicating over the
Internet to work on a given problem (so that, in some respects, they can be
regarded as a single computer).
14
Parallel Computer Memory
Architectures
 Shared Memory
 Uniform Memory Access (UMA)
 Non-Uniform Memory Access (NUMA)
 Distributed Memory
 Hybrid Distributed-Shared Memory
18
Shared Memory
General Characteristics:
• Shared memory parallel computers vary widely, but generally have in common
the ability for all processors to access all memory as global address space.
• Multiple processors can operate independently but share the same memory
resources.
• Changes in a memory location effected by one processor are visible to all
other processors.
• Shared memory machines can be divided into two main classes based upon
memory access times: UMA and NUMA.
19
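A small hedged sketch of the shared-memory behaviour described above, again in C with OpenMP (the counter variable is invented): every thread reads and writes the same global address space, so an update made by one thread is visible to all the others.

#include <stdio.h>
#include <omp.h>

int shared_counter = 0;          /* one copy, in the single global address space */

int main(void) {
    #pragma omp parallel
    {
        /* Threads operate independently but share the same memory resource;
           the atomic directive prevents concurrent increments from being lost. */
        #pragma omp atomic
        shared_counter++;
    }
    /* Every thread's update is visible here, after the parallel region. */
    printf("counter = %d (one increment per thread)\n", shared_counter);
    return 0;
}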
Shared Memory (UMA)
20
Uniform Memory Access (UMA):
 Identical processors, Symmetric Multiprocessor
(SMP)
 Equal access and access times to memory
 Sometimes called CC-UMA - Cache Coherent
UMA.
Cache coherent means if one processor updates
a location in shared memory, all the other
processors know about the update.
Shared Memory (NUMA)
Non-Uniform Memory Access (NUMA):
 Often made by physically linking two or more SMPs
 One SMP can directly access memory of another SMP
 Not all processors have equal access time to all memories
 Memory access across link is slower
 If cache coherency is maintained, then may also be called CC-
NUMA - Cache Coherent NUMA
21
Distributed Memory
Distributed memory systems require a communication network to connect inter-
processor memory.
 Processors have their own local memory.
 Because each processor has its own local memory, it operates independently.
Hence, the concept of cache coherency does not apply.
 When a processor needs access to data in another processor’s memory, it is
usually the task of the programmer to explicitly define how and when data is
communicated (see the MPI sketch below).
22
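Because there is no global address space, data movement is explicit, as the last bullet above says. A minimal hedged MPI sketch in C (the value being sent is arbitrary): rank 0 copies a value from its local memory into rank 1's local memory over the network.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                               /* exists only in rank 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Rank 1 cannot see rank 0's memory; it must receive an explicit copy. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}

Run with at least two processes, e.g. mpirun -np 2 ./a.out, assuming an MPI implementation such as MPICH or Open MPI is installed.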
Hybrid Distributed-Shared Memory
 The largest and fastest computers in the world today employ both shared and
distributed memory architectures.
 The shared memory component is usually a cache coherent SMP machine.
Processors on a given SMP can address that machine’s memory as global.
 The distributed memory component is the networking of multiple SMPs. SMPs
know only about their own memory - not the memory on another SMP.
Therefore, network communications are required to move data from one SMP
to another.
23
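This hybrid model is usually programmed with MPI between the SMP nodes and OpenMP threads within each node. A hedged C skeleton (it performs no real work; it only shows the structure):

#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);                  /* one MPI process per SMP node: the distributed part */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel                     /* threads share that node's memory: the shared part */
    printf("MPI rank %d, OpenMP thread %d\n", rank, omp_get_thread_num());

    /* Any data that must cross node boundaries would be moved with MPI calls here. */
    MPI_Finalize();
    return 0;
}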
Key Difference Between Data And Task
Parallelism
Data Parallelism
 The same operation is applied
concurrently to different parts of
the data: threads (processes)
work on sub-parts of a single
task.
 A task ‘A’ is divided into
sub-parts and then
processed.
24
Task Parallelism
 Different tasks (threads,
processes or instruction streams)
each perform their own operation
concurrently.
 A task ‘A’ and a task ‘B’ are
processed separately by
different processors
(both forms are contrasted in the sketch below).
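To make the contrast concrete, a hedged C/OpenMP sketch (the functions task_A and task_B and the array are invented): the first region is data parallel, one operation split over sub-parts of the data, while the second is task parallel, two different tasks running on different threads.

#include <stdio.h>
#include <omp.h>

#define N 1000

static double a[N];

static void task_A(void) { puts("task A running"); }   /* placeholder tasks */
static void task_B(void) { puts("task B running"); }

int main(void) {
    /* Data parallelism: the SAME operation applied to different sub-parts of a[]. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * i;

    /* Task parallelism: DIFFERENT tasks A and B run on different threads. */
    #pragma omp parallel sections
    {
        #pragma omp section
        task_A();
        #pragma omp section
        task_B();
    }
    return 0;
}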
Implementation Of Parallel Computing
In Software
 When implemented in software (or rather, in algorithms), it is
called ‘parallel programming’.
 An algorithm is split into pieces and then executed, as seen
earlier.
 Important Points In Parallel Programming
 Dependencies - a typical scenario is when line 6 of an algorithm
depends on lines 2, 3, 4 and 5 (illustrated in the sketch below).
 Application Checkpoints - saving the state of the computation, like creating
a backup point.
 Automatic Parallelisation - identifying dependencies and parallelising
algorithms automatically; this has achieved only limited success.
25
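A hedged C illustration of the dependency point above (the loop bodies are invented): the first loop carries a dependency from one iteration to the next and cannot simply be run in parallel as written, while the second has independent iterations and can be.

#include <stdio.h>

#define N 1000

int main(void) {
    static double x[N], y[N];
    x[0] = 1.0;

    /* Loop-carried dependency: iteration i needs the result of iteration i-1,
       so the iterations cannot execute simultaneously as written. */
    for (int i = 1; i < N; i++)
        x[i] = x[i - 1] + 1.0;

    /* Independent iterations: each y[i] depends only on x[i], so a compiler or
       an OpenMP "parallel for" may safely split this loop across threads. */
    for (int i = 0; i < N; i++)
        y[i] = 2.0 * x[i];

    printf("x[%d] = %f, y[%d] = %f\n", N - 1, x[N - 1], N - 1, y[N - 1]);
    return 0;
}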
Implementation Of Parallel Computing
In Hardware
 When implemented in hardware, it is called ‘parallel
processing’.
 Typically, a chunk of the workload is divided up for
processing by units such as cores, processors, CPUs, etc.
26
27
Who is doing Parallel Computing?
What are they using it for?
Physics is parallel.
The Human World is parallel too!
Sequence is unusual
Computer programs = models, distributed processes, increasingly parallel
28
Application Examples with
Massive Parallelism
Artificial Intelligence and Automation
AI is the intelligence exhibited by machines or software.
AI systems require large amounts of computation, and parallel
computing is widely used for:
1. Image processing
2. Expert Systems
3. Natural Language Processing (NLP)
4. Pattern Recognition
29
Application Examples with
Massive Parallelism
Genetic Engineering
Many of these analyses produce huge amounts of
information that is difficult to handle on single
processing units, which is why parallel processing
algorithms are used.
30
Application Examples with
Massive Parallelism
Medical Applications
 Parallel computing is used in medical image processing
 Used for scanning the human body and the human brain
 Used in MRI reconstruction
 Used for vertebra detection and segmentation in X-ray
images
 Used for brain fiber tracking
31
Impediments to Parallel Computing
 Algorithm development is harder
—complexity of specifying and coordinating concurrent activities
 Software development is much harder
—lack of standardized & effective development tools, programming models,
and environments
 Rapid changes in computer system architecture
—today’s hot parallel algorithm may not be suitable for tomorrow’s parallel
computer!
32
Next lecture overview
 Some General Parallel Terminology
 Flynn's Taxonomy
33
Test questions
 What is traditional programming view?
 Who is doing Parallel Computing?
 What are they using it for?
 Types of Parallel Computer Hardware.
34
Textbooks
• Course Textbook:
1. Ananth Grama, George Karypis, Vipin Kumar, Anshul
Gupta “Introduction to Parallel Computing” (2nd Edition)
2. Marc Snir, William Gropp “MPI: The Complete
Reference (2-volume set)”
3. Victor Eijkhout with Edmond Chow, Robert van de
Geijn “Introduction to High-Performance Scientific
Computing”
• Reserve Texts (recommended that you look at periodically)


Editor's Notes

  1. The term parallel computation is generally applied to any data processing, in which several computer operations can be executed simultaneously. Achieving parallelism is only possible if the following requirements to architectural principles of computer systems are met: independent functioning of separate computer devices – this requirement equally concerns all the main computer system components - processors, storage devices, input/output devices; redundancy of computer system elements – redundancy can be realized in the following basic forms: use of specialized devices such as separate processors for integer and real valued arithmetic, multilevel memory devices (registers, cache); duplication of computer devices by means of using separate processors of the same type or several RAM devices, etc. Processor pipelines may be an additional form of achieving parallelism when carrying out operations in the devices is represented as executing a sequence of subcommands which constitute an operation. As a result, when such devices are engaged in computation several different data elements may be at different processing stages simultaneously. Possible ways of achieving parallelism are discussed in detail in Patterson and Hennessy (1996), Culler and Singh (1998); the same works describe the history of parallel computations and give particular examples of parallel computers (see also Xu and Hwang (1998), Culler, Singh and Gupta (1998) Buyya (1999)). Considering the problems of parallel computations one should distinguish the following modes of independent program parts execution: Multitasking (time shared) mode. In multitasking mode a single processor is used for carrying out processes. This mode is pseudo-parallel when only one process is active (is being carried out) while the other processes are in the stand-by mode queuing to use the processor. The use of time shared mode can make computations more efficient (e.g. if one of the processes can not be carried out because the data input is expected, the processor can be used for carrying out the process ready for execution - see Tanenbaum (2001)). Such parallel computation effects as the necessity of processes mutual exclusion and synchronization etc also manifest themselves in this mode and as a result this mode can be used for initial preparation of parallel programs; Parallel execution. In case of parallel execution several instructions of data processing can be carried out simultaneously. This computational mode can be provided not only if several processors are available but also by means of pipeline and vector processing devices; Distributed computations. This term is used to denote parallel data processing which involves the use of several processing devices located at a distance from each other. As the data transmission through communication lines among the processing devices leads to considerable time delays, efficient data processing in this computational mode is possible only for parallel algorithms with low intensity of interprocessor data transmission streams. The above mentioned conditions are typical of the computations in multicomputer systems which are created when several separate computers are connected by LAN or WAN communication channels.
  2. Traditionally, software has been written for serial computation: To be run on a single computer having a single Central Processing Unit (CPU); 2. A problem is broken into a discrete series of instructions. 3. Instructions are executed one after another. 4. Only one instruction may execute at any moment in time.
  3. Moore’s law is the main motivation for the move to parallel processing.
  4. compare
  5. Parallel computing allows: Solve problems that don’t fit on a single CPU’s memory space Solve problems that can’t be solved in a reasonable time
  6. The diversity of parallel computing systems is immense. In a sense each system is unique – each systems use various types of hardware: processors (Intel, IBM, AMD, HP, NEC, Cray, …), interconnection networks (Ethernet, Myrinet, Infiniband, SCI, …). They operate under various operating systems (Unix/Linux versions, Windows , …) and they use different software. It may seem impossible to find something common for all these system types. But it is not so. Later we will try to formulate some well-known variants of parallel computer systems classifications, but before that we will analyze some examples.
  7. A cluster is a group of loosely coupled computers that work together closely, so that in some respects they can be regarded as a single computer. Clusters are composed of multiple standalone machines connected by a network. While machines in a cluster do not have to be symmetric, load balancing is more difficult if they are not. The most common type of cluster is the Beowulf cluster, which is a cluster implemented on multiple identical commercial off-the-shelf computers connected with a TCP/IP Ethernet local area network. Grid computing is the most distributed form of parallel computing. It makes use of computers communicating over the Internet to work on a given problem. Because of the low bandwidth and extremely high latency available on the Internet, grid computing typically deals only with embarrassingly parallel problems. Most grid computing applications use middleware, software that sits between the operating system and the application to manage network resources and standardize the software interface.
  8. There is no such thing as "multiprocessor" or "multicore" programming. The distinction between "multiprocessor" and "multicore" computers is probably not relevant to you as an application programmer; it has to do with subtleties of how the cores share access to memory. In order to take advantage of a multicore (or multiprocessor) computer, you need a program written in such a way that it can be run in parallel, and a runtime that will allow the program to actually be executed in parallel on multiple cores (and operating system, although any operating system you can run on your PC will do this). This is really parallel programming, although there are different approaches to parallel programming. A multicore processor is a processor that includes multiple execution units ("cores") on the same chip. These processors differ from superscalar processors, which can issue multiple instructions per cycle from one instruction stream (thread); by contrast, a multicore processor can issue multiple instructions per cycle from multiple instruction streams. Each core in a multicore processor can potentially be superscalar as well—that is, on every cycle, each core can issue multiple instructions from one instruction stream. Parallel computers can be roughly classified according to the level at which the hardware supports parallelism—with multi-core and multi-processor computers having multiple processing elements within a single machine,
  9. Parallel computers can be roughly classified according to the level at which the hardware supports parallelism—with multi-core and multi-processor computers having multiple processing elements within a single machine, while clusters, MPPs, and grids use multiple computers to work on the same task. Specialized parallel computer architectures are sometimes used alongside traditional processors, for accelerating specific tasks. Parallel computer programs are more difficult to write than sequential ones,[5] because concurrency introduces several new classes of potential software bugs, of which race conditions are the most common. Communication and synchronization between the different subtasks are typically one of the greatest obstacles to getting good parallel program performance. A cluster is a group of loosely coupled computers that work together closely, so that in some respects they can be regarded as a single computer.[26] Clusters are composed of multiple standalone machines connected by a network. While machines in a cluster do not have to be symmetric, load balancing is more difficult if they are not. The most common type of cluster is the Beowulf cluster, which is a cluster implemented on multiple identical commercial off-the-shelf computers connected with a TCP/IP Ethernet local area network.[27] Beowulf technology was originally developed by Thomas Sterling and Donald Becker. The vast majority of the TOP500 supercomputers are clusters.[28] A cluster is group of computers connected in a local area network (LAN). A cluster is able to function as a unified computational resource. It implies higher reliability and efficiency than an LAN as well as a considerably lower cost in comparison to the other parallel computing system types (due to the use of standard hardware and software solutions). The beginning of cluster era was signified by the first project with the primary purpose of establishing connection among computers - ARPANET2 project. That was the period when the first principles were formulated which proved to be fundamental. Those principles later lead to the creation of local and global computational networks and of course to the creation of world wide computer network, the Internet. It’s true however that the first cluster appeared more than 20 years later. Those years were marked by a giant breakthrough in hardware development, the emergence of microprocessors and PCs which conquered the market, the accretion of parallel programming concepts and techniques, which eventually lead to the solution to the age-long problem, the problem of each parallel computational facility unicity which was the development of standards for the creation of parallel programs for systems with shared and distributed memory. In addition to that the available solutions in the area of highly efficient systems were very expensive at that time as they implied the use of high performance and specific components. The constant improvement of PC cost/performance ratio should also be taken into consideration. In the light of all those facts the emergence of clusters was inevitable.
  10. the problems of parallel computations one should distinguish the following modes of independent program parts execution: Multitasking (time shared) mode. In multitasking mode a single processor is used for carrying out processes. This mode is pseudo-parallel when only one process is active (is being carried out) while the other processes are in the stand-by mode queuing to use the processor. The use of time shared mode can make computations more efficient (e.g. if one of the processes can not be carried out because the data input is expected, the processor can be used for carrying out the process ready for execution - see Tanenbaum (2001)). Such parallel computation effects as the necessity of processes mutual exclusion and synchronization etc also manifest themselves in this mode and as a result this mode can be used for initial preparation of parallel programs; Parallel execution. In case of parallel execution several instructions of data processing can be carried out simultaneously. This computational mode can be provided not only if several processors are available but also by means of pipeline and vector processing devices; Distributed computations. This term is used to denote parallel data processing which involves the use of several processing devices located at a distance from each other. As the data transmission through communication lines among the processing devices leads to considerable time delays, efficient data processing in this computational mode is possible only for parallel algorithms with low intensity of interprocessor data transmission streams. The above mentioned conditions are typical of the computations in multicomputer systems which are created when several separate computers are connected by LAN or WAN communication channels.
  11. Classes of parallel computers: Parallel computers can be roughly classified according to the level at which the hardware supports parallelism. This classification is broadly analogous to the distance between basic computing nodes. Multicore computing: A multicore processor is a processor that includes multiple execution units. These processors differ from superscalar processors, which can issue multiple instructions per cycle from one instruction stream (thread); by contrast, a multicore processor can issue multiple instructions per cycle from multiple instruction streams. Each core in a multicore processor can potentially be superscalar as well, that is, on every cycle, each core can issue multiple instructions from one instruction stream. Symmetric multiprocessing: A symmetric multiprocessor (SMP) is a computer system with multiple identical processors that share memory and connect via a bus. Bus contention prevents bus architectures from scaling. As a result, SMPs generally do not comprise more than 32 processors. Because of the small size of the processors and the significant reduction in the requirements for bus bandwidth achieved by large caches, such symmetric multiprocessors are extremely cost-effective, provided that a sufficient amount of memory bandwidth exists.
  12. Synchronization between tasks is likewise the programmers responsibility
  13. Science —Global climate modeling —Biology: genomics; protein folding; drug design —Astrophysical modeling —Computational Chemistry —Computational Material Sciences and Nanosciences • Engineering —Semiconductor design —Earthquake and structural modeling —Computation fluid dynamics (airplane design) —Combustion (engine design) —Crash simulation • Business —Financial and economic modeling —Transaction processing, web services and search engines • Defense —Nuclear weapons -- test by simulations —Cryptography
  14. Vocabulary: rapid pace of change; impediments (obstacles).