Advanced Computer Architecture
The Architecture of Parallel Computers
Computer Systems
• Hardware Architecture
• Operating System
• Application Software
• No Component Can be Treated In Isolation From the Others
Hardware Issues
• Number and Type of Processors
• Processor Control
• Memory Hierarchy
• I/O devices and Peripherals
• Operating System Support
• Applications Software Compatibility
Operating System Issues
• Allocating and Managing Resources
• Access to Hardware Features
– Multi-Processing
– Multi-Threading
• I/O Management
• Access to Peripherals
• Efficiency
Applications Issues
• Compiler/Linker Support
• Programmability
• OS/Hardware Feature Availability
• Compatibility
• Parallel Compilers
– Preprocessor
– Precompiler
– Parallelizing Compiler
Architecture Evolution
• Scalar Architecture
• Prefetch (Fetch/Execute Overlap)
• Multiple Functional Units
• Pipelining
• Vector Processors
• Lock-Step Processors
• Multi-Processor
Flynn’s Classification
• Consider Instruction Streams and Data
Streams Separately.
• SISD - Single Instruction, Single Data
Stream
• SIMD - Single Instruction, Multiple Data
Streams
• MIMD - Multiple Instruction, Multiple Data
Streams.
• MISD - (rare) Multiple Instruction, Single
Data Stream
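The four categories follow mechanically from counting the two kinds of stream. The short Python sketch below illustrates this; the function name and its integer arguments are hypothetical and are not part of Flynn's original formulation.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Name the Flynn category from the number of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    i = "S" if instruction_streams == 1 else "M"   # Single or Multiple Instruction
    d = "S" if data_streams == 1 else "M"          # Single or Multiple Data
    return f"{i}I{d}D"


print(flynn_class(1, 1))   # conventional uniprocessor
print(flynn_class(1, 64))  # one control unit driving 64 data paths
print(flynn_class(8, 8))   # eight independent processors
print(flynn_class(8, 1))   # eight operations applied to one data stream
```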
SISD
• Conventional Computers.
• Pipelined Systems
• Multiple-Functional Unit Systems
• Pipelined Vector Processors
• Includes most computers encountered in
everyday life
SIMD
• Multiple Processors Execute a Single
Program
• Each Processor operates on its own data
• Vector Processors
• Array Processors
• PRAM Theoretical Model
MIMD
• Multiple Processors cooperate on a single
task
• Each Processor runs a different program
• Each Processor operates on different data
• Many Commercial Examples Exist
MISD
• A Single Data Stream passes through
multiple processors
• Different operations are triggered on
different processors
• Systolic Arrays
• Wave-Front Arrays
Programming Issues
• Parallel Computers are Difficult to Program
• Automatic Parallelization Techniques are
only Partially Successful
• Programming languages are few, not well
supported, and difficult to use.
• Parallel Algorithms are difficult to design.
Performance Issues
• Cycle Time = τ (Clock Rate = 1/τ)
• Cycles Per Instruction (Average) = CPI
• Instruction Count = Ic
• Time, T = Ic × CPI × τ
• p = Processor Cycles per instruction, m = Memory
References per instruction, k = Memory/Processor cycle ratio
• T = Ic × (p + m × k) × τ
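As a worked illustration of the two formulas above, the following Python sketch computes T for one workload. All numbers are invented for illustration, and the sketch assumes p and m are per-instruction averages, so that CPI = p + m × k.

```python
def effective_cpi(p: float, m: float, k: float) -> float:
    """CPI = p + m * k, assuming p and m are per-instruction averages."""
    return p + m * k


def execution_time(ic: float, cpi: float, tau: float) -> float:
    """T = Ic * CPI * tau, in seconds."""
    return ic * cpi * tau


# Hypothetical workload; none of these values come from the slides.
ic = 200_000        # instructions executed
tau = 1 / 40e6      # cycle time of a 40 MHz clock, in seconds
p, m, k = 2, 1, 4   # processor cycles, memory references, memory/processor cycle ratio

cpi = effective_cpi(p, m, k)       # 2 + 1 * 4
t = execution_time(ic, cpi, tau)   # 200000 * 6 / 40e6 seconds
print(f"CPI = {cpi}, T = {t * 1e3:.1f} ms")
```

Halving k (a faster memory hierarchy) lowers CPI from 6 to 4 in this example, which shortens T by a third without touching the processor.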
Performance Issues II
• Ic & p affected by processor design and
compiler technology.
• m affected mainly by compiler technology
• τ affected by processor design
• k affected by memory hierarchy structure
and design
Other Measures
• MIPS rate - Millions of instructions per
second
• Clock Rate for similar processors
• MFLOPS rate - Millions of floating point
operations per second.
• These measures are not necessarily directly
comparable between different types of
processors.
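The caveat about comparability can be made concrete. The Python sketch below uses the standard relation MIPS = clock rate / (CPI × 10^6) on two hypothetical processors; the clock rates, CPI values and instruction counts are invented for illustration.

```python
def mips_rate(clock_hz: float, cpi: float) -> float:
    """Millions of instructions per second for a given clock rate and average CPI."""
    return clock_hz / (cpi * 1e6)


def run_time(instruction_count: float, mips: float) -> float:
    """Seconds needed to execute instruction_count instructions at the given MIPS rate."""
    return instruction_count / (mips * 1e6)


# Hypothetical machines with the same 40 MHz clock.
mips_a = mips_rate(40e6, 4.0)   # complex instructions, more cycles each
mips_b = mips_rate(40e6, 2.0)   # simple instructions, fewer cycles each

# Assume the same program compiles to more instructions on machine B.
time_a = run_time(1.0e6, mips_a)
time_b = run_time(2.5e6, mips_b)

print(f"A: {mips_a:.0f} MIPS, {time_a:.3f} s")
print(f"B: {mips_b:.0f} MIPS, {time_b:.3f} s")
```

Machine B reports twice the MIPS rate yet takes longer on this program, because its instruction count is 2.5 times higher.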
Parallelizing Code
• Implicitly
– Write Sequential Algorithms
– Use a Parallelizing Compiler
– Rely on compiler to find parallelism
• Explicitly
– Design Parallel Algorithms
– Write in a Parallel Language
– Rely on Human to find Parallelism
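The explicit route can be sketched in a few lines of Python. Here the programmer, not the compiler, decides how to partition the data and how to combine partial results. The function names and the choice of four workers are assumptions made for illustration; in CPython, threads show the structure of the decomposition rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def sequential_sum(values):
    """The sequential algorithm: one accumulator, one pass."""
    total = 0
    for v in values:
        total += v
    return total


def explicit_parallel_sum(values, workers=4):
    """Explicit parallelism: split the data, sum each chunk, then combine."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sequential_sum, parts))
    return sequential_sum(partial_sums)


data = list(range(1, 1001))
print(sequential_sum(data), explicit_parallel_sum(data))
```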
Multi-Processors
• Multi-Processors generally share memory,
while multi-computers do not.
– Uniform Memory Access (UMA) Model
– Non-Uniform Memory Access (NUMA) Model
– Cache-Only Memory Architecture (COMA)
• MIMD Machines
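A minimal Python sketch of the shared-memory style follows, using threads as stand-ins for processors. The function name and the dictionary used as "shared memory" are hypothetical; the point is only that every worker updates the same location and therefore needs a lock.

```python
import threading


def shared_memory_sum(partitions):
    """Each worker adds its partial result into one shared location."""
    shared = {"total": 0}      # memory visible to every worker
    lock = threading.Lock()    # serializes updates to the shared location

    def worker(part):
        local = sum(part)      # private computation, no sharing needed
        with lock:             # critical section on the shared word
            shared["total"] += local

    threads = [threading.Thread(target=worker, args=(p,)) for p in partitions]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["total"]


print(shared_memory_sum([[1, 2, 3], [4, 5], [6]]))
```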
Multi-Computers
• Independent Computers that Don’t Share
Memory.
• Connected by High-Speed Communication
Network
• More tightly coupled than a collection of
independent computers
• Cooperate on a single problem
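The message-passing style can be sketched in Python as follows. Threads stand in for the independent computers and a queue stands in for the communication network; both are assumptions made for illustration, as are the function names. Each node works only on its own data and communicates solely by sending a message.

```python
import queue
import threading


def node(rank, local_data, network):
    """A node sees only its own local_data and reports by sending a message."""
    network.put((rank, sum(local_data)))


def run_cluster(partitions):
    """Start one node per partition and gather their messages."""
    network = queue.Queue()    # stand-in for the interconnection network
    threads = [
        threading.Thread(target=node, args=(rank, part, network))
        for rank, part in enumerate(partitions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()               # every message has been sent once all nodes finish

    results = {}
    while not network.empty():
        rank, value = network.get()
        results[rank] = value
    return sum(results.values()), results


total, per_node = run_cluster([[1, 2], [3, 4], [5]])
print(total, per_node)
```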
Vector Computers
• Independent Vector Hardware
• May be an attached processor
• Has both scalar and vector instructions
• Vector instructions operate in highly
pipelined mode
• Can be Memory-to-Memory or Register-to-Register
SIMD Computers
• One Control Processor
• Several Processing Elements
• All Processing Elements execute the same
instruction at the same time
• Interconnection network between PEs
determines memory access and PE
interaction
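The control-processor and processing-element arrangement can be mimicked in Python. The class and function names below are hypothetical, and the loop inside broadcast is only a sequential stand-in for what the hardware does simultaneously.

```python
import operator


class ProcessingElement:
    """A PE with a single local register."""

    def __init__(self, value):
        self.acc = value

    def execute(self, op, operand):
        self.acc = op(self.acc, operand)


def broadcast(pes, op, operand):
    """Control processor: issue the same instruction to every PE."""
    for pe in pes:             # stands in for lock-step execution
        pe.execute(op, operand)


pes = [ProcessingElement(v) for v in [1, 2, 3, 4]]
broadcast(pes, operator.mul, 10)   # every PE multiplies its own data by 10
broadcast(pes, operator.add, 1)    # every PE then adds 1
result = [pe.acc for pe in pes]
print(result)
```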
The PRAM Model
• SIMD Style Programming
• Uniform Global Memory
• Local Memory in Each PE
• Memory Conflict Resolution
– CRCW - Concurrent Read, Concurrent Write
– CREW - Concurrent Read, Exclusive Write
– EREW - Exclusive Read, Exclusive Write
– ERCW - (rare) Exclusive Read, Concurrent Write
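The four access rules can be checked mechanically. The Python sketch below is a minimal illustration: it reads each model name as two letter pairs (the read rule, then the write rule), where C permits several PEs to access the same cell in one step and E forbids it. The function name and the list-of-addresses input are hypothetical, and the sketch ignores how a CRCW machine chooses the value when writes collide.

```python
from collections import Counter


def step_allowed(reads, writes, model):
    """Check one PRAM step.

    reads, writes: memory addresses accessed by the PEs during this step.
    model: one of "EREW", "CREW", "CRCW", "ERCW".
    """
    if model not in {"EREW", "CREW", "CRCW", "ERCW"}:
        raise ValueError(f"unknown model: {model}")
    shared_read = any(n > 1 for n in Counter(reads).values())
    shared_write = any(n > 1 for n in Counter(writes).values())
    read_ok = model[0] == "C" or not shared_read     # first letter governs reads
    write_ok = model[2] == "C" or not shared_write   # third letter governs writes
    return read_ok and write_ok


# Two PEs read cell 0 in the same step; one PE writes cell 1.
print(step_allowed([0, 0], [1], "EREW"))
print(step_allowed([0, 0], [1], "CREW"))
```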
The VLSI Model
• Implement Algorithm as a mostly
combinational circuit
• Determine the area required for
implementation
• Determine the depth of the circuit
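As a small illustration of the two measures, the Python sketch below counts the two-input operators and the depth of a balanced reduction tree that combines n inputs (for example, an adder tree). Treating operator count as a proxy for area is a simplifying assumption, since real layout area also includes wiring; the function name is hypothetical.

```python
def reduction_tree(n):
    """Return (operator_count, depth) of a balanced tree combining n inputs."""
    if n < 1:
        raise ValueError("need at least one input")
    gates, depth, width = 0, 0, n
    while width > 1:
        pairs = width // 2            # operators placed at this level
        gates += pairs
        width = pairs + (width % 2)   # an unpaired value passes to the next level
        depth += 1
    return gates, depth


for n in (1, 5, 8, 1024):
    print(n, reduction_tree(n))
```

The operator count grows linearly (n - 1) while the depth grows only logarithmically, which is the kind of area and depth trade-off this model is used to reason about.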
