CSE539: Advanced Computer Architecture

Chapter 8

Multivector and SIMD Computers
Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani

Sumit Mittu
Assistant Professor, CSE/IT
Lovely Professional University
sumit.12735@lpu.co.in
In this chapter…
• Vector Processing Principles
• Compound Vector Operations
• Vector Loops and Chaining
• SIMD Computer Implementation Models

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

VECTOR PROCESSING PRINCIPLES
• Vector Processing Definitions
o Vector
o Stride
o Vector Processor
o Vector Processing
o Vectorization
o Vectorizing Compiler or Vectorizer

• Vector Instruction Types
o Vector-vector instructions
o Vector-scalar instructions
o Vector-memory instructions

• Vector-Vector Instructions
o F1: Vi → Vj
o F2: Vi x Vj → Vk
o Examples: V1 = sin(V2), V3 = V1 + V2

• Vector-Scalar Instructions
o F3: s x Vi → Vj
o Example: V2 = 6 + V1

• Vector-Memory Instructions
o F4: M → V (Vector Load)
o F5: V → M (Vector Store)
o Examples: X = V1, V2 = Y
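As a rough illustration, the mappings F1 to F5 can be mimicked with plain Python lists. This is only a sketch under stated assumptions: the helper names, the flat `memory` list and the base/stride/length addressing are chosen for the example and are not notation from the slides.

```python
import math
import operator

def vector_unary(op, v):
    """F1: Vi -> Vj, e.g. V1 = sin(V2)."""
    return [op(x) for x in v]

def vector_binary(op, va, vb):
    """F2: Vi x Vj -> Vk, e.g. V3 = V1 + V2."""
    if len(va) != len(vb):
        raise ValueError("vector lengths differ")
    return [op(a, b) for a, b in zip(va, vb)]

def scalar_vector(op, s, v):
    """F3: s x Vi -> Vj, e.g. V2 = 6 + V1."""
    return [op(s, x) for x in v]

def vector_load(memory, base, stride, length):
    """F4: M -> V. Read `length` words starting at `base`, `stride` apart."""
    return [memory[base + k * stride] for k in range(length)]

def vector_store(memory, base, stride, v):
    """F5: V -> M. Write the elements of v back to memory, `stride` apart."""
    for k, x in enumerate(v):
        memory[base + k * stride] = x

# Hypothetical 16-word memory holding the values 0..15.
memory = list(range(16))
v1 = vector_load(memory, base=0, stride=2, length=4)  # every second word
v2 = scalar_vector(operator.add, 6, v1)               # V2 = 6 + V1
v3 = vector_binary(operator.add, v1, v2)              # V3 = V1 + V2
v4 = vector_unary(math.sin, v2)                       # V4 = sin(V2)
vector_store(memory, base=8, stride=1, v=v3)          # write V3 to words 8..11
```

Each helper processes a whole vector per call, which is the point of the instruction types: one instruction names an entire operand vector rather than a single element.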

• Vector Reduction Instructions
o F6: Vi → s
o F7: Vi x Vj → s

• Gather and Scatter Instructions
o F8: M → Vi x Vj (Gather)
o F9: Vi x Vj → M (Scatter)

• Masking
o F10: Vi x Vm → Vj (Vm is a binary vector)

• Examples…
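The reduction, gather/scatter and masking mappings can be sketched in the same spirit. The function names and the sample memory contents are hypothetical, and treating masking as a "compress" (keep the elements whose mask bit is 1) is one possible reading of F10, assumed here for illustration.

```python
def reduce_sum(v):
    """F6: Vi -> s. Collapse a vector to a single scalar."""
    total = 0
    for x in v:
        total += x
    return total

def dot_product(va, vb):
    """F7: Vi x Vj -> s. Two vectors in, one scalar out."""
    return sum(a * b for a, b in zip(va, vb))

def gather(memory, base, index_vector):
    """F8: M -> Vi x Vj. Fetch the words named by an index vector."""
    return [memory[base + i] for i in index_vector]

def scatter(memory, base, index_vector, data_vector):
    """F9: Vi x Vj -> M. Write data back to the indexed words."""
    for i, x in zip(index_vector, data_vector):
        memory[base + i] = x

def mask_compress(v, vm):
    """F10: Vi x Vm -> Vj. Keep elements whose mask bit is 1 (assumed reading)."""
    return [x for x, bit in zip(v, vm) if bit]

# Hypothetical sparse data: only words 0, 2 and 5 are nonzero.
memory = [10, 0, 30, 0, 0, 60, 0, 0]
index = [0, 2, 5]
data = gather(memory, 0, index)                   # dense copy of the nonzeros
scatter(memory, 0, index, [2 * x for x in data])  # double them in place
total = reduce_sum(data)
inner = dot_product([1, 2, 3], [4, 5, 6])
kept = mask_compress([5, 0, 7, 0], [1, 0, 1, 0])
```

Gather followed by scatter is the usual pattern for sparse data: the index vector lets the dense arithmetic run on a short vector while the results land back at their original addresses.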

• Vector-Access Memory Schemes
o Vector-operand Specifications
• Base address, stride and length
o C-Access Memory Organization
• Low-order m-way interleaved memory
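o S-Access Memory Organization
• Low-order interleaved memory rearranged so that all m modules are accessed simultaneously; the high-order address bits select the same word offset in every module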
o C/S Access Memory Organization
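A small sketch may help show why the stride of a vector operand matters on a low-order m-way interleaved memory. It assumes the usual mapping in which the low-order address bits pick the module (address mod m) and the remaining bits pick the word inside it; the function names and the choice m = 8 are illustrative only.

```python
def module_of(address, m):
    """Low-order interleaving: the module number is the address modulo m."""
    return address % m

def offset_in_module(address, m):
    """The remaining high-order part selects the word inside the module."""
    return address // m

def vector_addresses(base, stride, length):
    """Addresses of a vector operand given by base address, stride and length."""
    return [base + k * stride for k in range(length)]

def modules_touched(base, stride, length, m):
    """Module visited by each successive element of the vector."""
    return [module_of(a, m) for a in vector_addresses(base, stride, length)]

m = 8  # hypothetical number of memory modules
unit_stride = modules_touched(base=0, stride=1, length=8, m=m)
stride_of_m = modules_touched(base=0, stride=8, length=8, m=m)
odd_stride = modules_touched(base=0, stride=3, length=8, m=m)
```

With stride 1 or 3 the eight elements fall in eight different modules, so their accesses can overlap. With stride 8 every element falls in module 0 and the accesses serialize.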

• Early Supercomputers (Vector Processors)
o Cray Series
o CDC Cyber
o ETA 10E
o Fujitsu VP2600
o NEC SX-X 44
o Hitachi 820/80

• Relative Vector/Scalar Performance
o Vector/scalar speed ratio r
o Vectorization ratio in program f
o Relative Performance P is given by:

$$P = \frac{1}{(1 - f) + f/r} = \frac{r}{(1 - f)\,r + f}$$

o When f is low, the speedup cannot be high even with very high r
o Limiting Case:
• P → 1 if f → 0
o Maximum Case:
• P → r if f → 1
o Powerful single-chip processors and multicore systems-on-a-chip provide High-Performance Computing (HPC) using MIMD and/or SPMD configurations with a large number of processors.
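The relative performance formula is easy to evaluate numerically. The sketch below computes P in both algebraic forms; the function names and the sample values of f and r are chosen only for illustration.

```python
def relative_performance(f, r):
    """P = 1 / ((1 - f) + f / r), with vectorization ratio f and speed ratio r."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    if r <= 0:
        raise ValueError("r must be positive")
    return 1.0 / ((1.0 - f) + f / r)

def relative_performance_alt(f, r):
    """Equivalent form P = r / ((1 - f) * r + f)."""
    return r / ((1.0 - f) * r + f)

# Illustrative values: half of the program vectorized, increasingly fast vector unit.
p_r10 = relative_performance(0.5, 10)
p_r1000 = relative_performance(0.5, 1000)

# The two limiting cases.
p_no_vector = relative_performance(0.0, 10)    # f -> 0
p_all_vector = relative_performance(1.0, 10)   # f -> 1
```

With f = 0.5, raising r from 10 to 1000 moves P only from about 1.8 to just under 2, since P can never exceed 1/(1 - f). This is the sense in which a low vectorization ratio caps the speedup.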

COMPOUND VECTOR PROCESSING
• Compound Vector Operations
o Compound Vector Functions (CVFs)
• Composite function of vector operations converted from a looping structure of linked scalar
operations
o CVF Example: The SAXPY (Single-precision A multiply X Plus Y) Code
• For I = 1 to N
o Load R1, X(I)
o Load R2, Y(I)
o Multiply R1, A
o Add R2, R1
o Store Y(I), R2
• (End of Loop)
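The SAXPY loop can be written out in Python to show the step from linked scalar operations to a single compound vector function. Both functions are sketches: the names are hypothetical, and a Python list comprehension only stands in for what a vector processor would issue as vector instructions.

```python
def saxpy_scalar(a, x, y):
    """Scalar loop mirroring the Load / Multiply / Add / Store sequence."""
    y = list(y)           # work on a copy so the caller's list is untouched
    for i in range(len(x)):
        r1 = x[i]         # Load     R1, X(I)
        r2 = y[i]         # Load     R2, Y(I)
        r1 = r1 * a       # Multiply R1, A
        r2 = r2 + r1      # Add      R2, R1
        y[i] = r2         # Store    Y(I), R2
    return y

def saxpy_vector(a, x, y):
    """The same computation stated as one compound vector function Y = A * X + Y."""
    return [a * xi + yi for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
scalar_result = saxpy_scalar(2.0, x, y)
vector_result = saxpy_vector(2.0, x, y)
```

The two results are identical; the difference on real hardware is that the vector form replaces N trips through five scalar instructions with a handful of vector instructions.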

• One-dimensional CVF Examples
o V1(I) = V2(I) + V3(I) x V4(I)
o V1(I) = B(I) + C(I)
o A(I) = V1(I) x S + B(I)
o A(I) = V1(I) + B(I) + C(I)
o A(I) = Q x V1(I) x (R x B(I) + C(I)), etc.
Legend:
o Vi(I) are vector registers
o A(I), B(I), C(I) are vectors in memory
o Q, R, S are scalars available from scalar registers or memory

• Vector Loops
o Vector segmentation or strip-mining approach
o Example

• Vector Chaining
o Example: SAXPY code
• Limited Chaining using only one memory-access pipe in Cray-1
• Complete Chaining using three memory-access pipes in Cray X-MP

• Functional Unit Independence
• Vector Recurrence
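Strip-mining (vector segmentation) can be sketched as follows: a vector longer than the vector registers is processed in segments no longer than the register length. The function name, the SAXPY body and the default segment length of 64 are assumptions made for the example.

```python
def strip_mined_saxpy(a, x, y, max_vector_length=64):
    """Compute Y = A * X + Y in segments of at most max_vector_length elements."""
    n = len(x)
    result = list(y)
    strip_lengths = []
    for start in range(0, n, max_vector_length):
        end = min(start + max_vector_length, n)
        # One pass of the vector loop: load a segment, compute, store it back.
        vx = x[start:end]
        vy = result[start:end]
        result[start:end] = [a * xi + yi for xi, yi in zip(vx, vy)]
        strip_lengths.append(end - start)
    return result, strip_lengths

# A 150-element vector needs three passes: 64 + 64 + 22 elements.
x = [1.0] * 150
y = [2.0] * 150
result, strip_lengths = strip_mined_saxpy(3.0, x, y)
```

This version handles the short remainder segment last; a compiler may equally process it first. Either way the programmer sees a single loop over N elements, and the segmentation is bookkeeping done by the vectorizer or the hardware.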

SIMD COMPUTER ORGANIZATIONS
• SIMD Computer Variants
o Array Processor
o Associative Processor

• SIMD Processor vs. SISD vs. Vector Processor Operation
o Illustration: for (i = 0; i < 5; i++) a[i] = a[i] + 2;
o Lockstep mode of operation in SIMD processor
o Relative Performance comparison
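The a[i] = a[i] + 2 illustration can be modelled roughly in Python to contrast SISD and SIMD operation. This is a deliberately simplified model: the step counts assume one addition per step on the SISD machine and one processing element (PE) per array element on the SIMD machine, and the function names are hypothetical.

```python
def sisd_add(a, constant):
    """SISD: one processor walks the array, one element per step."""
    a = list(a)
    steps = 0
    for i in range(len(a)):
        a[i] = a[i] + constant
        steps += 1
    return a, steps

def simd_add(a, constant):
    """SIMD: each PE holds one element and all PEs run the same add in lockstep."""
    pe_local = list(a)                    # element i lives in the local memory of PE i
    pe_local = [value + constant for value in pe_local]  # one broadcast instruction
    steps = 1                             # all PEs finish in the same step
    return pe_local, steps

a = [1, 2, 3, 4, 5]
sisd_result, sisd_steps = sisd_add(a, 2)
simd_result, simd_steps = simd_add(a, 2)
```

Both produce the same array; in this model the SISD machine needs five steps and the SIMD machine one, as long as there are at least as many PEs as elements.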

• SIMD Implementation Models
o Distributed Memory Model
• E.g. Illiac IV
o Shared Memory Model
• E.g. BSP (Burroughs Scientific Processor)

• SIMD Instructions
o Scalar Operations
• Arithmetic/Logical
o Vector Operations
• Arithmetic/Logical
o Data Routing Operations
• Permutations, broadcasts, multicasts, rotation and shifting
o Masking Operations
• Enable/Disable PEs

• Host and I/O
• Bit-slice and Word-slice Processing
o WSBS, WSBP, WPBS, WPBP
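Data routing and masking operations can be sketched on a small array of processing elements, modelled here as a Python list with one value per PE. The function names and the four-PE example are assumptions for illustration and do not correspond to any particular machine's instruction set.

```python
def broadcast(value, num_pes):
    """Data routing: the control unit sends one value to every PE."""
    return [value] * num_pes

def rotate(pe_values, k=1):
    """Data routing: PE i sends its value to PE (i + k) mod n."""
    n = len(pe_values)
    k %= n
    if k == 0:
        return list(pe_values)
    return pe_values[-k:] + pe_values[:-k]

def shift(pe_values, k=1, fill=0):
    """Data routing: like rotate, but values pushed off the end are dropped."""
    n = len(pe_values)
    k = min(k, n)
    return [fill] * k + pe_values[:n - k]

def masked_add(pe_values, operand, mask):
    """Masking: only PEs whose mask bit is 1 execute the add; the rest sit idle."""
    return [v + o if enabled else v
            for v, o, enabled in zip(pe_values, operand, mask)]

pes = [1, 2, 3, 4]
rotated = rotate(pes, 1)
shifted = shift(pes, 1)
tens = broadcast(10, len(pes))
masked = masked_add(pes, tens, [1, 0, 1, 0])
```

The mask is what lets a single broadcast instruction behave conditionally: every PE receives the add, but only the enabled ones change their local value.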
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

18

More Related Content

What's hot

Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
vampugani
 
DeadLock in Operating-Systems
DeadLock in Operating-SystemsDeadLock in Operating-Systems
DeadLock in Operating-Systems
Venkata Sreeram
 
Parallel Processors (SIMD)
Parallel Processors (SIMD) Parallel Processors (SIMD)
Parallel Processors (SIMD)
Ali Raza
 
Video display device
Video display deviceVideo display device
Video display device
missagrata
 
Distributed systems scheduling
Distributed systems schedulingDistributed systems scheduling
Distributed systems scheduling
Pragati Startup Presentation Designer firm
 
Connection Machine
Connection MachineConnection Machine
Connection Machine
butest
 
Dma
DmaDma
Arbitration in computer organization
 Arbitration in computer organization   Arbitration in computer organization
Arbitration in computer organization
Amit kashyap
 
Computer Architecture and organization
Computer Architecture and organizationComputer Architecture and organization
Computer Architecture and organization
Badrinath Kadam
 
Computer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer ArchitectureComputer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer Architecture
Haris456
 
introduction to microprocessors
introduction to microprocessorsintroduction to microprocessors
introduction to microprocessors
vishi1993
 
Distributed system
Distributed systemDistributed system
Distributed system
Syed Zaid Irshad
 
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
Sagar Rai
 
RAID LEVELS
RAID LEVELSRAID LEVELS
RAID LEVELS
Uzair Khan
 
Ram presentation
Ram presentationRam presentation
Ram presentation
Kadai McFadden
 
Parallel processing
Parallel processingParallel processing
Parallel processing
rajshreemuthiah
 
High performance computing
High performance computingHigh performance computing
High performance computing
punjab engineering college, chandigarh
 
Aca2 07 new
Aca2 07 newAca2 07 new
Aca2 07 new
Sumit Mittu
 
Cache memory
Cache memoryCache memory
Cache memory
Faiq Ali Sayed
 
Lil endian.ppt
Lil endian.pptLil endian.ppt
Lil endian.ppt
Ganga R Jaiswal
 

What's hot (20)

Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
 
DeadLock in Operating-Systems
DeadLock in Operating-SystemsDeadLock in Operating-Systems
DeadLock in Operating-Systems
 
Parallel Processors (SIMD)
Parallel Processors (SIMD) Parallel Processors (SIMD)
Parallel Processors (SIMD)
 
Video display device
Video display deviceVideo display device
Video display device
 
Distributed systems scheduling
Distributed systems schedulingDistributed systems scheduling
Distributed systems scheduling
 
Connection Machine
Connection MachineConnection Machine
Connection Machine
 
Dma
DmaDma
Dma
 
Arbitration in computer organization
 Arbitration in computer organization   Arbitration in computer organization
Arbitration in computer organization
 
Computer Architecture and organization
Computer Architecture and organizationComputer Architecture and organization
Computer Architecture and organization
 
Computer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer ArchitectureComputer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer Architecture
 
introduction to microprocessors
introduction to microprocessorsintroduction to microprocessors
introduction to microprocessors
 
Distributed system
Distributed systemDistributed system
Distributed system
 
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
SDN( Software Defined Network) and NFV(Network Function Virtualization) for I...
 
RAID LEVELS
RAID LEVELSRAID LEVELS
RAID LEVELS
 
Ram presentation
Ram presentationRam presentation
Ram presentation
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
Aca2 07 new
Aca2 07 newAca2 07 new
Aca2 07 new
 
Cache memory
Cache memoryCache memory
Cache memory
 
Lil endian.ppt
Lil endian.pptLil endian.ppt
Lil endian.ppt
 

Viewers also liked

Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processing
Kamal Acharya
 
Advanced Multimedia
Advanced MultimediaAdvanced Multimedia
Advanced Multimedia
kadalrocker
 
Program and Network Properties
Program and Network PropertiesProgram and Network Properties
Program and Network Properties
Beekrum Duwal
 
Aca2 09 new
Aca2 09 newAca2 09 new
Aca2 09 new
Sumit Mittu
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compiler
Vignesh Tamil
 
Introduction Cell Processor
Introduction Cell ProcessorIntroduction Cell Processor
Introduction Cell Processor
coolmirza143
 
Aca2 06 new
Aca2 06 newAca2 06 new
Aca2 06 new
Sumit Mittu
 
Computer Organozation
Computer OrganozationComputer Organozation
Computer Organozation
Aabha Tiwari
 
Coa swetappt copy
Coa swetappt   copyCoa swetappt   copy
Coa swetappt copy
sweta_pari
 
Parallel computing chapter 3
Parallel computing chapter 3Parallel computing chapter 3
Parallel computing chapter 3
Md. Mahedi Mahfuj
 
Lec3 final
Lec3 finalLec3 final
Lec3 final
Gichelle Amon
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwang
Sumedha
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
Mr SMAK
 
Lecture 6
Lecture  6Lecture  6
Lecture 6
Mr SMAK
 
Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...
Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...
Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...
saumyatapu
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
Kernel TLV
 
Parallel processing Concepts
Parallel processing ConceptsParallel processing Concepts
Parallel processing Concepts
Army Public School and College -Faisal
 
Computer architecture
Computer architecture Computer architecture
Computer architecture
Ashish Kumar
 
Network programming in java - PPT
Network programming in java - PPTNetwork programming in java - PPT
Network programming in java - PPT
kamal kotecha
 
vector application
vector applicationvector application
vector application
rajat shukla
 

Viewers also liked (20)

Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processing
 
Advanced Multimedia
Advanced MultimediaAdvanced Multimedia
Advanced Multimedia
 
Program and Network Properties
Program and Network PropertiesProgram and Network Properties
Program and Network Properties
 
Aca2 09 new
Aca2 09 newAca2 09 new
Aca2 09 new
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compiler
 
Introduction Cell Processor
Introduction Cell ProcessorIntroduction Cell Processor
Introduction Cell Processor
 
Aca2 06 new
Aca2 06 newAca2 06 new
Aca2 06 new
 
Computer Organozation
Computer OrganozationComputer Organozation
Computer Organozation
 
Coa swetappt copy
Coa swetappt   copyCoa swetappt   copy
Coa swetappt copy
 
Parallel computing chapter 3
Parallel computing chapter 3Parallel computing chapter 3
Parallel computing chapter 3
 
Lec3 final
Lec3 finalLec3 final
Lec3 final
 
Computer architecture kai hwang
Computer architecture   kai hwangComputer architecture   kai hwang
Computer architecture kai hwang
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Lecture 6
Lecture  6Lecture  6
Lecture 6
 
Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...
Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...
Project pptVLSI ARCHITECTURE FOR AN IMAGE COMPRESSION SYSTEM USING VECTOR QUA...
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
Parallel processing Concepts
Parallel processing ConceptsParallel processing Concepts
Parallel processing Concepts
 
Computer architecture
Computer architecture Computer architecture
Computer architecture
 
Network programming in java - PPT
Network programming in java - PPTNetwork programming in java - PPT
Network programming in java - PPT
 
vector application
vector applicationvector application
vector application
 

Similar to Aca2 08 new

ARLabs:Profile & Training Programs
ARLabs:Profile & Training ProgramsARLabs:Profile & Training Programs
ARLabs:Profile & Training Programs
Anubhav Seth
 
2022-S1-IT2070-Lecture-06-Algorithms.pptx
2022-S1-IT2070-Lecture-06-Algorithms.pptx2022-S1-IT2070-Lecture-06-Algorithms.pptx
2022-S1-IT2070-Lecture-06-Algorithms.pptx
pradeepwalter
 
System on Chip Design and Modelling Dr. David J Greaves
System on Chip Design and Modelling   Dr. David J GreavesSystem on Chip Design and Modelling   Dr. David J Greaves
System on Chip Design and Modelling Dr. David J Greaves
Satya Harish
 
Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.
Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.
Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.
li50916ku
 
A_Brief_Summary_on_Summer_Courses[1]
A_Brief_Summary_on_Summer_Courses[1]A_Brief_Summary_on_Summer_Courses[1]
A_Brief_Summary_on_Summer_Courses[1]
Gayatri Kindo
 
Model-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentModel-based programming and AI-assisted software development
Model-based programming and AI-assisted software development
Eficode
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
Mohit Garg
 
Preparing future workforce ready for industry 4.0 @ Oakland University
Preparing future workforce ready for industry 4.0 @ Oakland UniversityPreparing future workforce ready for industry 4.0 @ Oakland University
Preparing future workforce ready for industry 4.0 @ Oakland University
Umang Tuladhar
 
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- IntroLec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Hsien-Hsin Sean Lee, Ph.D.
 
Clean architecture
Clean architectureClean architecture
Clean architecture
Travis Frisinger
 
Online and Offline Testing Of C-Bist Using Sram
Online and Offline Testing Of C-Bist Using SramOnline and Offline Testing Of C-Bist Using Sram
Online and Offline Testing Of C-Bist Using Sram
iosrjce
 
LOTAR-PDES: Engineering digitalization through task automation and reuse in t...
LOTAR-PDES: Engineering digitalization through task automation and reuse in t...LOTAR-PDES: Engineering digitalization through task automation and reuse in t...
LOTAR-PDES: Engineering digitalization through task automation and reuse in t...
CARLOS III UNIVERSITY OF MADRID
 
From Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems ArchitecturesFrom Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems Architectures
Obeo
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Lionel Briand
 
Static Memory Management for Efficient Mobile Sensing Applications
Static Memory Management for Efficient Mobile Sensing ApplicationsStatic Memory Management for Efficient Mobile Sensing Applications
Static Memory Management for Efficient Mobile Sensing Applications
Farley Lai
 
Gourp 12 Report.pptx
Gourp 12 Report.pptxGourp 12 Report.pptx
Gourp 12 Report.pptx
ShubhamMane733576
 
Esl basics
Esl basicsEsl basics
Esl basics
敬倫 林
 
Dc dc bost converter simulation research
Dc dc bost converter simulation research Dc dc bost converter simulation research
Dc dc bost converter simulation research
Engr.Muhammad Mujtaba Asad
 
Metric Recovery from Unweighted k-NN Graphs
Metric Recovery from Unweighted k-NN GraphsMetric Recovery from Unweighted k-NN Graphs
Metric Recovery from Unweighted k-NN Graphs
joisino
 
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
Edge AI and Vision Alliance
 

Similar to Aca2 08 new (20)

ARLabs:Profile & Training Programs
ARLabs:Profile & Training ProgramsARLabs:Profile & Training Programs
ARLabs:Profile & Training Programs
 
2022-S1-IT2070-Lecture-06-Algorithms.pptx
2022-S1-IT2070-Lecture-06-Algorithms.pptx2022-S1-IT2070-Lecture-06-Algorithms.pptx
2022-S1-IT2070-Lecture-06-Algorithms.pptx
 
System on Chip Design and Modelling Dr. David J Greaves
System on Chip Design and Modelling   Dr. David J GreavesSystem on Chip Design and Modelling   Dr. David J Greaves
System on Chip Design and Modelling Dr. David J Greaves
 
Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.
Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.
Layers of Computer Science, ISA and uArch Alexander Titov 20 September 2014.
 
A_Brief_Summary_on_Summer_Courses[1]
A_Brief_Summary_on_Summer_Courses[1]A_Brief_Summary_on_Summer_Courses[1]
A_Brief_Summary_on_Summer_Courses[1]
 
Model-based programming and AI-assisted software development
Model-based programming and AI-assisted software developmentModel-based programming and AI-assisted software development
Model-based programming and AI-assisted software development
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Preparing future workforce ready for industry 4.0 @ Oakland University
Preparing future workforce ready for industry 4.0 @ Oakland UniversityPreparing future workforce ready for industry 4.0 @ Oakland University
Preparing future workforce ready for industry 4.0 @ Oakland University
 
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- IntroLec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
 
Clean architecture
Clean architectureClean architecture
Clean architecture
 
Online and Offline Testing Of C-Bist Using Sram
Online and Offline Testing Of C-Bist Using SramOnline and Offline Testing Of C-Bist Using Sram
Online and Offline Testing Of C-Bist Using Sram
 
LOTAR-PDES: Engineering digitalization through task automation and reuse in t...
LOTAR-PDES: Engineering digitalization through task automation and reuse in t...LOTAR-PDES: Engineering digitalization through task automation and reuse in t...
LOTAR-PDES: Engineering digitalization through task automation and reuse in t...
 
From Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems ArchitecturesFrom Model-based to Model and Simulation-based Systems Architectures
From Model-based to Model and Simulation-based Systems Architectures
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
 
Static Memory Management for Efficient Mobile Sensing Applications
Static Memory Management for Efficient Mobile Sensing ApplicationsStatic Memory Management for Efficient Mobile Sensing Applications
Static Memory Management for Efficient Mobile Sensing Applications
 
Gourp 12 Report.pptx
Gourp 12 Report.pptxGourp 12 Report.pptx
Gourp 12 Report.pptx
 
Esl basics
Esl basicsEsl basics
Esl basics
 
Dc dc bost converter simulation research
Dc dc bost converter simulation research Dc dc bost converter simulation research
Dc dc bost converter simulation research
 
Metric Recovery from Unweighted k-NN Graphs
Metric Recovery from Unweighted k-NN GraphsMetric Recovery from Unweighted k-NN Graphs
Metric Recovery from Unweighted k-NN Graphs
 
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
“Introduction to Optimizing ML Models for the Edge,” a Presentation from Cisc...
 

More from Sumit Mittu

Int306 03
Int306 03Int306 03
Int306 03
Sumit Mittu
 
Int306 02
Int306 02Int306 02
Int306 02
Sumit Mittu
 
Int306 01
Int306 01Int306 01
Int306 01
Sumit Mittu
 
Int306 00
Int306 00Int306 00
Int306 00
Sumit Mittu
 
Int306 04
Int306 04Int306 04
Int306 04
Sumit Mittu
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
Sumit Mittu
 
Aca11 bk2 ch9
Aca11 bk2 ch9Aca11 bk2 ch9
Aca11 bk2 ch9
Sumit Mittu
 

More from Sumit Mittu (7)

Int306 03
Int306 03Int306 03
Int306 03
 
Int306 02
Int306 02Int306 02
Int306 02
 
Int306 01
Int306 01Int306 01
Int306 01
 
Int306 00
Int306 00Int306 00
Int306 00
 
Int306 04
Int306 04Int306 04
Int306 04
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
 
Aca11 bk2 ch9
Aca11 bk2 ch9Aca11 bk2 ch9
Aca11 bk2 ch9
 

Recently uploaded

PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.
Dr. Shivangi Singh Parihar
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
Assignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docxAssignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docx
ArianaBusciglio
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
Celine George
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
amberjdewit93
 
How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17
Celine George
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
TechSoup
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Ashish Kohli
 
S1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptxS1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptx
tarandeep35
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
Nicholas Montgomery
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
Priyankaranawat4
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
Israel Genealogy Research Association
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024
ak6969907
 
How to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold MethodHow to Build a Module in Odoo 17 Using the Scaffold Method
How to Build a Module in Odoo 17 Using the Scaffold Method
Celine George
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
TechSoup
 

Recently uploaded (20)

PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
Assignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docxAssignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docx
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17How to Fix the Import Error in the Odoo 17
How to Fix the Import Error in the Odoo 17
 

Aca2 08 new

  • 1. CSE539: Advanced Computer Architecture
    Chapter 8: Multivector and SIMD Computers
    Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani
    Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University
    sumit.12735@lpu.co.in
  • 2. In this chapter…
    • Vector Processing Principles
    • Compound Vector Operations
    • Vector Loops and Chaining
    • SIMD Computer Implementation Models
  • 3. VECTOR PROCESSING PRINCIPLES
    • Vector Processing Definitions
      o Vector
      o Stride
      o Vector Processor
      o Vector Processing
      o Vectorization
      o Vectorizing Compiler or Vectorizer
    • Vector Instruction Types
      o Vector-vector instructions
      o Vector-scalar instructions
      o Vector-memory instructions
  • 4. VECTOR PROCESSING PRINCIPLES
  • 5. VECTOR PROCESSING PRINCIPLES
    • Vector-Vector Instructions
      o F1: Vi → Vj
      o F2: Vi × Vj → Vk
      o Examples: V1 = sin(V2), V3 = V1 + V2
    • Vector-Scalar Instructions
      o F3: s × Vi → Vj
      o Examples: V2 = 6 + V1
    • Vector-Memory Instructions
      o F4: M → V (Vector Load)
      o F5: V → M (Vector Store)
      o Examples: X = V1, V2 = Y
  • 6. VECTOR PROCESSING PRINCIPLES
    • Vector Reduction Instructions
      o F6: Vi → s
      o F7: Vi × Vj → s
    • Gather and Scatter Instructions
      o F8: M → Vi × Vj (Gather)
      o F9: Vi × Vj → M (Scatter)
    • Masking
      o F10: Vi × Vm → Vj (Vm is a binary vector)
    • Examples…
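
To make the reduction, gather, scatter, and masking mappings above concrete, here is a minimal sketch in plain Python. The function names, the `base` parameter, and the choice to model masking as compression (keeping only the elements whose mask bit is 1) are illustrative assumptions, not definitions taken from the slides.

```python
def dot_reduce(vi, vj):
    """F7 (Vi x Vj -> s): reduce two vectors to one scalar, here a dot product."""
    return sum(a * b for a, b in zip(vi, vj))


def gather(memory, base, index_vector):
    """F8 (gather): read memory[base + k] for every k in the index vector."""
    return [memory[base + k] for k in index_vector]


def scatter(memory, base, index_vector, values):
    """F9 (scatter): write values[i] to memory[base + index_vector[i]]."""
    for k, v in zip(index_vector, values):
        memory[base + k] = v


def mask_compress(vi, vm):
    """F10 (masking, assumed here to mean compression): keep Vi elements
    whose corresponding bit in the binary vector Vm is 1."""
    return [x for x, bit in zip(vi, vm) if bit]


memory = list(range(100, 110))          # ten memory words: 100..109
picked = gather(memory, 0, [4, 2, 7])   # sparse read through an index vector
scatter(memory, 0, [1, 3], [-1, -3])    # sparse write through an index vector
```

Gather and scatter exist mainly to handle sparse vectors: the index vector names only the positions that matter, so the data vector stays dense in the vector register.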
  • 7. VECTOR PROCESSING PRINCIPLES
  • 8. VECTOR PROCESSING PRINCIPLES
    • Vector-Access Memory Schemes
      o Vector-operand Specifications
        • Base address, stride and length
      o C-Access Memory Organization
        • Low-order m-way interleaved memory
      o S-Access Memory Organization
        • High-order m-way interleaved memory
      o C/S-Access Memory Organization
    • Early Supercomputers (Vector Processors)
      o Cray Series
      o CDC Cyber
      o ETA 10E
      o Fujitsu VP2600
      o NEC SX-X 44
      o Hitachi 820/80
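
The difference between low-order and high-order interleaving comes down to which address bits select the memory module. The sketch below is a simplified model with hypothetical helper names; it also shows how the stride of a vector operand (base address, stride, length) decides which modules a vector access touches under low-order interleaving.

```python
def low_order_map(addr, m):
    """Low-order m-way interleaving: the low part of the address picks the
    module, so consecutive addresses fall in consecutive modules.
    Returns (module, word offset inside the module)."""
    return addr % m, addr // m


def high_order_map(addr, words_per_module):
    """High-order interleaving: the high part of the address picks the
    module, so consecutive addresses stay inside one module.
    Returns (module, word offset inside the module)."""
    return addr // words_per_module, addr % words_per_module


def modules_touched(base, stride, length, m):
    """Modules visited by a vector operand under low-order interleaving."""
    return [low_order_map(base + i * stride, m)[0] for i in range(length)]


unit_stride = modules_touched(base=0, stride=1, length=8, m=8)
bad_stride = modules_touched(base=0, stride=8, length=4, m=8)
```

In this model a unit stride spreads eight accesses over eight different modules, while a stride equal to the number of modules sends every access to the same module, which serializes the accesses.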
  • 9. VECTOR PROCESSING PRINCIPLES
    • Relative Vector/Scalar Performance
      o Vector/scalar speed ratio, r
      o Vectorization ratio in program, f
      o Relative performance P is given by:
        • $P = \frac{1}{(1 - f) + f/r} = \frac{r}{(1 - f)\,r + f}$
      o When f is low, the speedup cannot be high even with a very high r
      o Limiting case:
        • P → 1 as f → 0
      o Maximum case:
        • P → r as f → 1
      o Powerful single-chip processors and multicore systems-on-a-chip provide high-performance computing (HPC) using MIMD and/or SPMD configurations with a large number of processors.
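
A quick numeric check of the relative-performance formula, as a small sketch; the function name and the sample values of f and r are chosen only for illustration.

```python
def relative_performance(f, r):
    """P = 1 / ((1 - f) + f / r), where f is the vectorization ratio
    (0 <= f <= 1) and r is the vector/scalar speed ratio (r > 0)."""
    return 1.0 / ((1.0 - f) + f / r)


# Evaluate P for a fast vector unit (r = 100) at several vectorization ratios.
for f in (0.0, 0.5, 0.9, 1.0):
    print(f"f = {f:.1f}, r = 100 -> P = {relative_performance(f, 100.0):.2f}")
```

With half of the program vectorized, P stays below 2 no matter how large r becomes, which is the point of the "low f" remark above.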
  • 10. COMPOUND VECTOR PROCESSING
    • Compound Vector Operations
      o Compound Vector Functions (CVFs)
        • Composite function of vector operations converted from a looping structure of linked scalar operations
      o CVF Example: The SAXPY (Single-precision A multiply X Plus Y) Code
        • For I = 1 to N
          o Load R1, X(I)
          o Load R2, Y(I)
          o Multiply R1, A
          o Add R2, R1
          o Store Y(I), R2
        • (End of Loop)
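
The SAXPY loop above can be mirrored step for step in Python and compared with its compound-vector form, Y = A × X + Y over the whole vector. This is only a sketch: `r1` and `r2` stand in for the scalar registers, and the list comprehension merely models what a single compound vector function would compute.

```python
def saxpy_scalar_loop(a, x, y):
    """Scalar loop: five linked scalar operations per element."""
    y = list(y)               # work on a copy so the caller's list is untouched
    for i in range(len(x)):
        r1 = x[i]             # Load R1, X(I)
        r2 = y[i]             # Load R2, Y(I)
        r1 = r1 * a           # Multiply R1, A
        r2 = r2 + r1          # Add R2, R1
        y[i] = r2             # Store Y(I), R2
    return y


def saxpy_cvf(a, x, y):
    """Compound vector function: the whole loop as one vector expression."""
    return [a * xi + yi for xi, yi in zip(x, y)]


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
loop_result = saxpy_scalar_loop(2.0, x, y)
cvf_result = saxpy_cvf(2.0, x, y)
```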
  • 11. COMPOUND VECTOR PROCESSING
    • One-dimensional CVF Examples
      o V1(I) = V2(I) + V3(I) × V4(I)
      o V1(I) = B(I) + C(I)
      o A(I) = V1(I) × S + B(I)
      o A(I) = V1(I) + B(I) + C(I)
      o A(I) = Q × V1(I) (R × B(I) + C(I)), etc.
    • Legend:
      o Vi(I) are vector registers
      o A(I), B(I), C(I) are vectors in memory
      o Q, R, S are scalars available from scalar registers or memory
  • 12. COMPOUND VECTOR PROCESSING
    • Vector Loops
      o Vector segmentation or strip-mining approach
      o Example
    • Vector Chaining
      o Example: SAXPY code
        • Limited chaining using only one memory-access pipe in the Cray-1
        • Complete chaining using three memory-access pipes in the Cray X-MP
    • Functional Unit Independence
    • Vector Recurrence
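
Strip-mining splits a long vector into segments no longer than the vector register length, then processes one segment per pass. The sketch below applies it to SAXPY; the function name and the default maximum vector length of 64 (the register length of the Cray machines named above) are illustrative choices.

```python
def strip_mined_saxpy(a, x, y, mvl=64):
    """Compute a * x + y in strips of at most mvl elements.
    Returns the result and the length of each strip processed."""
    n = len(x)
    out = list(y)
    strip_lengths = []
    for start in range(0, n, mvl):
        end = min(start + mvl, n)          # the last strip may be shorter
        # One "vector instruction" covers the whole strip.
        out[start:end] = [a * xi + yi
                          for xi, yi in zip(x[start:end], out[start:end])]
        strip_lengths.append(end - start)
    return out, strip_lengths


x = list(range(150))
y = [1] * 150
result, strips = strip_mined_saxpy(2, x, y)
```

For a 150-element vector and 64-element registers, the loop runs three times: two full strips and one remainder strip of 22 elements.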
  • 13. COMPOUND VECTOR PROCESSING
  • 14. COMPOUND VECTOR PROCESSING
  • 15. SIMD COMPUTER ORGANIZATIONS
    • SIMD Computer Variants
      o Array Processor
      o Associative Processor
    • SIMD Processor vs. SISD vs. Vector Processor Operation
      o Illustration: for (i = 0; i < 5; i++) a[i] = a[i] + 2;
      o Lockstep mode of operation in SIMD processor
      o Relative performance comparison
    • SIMD Implementation Models
      o Distributed Memory Model
        • E.g. Illiac IV
      o Shared Memory Model
        • E.g. BSP (Burroughs Scientific Processor)
  • 16. SIMD COMPUTER ORGANIZATIONS
  • 17. SIMD COMPUTER ORGANIZATIONS
  • 18. SIMD COMPUTER ORGANIZATIONS
    • SIMD Instructions
      o Scalar Operations
        • Arithmetic/Logical
      o Vector Operations
        • Arithmetic/Logical
      o Data Routing Operations
        • Permutations, broadcasts, multicasts, rotation and shifting
      o Masking Operations
        • Enable/Disable PEs
    • Host and I/O
    • Bit-slice and Word-slice Processing
      o WSBS, WSBP, WPBS, WPBP
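
The loop illustration a[i] = a[i] + 2 and the masking operations can be tied together in a toy model. It is an idealized sketch, assuming one processing element (PE) per array element and counting "steps" only to contrast sequential SISD execution with lockstep SIMD execution; the function names and the `mask` parameter are hypothetical.

```python
def sisd_add(a, k):
    """SISD: one element per step, so n elements take n steps."""
    a = list(a)
    steps = 0
    for i in range(len(a)):
        a[i] = a[i] + k
        steps += 1
    return a, steps


def simd_add(a, k, mask=None):
    """SIMD lockstep: every enabled PE applies the same instruction to its
    own element in a single step. Disabled PEs leave their element as is."""
    if mask is None:
        mask = [True] * len(a)             # all PEs enabled
    result = [x + k if enabled else x for x, enabled in zip(a, mask)]
    return result, 1


a = [1, 2, 3, 4, 5]
sisd_result, sisd_steps = sisd_add(a, 2)
simd_result, simd_steps = simd_add(a, 2)
masked_result, _ = simd_add(a, 2, mask=[True, False, True, False, True])
```

Both versions produce the same array. The difference in the model is that the SISD loop takes five steps while the SIMD version takes one, and the mask shows how disabling PEs restricts an instruction to a subset of the data.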