ARRAY PROCESSOR FEATURING
AN EFFECTIVE FIFO BASED DATA
STREAM MANAGEMENT
PROJECT INTERNAL GUIDE
Mrs. I. VATSALAPRIYA, M.E.
PROJECT MEMBERS:
S. SATHIYA SAINATHAN,
P. SRIBALAMURUGAN
SYNOPSIS
1. ABSTRACT
2. NEED FOR PARALLEL COMPUTING
3. INTRODUCTION TO PARALLEL PROCESSOR AND ITS FEATURES
4. ARRAY PROCESSOR
5. SYSTOLIC ARRAY PROCESSOR
6. BASE PAPER ARCHITECTURE FOR MATRIX CALCULATION
7. PROJECT THEME IMAGE ROTATION AND IMAGE TRANSPOSE
8. COMPARISON BETWEEN MATLAB AND ARRAY PROCESSOR
9. PROPOSED ARCHITECTURE
10. OUTPUT AND OTHER APPLICATIONS
11. CONCLUSION
ABSTRACT
• In array processors, data I/O management is the key to realizing
high-speed matrix operations that are often required in image
processing.
• In this project, we propose an array processor utilizing an
effective data I/O mechanism featuring external FIFOs.
• FIFOs are used as buffers to store initial matrix data and
partially processed results. Therefore, matrix operations,
including the algorithm to solve the Algebraic Path Problem
(APP), can be performed without any data I/Os.
• In addition, we can eliminate register files from the processing
elements (PEs) if we construct the PE array by controlling the
external FIFOs systematically and transferring the data from the
FIFOs to the PE array and vice versa.
• This enables us to simplify each PE structure and realize a large
array processor with limited hardware resources.
• The FIFOs themselves can be easily realized using conventional
discrete FIFO or memory chips.
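As a rough illustration of the buffering idea above, the following Python sketch models one external FIFO that first holds the initial matrix rows and then the partially processed rows between passes. The function name run_passes, the per-element operation pe_op, and the row-at-a-time transfer are assumptions made for illustration; the slides do not specify FIFO widths or control signals.

```python
from collections import deque


def run_passes(matrix, pe_op, passes):
    """Stream matrix rows through a row of PEs for a number of passes.

    The FIFO is the only storage: each PE applies pe_op to the element
    it receives and hands the result straight back to the FIFO, so no
    per-PE register file is modelled (illustrative assumption).
    """
    fifo = deque(tuple(row) for row in matrix)  # initial matrix data
    for _ in range(passes):
        for _ in range(len(fifo)):
            row = fifo.popleft()                        # oldest row leaves first
            fifo.append(tuple(pe_op(x) for x in row))   # partial result re-enters
    return [list(row) for row in fifo]


result = run_passes([[1, 2], [3, 4]], lambda x: x + 1, passes=2)
```

Because every pass pops and re-appends each row exactly once, the row order in the FIFO is preserved from pass to pass.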
Need for Parallel Computing
• Future development in nearly every field
depends on digital computing.
• Controlling applications by means of
digital circuits is simple and cost effective.
• An increase in complex computational
steps in digital processing results in
performance degradation.
• To address this problem, we adopt
a highly efficient architectural design
for parallel computing.
Parallel vs. Serial Computing
Serial Computing
• Traditionally, software has been written
for serial computation.
• It runs on a single computer having a
single Central Processing Unit (CPU).
• A problem is broken into a discrete series of
instructions.
• Instructions are executed one after another.
• Only one instruction may execute at any
moment in time.
Parallel Computing
• Parallel computing is the
simultaneous use of multiple compute
resources to solve a computational
problem.
• It runs using multiple CPUs.
• A problem is broken into discrete
parts that can be solved concurrently.
• Each part is further broken down into a
series of instructions.
• Instructions from each part execute
simultaneously on different CPUs.
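The decomposition described above can be sketched in Python as follows. The workload (a sum of squares), the function names, and the chunking scheme are all assumptions chosen for illustration. Note that in CPython a thread pool shows the structure of the split rather than a guaranteed speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_sum_of_squares(values):
    """Serial version: one instruction stream, one element at a time."""
    total = 0
    for v in values:
        total += v * v
    return total


def parallel_sum_of_squares(values, workers=4):
    """Break the problem into discrete parts and solve them concurrently."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(serial_sum_of_squares, parts))
    return sum(partials)  # combine the partial results


data = list(range(10))
serial_result = serial_sum_of_squares(data)
parallel_result = parallel_sum_of_squares(data)
```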
Features of Parallel Computing
• Multiple data items are processed simultaneously.
• It reduces the computation time.
• The cost of the extended architecture
is traded off to achieve
accuracy and speed of execution.
• Complexity is reduced.
• It offers many further advantages.
Array processor
• A multiprocessor composed of a set of
identical central processing units.
• A processor that is capable of performing
simultaneous computations on elements of
an array of data in some number of
dimensions.
• The CPUs act synchronously (in parallel) under
the control of a common unit.
• Designed specifically for matrix calculation.
Systolic Array Processor
• It is the existing processor architecture.
• A systolic array is a pipelined network
arrangement of processing units called
cells.
• It operates in parallel.
• Each cell computes and stores data
independently of the others.
• Cells consist of data processing units (DPUs).
• The DPUs are connected to each other in a
mesh-like arrangement.
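A minimal software sketch of the cell behaviour described above, assuming matrix multiplication as the workload (the slides do not name one). Each cell keeps its own running sum and receives operands from its left and upper neighbours once per clock tick. The function name systolic_matmul and the skewed input schedule are illustrative assumptions.

```python
def systolic_matmul(a, b):
    """Cycle-by-cycle model of an n x n mesh of cells computing a @ b."""
    n = len(a)
    acc = [[0] * n for _ in range(n)]    # running sum stored in each cell
    a_reg = [[0] * n for _ in range(n)]  # operand travelling left to right
    b_reg = [[0] * n for _ in range(n)]  # operand travelling top to bottom
    for t in range(3 * n - 2):           # ticks needed to drain the mesh
        new_a = [[0] * n for _ in range(n)]
        new_b = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:               # left edge: skewed feed of row i of a
                    k = t - i
                    new_a[i][j] = a[i][k] if 0 <= k < n else 0
                else:                    # take the left neighbour's operand
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:               # top edge: skewed feed of column j of b
                    k = t - j
                    new_b[i][j] = b[k][j] if 0 <= k < n else 0
                else:                    # take the upper neighbour's operand
                    new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b      # all cells latch on the same tick
        for i in range(n):
            for j in range(n):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


product = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The per-cell a_reg, b_reg and acc storage in this model corresponds to the register cost listed among the drawbacks of the systolic approach.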
Block diagram of systolic array
Drawbacks of systolic array
processor
• Expensive.
• Highly specialized for particular
applications.
• Difficult to build.
• Limited memory.
• A larger number of registers is required.
Features of array processor
• High-speed matrix operation.
• We can eliminate register files from
processing units.
• This is achieved here by means of FIFOs.
• Control and scalar type instructions are
executed in the control unit.
• Vector instructions are performed in the
processing elements.
Base paper architecture
• A 2D array processor architecture is
proposed that eliminates the use of an ALU
and external RAM, since all the calculations
can be performed by rotating and shifting
the matrix data.
• Consists of individual
Processing Elements.
• Supports a simple
instruction set.
• Avoids Algebraic Path
Problem.
Figure: 2D toroidal structure of our
proposed array processor
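To make the "rotating and shifting" on a toroidal array concrete, here is a small Python sketch in which every element moves to a neighbouring position and elements leaving one edge re-enter at the opposite edge. The function names and the choice of left/up directions are assumptions for illustration, not taken from the base paper.

```python
def shift_rows_left(matrix, steps=1):
    """Move every element `steps` positions left with wrap-around."""
    n = len(matrix[0])
    return [[row[(j + steps) % n] for j in range(n)] for row in matrix]


def shift_cols_up(matrix, steps=1):
    """Move every element `steps` positions up with wrap-around."""
    n = len(matrix)
    cols = len(matrix[0])
    return [[matrix[(i + steps) % n][j] for j in range(cols)] for i in range(n)]


rows_shifted = shift_rows_left([[1, 2, 3], [4, 5, 6]])
cols_shifted = shift_cols_up([[1, 2], [3, 4]])
```

Shifting by the full width (or height) returns the original matrix, which is the defining property of the wrap-around links.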
Our Objective
• The project aim is to rotate and transpose an
image, represented as a matrix of image
coefficients.
• To show how image rotation and transpose
work in both Matlab and the array processor.
• To show the difference in time and
registers required by comparing both
methods.
Image rotation in Matlab
• This is considered the normal method
of image rotation.
• It needs more clock cycles.
• It requires more memory.
• It needs more internal registers to store data.
• It is a time-consuming process.
Image rotation in Matlab
Time taken for rotation = 0.128728 seconds.
Example
• Taking the 2x2 matrix below, let us
calculate how much time and memory it
consumes in both systems.
• Aim: to achieve 90° rotation and transpose
using Matlab and the array processor.
location   [0]   [1]
[0]         1     2
[1]         3     4
Matlab algorithm for image
rotation
Required variables:
• Temporary variables: s, t
• Matrix locations: a[0][0], a[0][1], a[1][0], a[1][1]
• Total variables required: 6
Procedure for rotation:
• s = a[0][0], t = a[0][1];   (1st clock cycle)
• a[0][0] = a[1][0];          (2nd clock cycle)
• a[1][0] = a[1][1];          (3rd clock cycle)
• a[0][1] = s;                (4th clock cycle)
• a[1][1] = t;                (5th clock cycle)
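The five sequential steps above can be traced in Python as a sanity check. This is only a sketch: it counts one step per slide bullet, which is the slides' own cost model and an assumption rather than a measured cycle count, and the function name rotate_sequential is made up for illustration.

```python
def rotate_sequential(a):
    """Rotate a 2x2 matrix 90 degrees clockwise in place; return step count."""
    steps = 0
    s, t = a[0][0], a[0][1]  # step 1: save the top row in temporaries
    steps += 1
    a[0][0] = a[1][0]        # step 2
    steps += 1
    a[1][0] = a[1][1]        # step 3
    steps += 1
    a[0][1] = s              # step 4
    steps += 1
    a[1][1] = t              # step 5
    steps += 1
    return steps


image = [[1, 2], [3, 4]]
steps_used = rotate_sequential(image)
```

For the example matrix this leaves image as [[3, 1], [4, 2]] after five steps.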
Drawbacks in matlab
rotation
• More variables are required.
• It takes 5 clock cycles to rotate the
2x2 matrix.
• It takes about 0.128 s to rotate an image.
• More memory (registers) is required.
• From a design standpoint, more gates
are also needed.
Array processor algorithm
for image rotation
For the same example, the algorithm for
rotation in the array processor is:
• A[0][0] <- a[1][0];
• A[0][1] <- a[0][0];
• A[1][0] <- a[1][1];
• A[1][1] <- a[0][1];
All four transfers take place in the 1st clock cycle (PARALLEL).
(“No need of Temporary Variables”)
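The four simultaneous transfers above follow the pattern new[i][j] <- old[n-1-j][i], which can be sketched in Python for any n x n matrix. The snapshot copy below stands in for the hardware behaviour in which every PE reads its source before any PE is overwritten. The function name and the generalisation beyond 2x2 are assumptions for illustration.

```python
def rotate_parallel(a):
    """Model one parallel clock cycle of a 90-degree clockwise rotation."""
    n = len(a)
    old = [row[:] for row in a]  # all PEs read on the same clock edge
    # Every destination (i, j) takes the value from source (n-1-j, i).
    return [[old[n - 1 - j][i] for j in range(n)] for i in range(n)]


rotated_2x2 = rotate_parallel([[1, 2], [3, 4]])
rotated_3x3 = rotate_parallel([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

For n = 2 the index rule reproduces exactly the four transfers listed on the slide.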
Advantages of array processor
Rotation
• It takes only one clock cycle.
• It takes 150 ns to rotate the image.
• No need for temporary variables.
• Less memory (fewer registers).
• Fewer gates are required.
• The design is also simple.
Matlab algorithm for image
Transpose
Required variables:
• Temporary variables: s, t
• Matrix locations: a[0][0], a[0][1], a[1][0], a[1][1]
• Total variables required: 6
Procedure for transpose:
• s = a[1][0], t = a[1][1];   (1st clock cycle)
• a[0][0] = a[0][0];          (2nd clock cycle)
• a[1][0] = a[0][1];          (3rd clock cycle)
• a[0][1] = s;                (4th clock cycle)
• a[1][1] = t;                (5th clock cycle)
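A Python trace of the five transpose steps above, again counting one step per slide bullet (the slides' cost model, assumed here rather than measured). The function name transpose_sequential is illustrative. Note that the second step is a self-assignment, since diagonal elements do not move in a transpose.

```python
def transpose_sequential(a):
    """Transpose a 2x2 matrix in place following the slide; return step count."""
    steps = 0
    s, t = a[1][0], a[1][1]  # step 1: save the bottom row in temporaries
    steps += 1
    a[0][0] = a[0][0]        # step 2: diagonal element stays where it is
    steps += 1
    a[1][0] = a[0][1]        # step 3
    steps += 1
    a[0][1] = s              # step 4
    steps += 1
    a[1][1] = t              # step 5
    steps += 1
    return steps


image = [[1, 2], [3, 4]]
steps_used = transpose_sequential(image)
```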
Matlab algorithm for image
Transpose
Time taken for transpose = 0.082730 seconds
Drawbacks in matlab
Transpose
• It takes 5 clock cycles to transpose the
2x2 matrix.
• It takes about 0.0827 s to transpose an image.
• More memory (registers) is required.
• From a design standpoint, more gates
are also needed.
Array processor algorithm
for image transpose
For the same example, the algorithm for
transpose in the array processor is:
• A[0][0] <- a[0][0];
• A[0][1] <- a[1][0];
• A[1][0] <- a[0][1];
• A[1][1] <- a[1][1];
All four transfers take place in the 1st clock cycle (PARALLEL).
(“No need of Temporary Variables”)
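The sketch below contrasts the parallel transfer pattern new[i][j] <- old[j][i] with a naive element-by-element in-place update. It is meant to show why a sequential version needs temporaries while simultaneous reads do not. Both function names are assumptions for illustration.

```python
def transpose_parallel(a):
    """Model one parallel clock cycle: every PE reads, then every PE writes."""
    n = len(a)
    old = [row[:] for row in a]  # values seen by all PEs on the same clock edge
    return [[old[j][i] for j in range(n)] for i in range(n)]


def transpose_naive_in_place(a):
    """Sequential update without temporaries; overwrites values still needed."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            a[i][j] = a[j][i]
    return a


correct = transpose_parallel([[1, 2], [3, 4]])
broken = transpose_naive_in_place([[1, 2], [3, 4]])
```

In the naive version a[0][1] is overwritten before a[1][0] reads it, so the original value 2 is lost.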
Advantages of array processor
Transpose
• It takes only one clock cycle.
• It takes 100 ns to transpose the image.
• No need for temporary variables.
• Less memory (fewer registers).
• Fewer gates are required.
• The design is also simple.
Proposed architecture for
image rotation
• Internally, the PEs and FIFOs are nothing but
registers.
• They have no behaviour of their own; they simply obey the coded
program of the proposed system.
Proposed architecture for
image transpose
• Internally, the PEs and FIFOs are nothing but
registers.
• They have no behaviour of their own; they simply obey the coded
program of the proposed system.
Operation of proposed system
• The Rotate and Transpose commands are
activated.
• Rotation and transpose are each done in a
single clock cycle, synchronously.
• All the processing elements are capable of
reading as well as writing data.
• Read and write operations are performed
synchronously (in parallel).
• Buses and FIFOs between the PEs play a
major role in reducing the number of
registers.
Output of the rotated image
coefficients
Time taken for rotation = 150 ns.
Output of the transposed
image coefficients
Time taken for transposition = 100 ns.
Comparison between matlab
and array processor operations
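Using only the timings reported in the earlier slides, the ratio between the Matlab and array processor figures can be worked out as below. The helper name speedup is an assumption for illustration. The two sets of figures come from different platforms, so the ratios are indicative rather than a like-for-like benchmark.

```python
def speedup(software_seconds, hardware_seconds):
    """Ratio of the Matlab time to the array processor time."""
    return software_seconds / hardware_seconds


# Timings as reported in the slides.
MATLAB_ROTATION_S = 0.128728
MATLAB_TRANSPOSE_S = 0.082730
ARRAY_ROTATION_S = 150e-9    # 150 ns
ARRAY_TRANSPOSE_S = 100e-9   # 100 ns

rotation_speedup = speedup(MATLAB_ROTATION_S, ARRAY_ROTATION_S)
transpose_speedup = speedup(MATLAB_TRANSPOSE_S, ARRAY_TRANSPOSE_S)
```

With these inputs the rotation ratio is about 8.6 x 10^5 and the transpose ratio about 8.3 x 10^5.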
OTHER APPLICATIONS OF
ARRAY PROCESSOR
Source: https://computing.llnl.gov/tutorials/parallel_comp/
CONCLUSION
• In this comparison, image rotation and
transpose on the array processor proved far
faster than in Matlab (150 ns and 100 ns
versus 0.128728 s and 0.082730 s).
• In future, the number of registers used can
be reduced by using more buses in the PEs.
• Processing time can then also be
reduced by reducing the usage of
registers.
• From this project we have learnt one
aspect of chip design.