Sathya Final review

ARRAY PROCESSOR FEATURING
AN EFFECTIVE FIFO BASED DATA
STREAM MANAGEMENT
PROJECT INTERNAL GUIDE:
Mrs.I.VATSALAPRIYA.M.E.,
PROJECT MEMBERS:
S.SATHIYA SAINATHAN,
P.SRIBALAMURUGAN

SYNOPSIS
1. ABSTRACT
2. NEED FOR PARALLEL COMPUTING
3. INTRODUCTION TO PARALLEL PROCESSOR AND ITS FEATURES
4. ARRAY PROCESSOR
5. SYSTOLIC ARRAY PROCESSOR
6. BASE PAPER ARCHITECTURE FOR MATRIX CALCULATION
7. PROJECT THEME IMAGE ROTATION AND IMAGE TRANSPOSE
8. COMPARISON BETWEEN MATLAB AND ARRAY PROCESSOR
9. PROPOSED ARCHITECTURE
10. OUTPUT AND OTHER APPLICATIONS
11. CONCLUSION

ABSTRACT
• In array processors, data I/O management is the key to realizing
high-speed matrix operations that are often required in image
processing.
• In this project, we propose an array processor utilizing an
effective data I/O mechanism featuring external FIFOs.
• FIFOs are used as buffers to store initial matrix data and
partially processed results. Therefore, matrix operations,
including the algorithm to solve the Algebraic Path Problem
(APP), can be performed without any data I/Os.
• In addition, we can eliminate register files from the processing
elements (PEs) if we construct the PE array by controlling the
external FIFOs systematically and transferring the data from the
FIFOs to the PE array and vice versa.
• This enables us to simplify each PE structure and realize a large
array processor with limited hardware resources.
• The FIFOs themselves can be easily realized using conventional
discrete FIFO or memory chips.
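
The buffering idea above can be sketched in software. This is only an illustrative model, not the project's hardware design: the function name `stream_through_pes`, the one-FIFO-per-column layout, and the per-element operation `pe_op` are all assumptions made for the example. Each cycle, one element is popped from every FIFO into the PE row, processed, and pushed back, so the PEs themselves never need a register file.

```python
from collections import deque

def stream_through_pes(matrix, pe_op, passes=1):
    """Stream a matrix through a row of PEs using one external FIFO per column.

    Hypothetical model: the FIFOs hold the initial data and every partially
    processed result; the PEs hold nothing between cycles.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Load each column of the initial matrix into its own FIFO.
    fifos = [deque(row[c] for row in matrix) for c in range(n_cols)]
    for _ in range(passes * n_rows):
        row = [f.popleft() for f in fifos]   # FIFOs -> PE array
        out = [pe_op(x) for x in row]        # every PE works on its element
        for f, y in zip(fifos, out):
            f.append(y)                      # PE array -> FIFOs
    # After whole passes the FIFOs hold the rows in their original order.
    return [[f[i] for f in fifos] for i in range(n_rows)]

doubled = stream_through_pes([[1, 2], [3, 4]], lambda x: 2 * x)
print(doubled)
```

Running more than one pass feeds the partially processed results back through the same PEs, which is the role the abstract assigns to the FIFOs.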

Need for Parallel Computing
• Development in every future field
depends on digital computing.
• Controlling applications by means of
digital circuits is simple and cost effective.
• The increase in complex computational
steps in digital processing results in
performance degradation.
• To address this problem, we propose
a highly efficient architectural design
for parallel computing.

Parallel vs. Serial Computing
Serial Computing
• Traditionally, software has been written
for serial computation.
• It is run on a single computer having a
single Central Processing Unit (CPU).
• A problem is broken into a discrete series of
instructions.
• Instructions are executed one after another.
• Only one instruction may execute at any
moment in time.
Parallel Computing
• Parallel computing is the
simultaneous use of multiple compute
resources to solve a computational
problem.
• It is run using multiple CPUs.
• A problem is broken into discrete
parts that can be solved concurrently.
• Each part is further broken down into a
series of instructions.
• Instructions from each part execute
simultaneously on different CPUs.

Features of Parallel Computing
• It processes multiple data items simultaneously.
• It reduces the computation time.
• The cost of the extended architecture
is accepted as a trade-off for
accuracy and speed of execution.
• Complexity is reduced.
• It offers many other advantages.

Array processor
• A multiprocessor composed of a set of
identical central processing units.
• A processor that is capable of performing
simultaneous computations on the elements of
an array of data in some number of
dimensions.
• The CPUs act synchronously (in parallel) under
the control of a common unit.
• Designed specifically for matrix calculation.

Systolic Array Processor
• It is the existing processor.
• A systolic array is a pipe network
arrangement of processing units called
cells.
• It operates in parallel.
• Each cell computes and stores data
independently of the others.
• Cells consist of data processing units (DPUs).
• The DPUs are connected to each other in a
mesh-like arrangement.

Block diagram of systolic array

Drawbacks of systolic array processor
• Expensive.
• Highly specialized for particular
applications.
• Difficult to build.
• Limited memory.
• A larger number of registers is required.

Features of array processor
• High-speed matrix operation.
• We can eliminate register files from the
processing units.
• This is achieved here by means of FIFOs.
• Control and scalar type instructions are
executed in the control unit.
• Vector instructions are performed in the
processing elements.

Base paper architecture
• A design architecture for a 2D array processor is proposed that
eliminates the use of an ALU and external RAM, since all the
calculations can be performed by rotating and shifting the
matrix data.
• Consists of individual processing elements.
• Supports a simple instruction set.
• Can solve the Algebraic Path Problem (APP).
Figure: 2D toroidal structure of our proposed array processor

Our Objective
• The aim of the project is to rotate and transpose an
image held as a matrix of image
coefficients.
• To study how ‘image rotation and transpose’ work
in both Matlab and the array processor.
• To show the difference in ‘time and
registers required’ between the two
methods.

Image rotation in Matlab
• This is considered the normal method
of image rotation.
• More clock cycles are needed.
• More memory is required.
• More internal registers are needed to store data.
• It is a time-consuming process.

Image rotation in Matlab
Time taken for rotation = 0.128728 seconds.

Example
• Taking the 2x2 matrix below, let us
calculate how much time and memory each
of the two systems consumes.
• Aim: to achieve 90° rotation and transpose
using Matlab and the array processor.
location [0] [1]
[0] 1 2
[1] 3 4

Matlab algorithm for image rotation
Required variables:
• Temporary variables: s, t
• Matrix locations: (a[0][0], a[0][1], a[1][0], a[1][1])
• Required variables: 6
Procedure for rotation:
• s=a[0][0], t=a[0][1]; (1st clock cycle)
• a[0][0]=a[1][0]; (2nd clock cycle)
• a[1][0]=a[1][1]; (3rd clock cycle)
• a[0][1]=s; (4th clock cycle)
• a[1][1]=t; (5th clock cycle)
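
The five-step procedure can be traced in a few lines of Python. This is a minimal sketch of the slide's steps, not the Matlab code that was actually timed; the function name and the step counter are assumptions added for illustration.

```python
def rotate_2x2_sequential(a):
    """Rotate a 2x2 matrix 90 degrees clockwise in place, one step at a time."""
    steps = 0
    s, t = a[0][0], a[0][1]   # step 1: save the top row in temporaries s, t
    steps += 1
    a[0][0] = a[1][0]         # step 2
    steps += 1
    a[1][0] = a[1][1]         # step 3
    steps += 1
    a[0][1] = s               # step 4
    steps += 1
    a[1][1] = t               # step 5
    steps += 1
    return a, steps

rotated, n_steps = rotate_2x2_sequential([[1, 2], [3, 4]])
print(rotated, n_steps)
```

The temporaries are needed because each assignment overwrites a value that a later assignment still has to read.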

Drawbacks in Matlab rotation
• More variables are required.
• It takes 5 clock cycles to rotate the
2x2 example matrix.
• It takes 0.128 s to rotate an image.
• More memory (registers) is required.
• From a design point of view, more gates
are also needed.

Array processor algorithm for image rotation
For the same example, the algorithm for
rotation in the array processor is:
• a[0][0]<-a[1][0];
• a[0][1]<-a[0][0];
• a[1][0]<-a[1][1];
• a[1][1]<-a[0][1];
All four transfers take place in the 1st clock cycle (PARALLEL).
(“No need of Temporary Variables”)
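
A rough software model of this single-cycle rotation follows. It assumes that every element reads the old value of its source before any element is overwritten, as clocked registers do; the generalisation to an n x n matrix and the function name are additions for illustration, not part of the slides.

```python
def rotate_parallel(a):
    """Rotate an n x n matrix 90 degrees clockwise in one synchronous step.

    Every destination (r, c) takes the OLD value at (n-1-c, r); the result is
    built from the untouched input, so no temporary variables are needed.
    """
    n = len(a)
    return [[a[n - 1 - c][r] for c in range(n)] for r in range(n)]

print(rotate_parallel([[1, 2], [3, 4]]))
```

For n = 2 the index rule reproduces the four transfers listed above, for example destination (0, 1) reads source (0, 0).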

Advantages of array processor rotation
• It takes only one clock cycle.
• It takes 150 ns to rotate the image.
• No need for temporary variables.
• Less memory (fewer registers).
• Fewer gates are required.
• The design is also simple.

Transpose
Required variables:
• Temporary variables: s, t
• Matrix locations: (a[0][0], a[0][1], a[1][0], a[1][1])
• Required variables: 6
Procedure for transpose:
• s=a[1][0], t=a[1][1]; (1st clock cycle)
• a[0][0]=a[0][0]; (2nd clock cycle)
• a[1][0]=a[0][1]; (3rd clock cycle)
• a[0][1]=s; (4th clock cycle)
• a[1][1]=t; (5th clock cycle)

Transpose
Time taken for transpose = 0.082730 seconds

Drawbacks in Matlab transpose
• It takes 5 clock cycles to transpose the
2x2 example matrix.
• It takes 0.0827 s to transpose an image.
• More memory (registers) is required.
• From a design point of view, more gates
are also needed.

Array processor algorithm for image transpose
For the same example, the algorithm for
transpose in the array processor is:
• a[0][0]<-a[0][0];
• a[0][1]<-a[1][0];
• a[1][0]<-a[0][1];
• a[1][1]<-a[1][1];
All four transfers take place in the 1st clock cycle (PARALLEL).
(“No need of Temporary Variables”)
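
The same single-cycle behaviour can be modelled as one clock edge on which every PE latches the old value of a source PE. The names `clock_tick` and `transpose_src` are hypothetical and the model ignores the buses and FIFOs; it only illustrates why no temporaries are required.

```python
def clock_tick(regs, source_of):
    """One synchronous clock edge: each PE latches the old value of its source."""
    old = [row[:] for row in regs]   # values held before the clock edge
    n = len(regs)
    new = [[None] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            sr, sc = source_of(r, c)
            new[r][c] = old[sr][sc]  # all reads use the pre-edge snapshot
    return new

def transpose_src(r, c):
    """For a transpose, the PE at (r, c) reads from the PE at (c, r)."""
    return (c, r)

print(clock_tick([[1, 2], [3, 4]], transpose_src))
```

Swapping in a different `source_of` mapping gives other permutations, for example `lambda r, c: (1 - c, r)` for the 2x2 rotation.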

Advantages of array processor transpose
• It takes only one clock cycle.
• It takes 100 ns to transpose the image.
• No need for temporary variables.
• Less memory (fewer registers).
• Fewer gates are required.
• The design is also simple.

Proposed architecture for image rotation
• Internally, the PEs and FIFOs are nothing but
registers.
• They have no fixed behaviour of their own; they simply obey the coded
program of the proposed system.

Proposed architecture for image transpose
• Internally, the PEs and FIFOs are nothing but
registers.
• They have no fixed behaviour of their own; they simply obey the coded
program of the proposed system.

Operation of proposed system
• The Rotate and Transpose commands are
activated.
• The rotation and the transpose are each done
synchronously in a single clock cycle.
• All the processing elements are capable of
reading as well as writing data.
• Read and write operations are performed
synchronously (in parallel).
• The buses and FIFOs between the PEs play a
major role in reducing the number of
registers.

Output of the rotated image coefficients
Time taken for rotation = 150 ns.

Output of the transposed image coefficients
Time taken for transposition = 100 ns.

Comparison between Matlab and array processor operations
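
As a rough check on the figures reported in these slides, the ratio between the Matlab and array processor times can be worked out directly. This is plain arithmetic on the quoted numbers; the variable names are illustrative, and the ratio compares a software run with a hardware timing, so it should be read as indicative only.

```python
# Times reported in the slides, in seconds.
matlab_s = {"rotation": 0.128728, "transpose": 0.082730}
array_s = {"rotation": 150e-9, "transpose": 100e-9}

# Ratio of Matlab time to array processor time for each operation.
ratios = {op: matlab_s[op] / array_s[op] for op in matlab_s}

for op, ratio in ratios.items():
    print(f"{op}: Matlab time is about {ratio:,.0f} times the array processor time")
```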

OTHER APPLICATIONS OF ARRAY PROCESSOR
Source: https://computing.llnl.gov/tutorials/parallel_comp/

CONCLUSION
• Image processing on the array
processor is thus shown to be more efficient
than the Matlab implementation it was compared with.
• In future, the number of registers used can
be reduced by using more buses in the PEs.
• The processing time can also be
reduced by reducing the usage of
registers.
• From this project we have learnt one
aspect of chip design.
