2016.04.21 - State of the Art

Politecnico di Milano
Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)
XOHW16 Meeting
HAMS project
Chiara Gatti
chiara1.gatti@mail.polimi.it
Guido Lanfranchi
guido2.lanfranchi@mail.polimi.it
STATE OF THE ART
April 21st, 2016
NECST Lab, Politecnico di Milano
Credits: Shahriar Emil from the Noun Project

State of the Art
- Matlab HDL Coder
- HW matrix inversion

Matlab HDL Coder
- Matrices cannot be passed directly as I/O (but can be managed internally)
- Requires fixed-point conversion (not directly available for the functions «inv» and «pinv»)
- Requires HW-adapted algorithms (e.g. CORDIC)
→ not trivial!

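To illustrate why fixed-point, HW-adapted algorithms such as CORDIC are not trivial to write, here is a minimal Python sketch of rotation-mode CORDIC computing sine and cosine with integer shifts and adds only. The word length (`FRAC_BITS`), the iteration count (`N_ITER`) and the function name are assumptions chosen for illustration; this is not code generated by, or taken from, Matlab HDL Coder.

```python
import math

FRAC_BITS = 16            # assumed fixed-point format: 16 fractional bits
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0
N_ITER = 16               # assumed number of CORDIC iterations

# Lookup table of arctan(2^-i), quantized to the fixed-point format.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_ITER)]

# Each micro-rotation scales the vector; pre-multiply by the inverse gain K.
K = 1.0
for i in range(N_ITER):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
K_FIXED = round(K * ONE)


def cordic_sin_cos(angle_rad):
    """Return (sin, cos) of angle_rad, valid for |angle_rad| <= pi/2.

    The loop uses only integer shifts, additions and subtractions,
    which is what makes CORDIC attractive on hardware.
    """
    x, y = K_FIXED, 0
    z = round(angle_rad * ONE)          # residual angle in fixed point
    for i in range(N_ITER):
        if z >= 0:                      # rotate counter-clockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:                           # rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y / ONE, x / ONE             # convert back to floating point


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, math.sin(0.5))
    print(c, math.cos(0.5))
```

The accuracy is bounded by the chosen word length and iteration count, which is the kind of trade-off a fixed-point conversion has to settle explicitly.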
HW matrix inversion
- HW Devices
- Applicative domains
- Algorithms

Hardware Devices (*)
Vendor share: Xilinx 84%, Altera 11%, other 5%
- Virtex II
- Virtex 4 FXGO
- Virtex 5
- Virtex 7
- RC-1000
(*) Data extracted from 27 papers related to our topic. References at the end.

Applicative domains
APPLICATIONS
- Digital signal processing (DSP)
  - Image processing
  - Communications (tele, radio, wireless…)
  - Data detection
- Pure maths
- Other
- Simulations
Percentages shown in the successive slide builds: 75%, 14%, 1%

Algorithms
Moore-Penrose Pseudo Inverse*
- SVD method°
- Greville’s algorithm
- Full rank QR factorization
SVD method: let A = U*Σ*V’; then pinv(A) = V*pinv(Σ)*U’
* Courrieu P., «Fast Computation of Moore-Penrose Inverse Matrices», Neural Information Processing, 2005
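The slide leaves pinv(Σ) implicit. As a worked note, assuming a real m x n matrix A of rank r (so that V’ is the plain transpose), the standard definition and a one-line check of the first Moore-Penrose condition can be written as:

```latex
\begin{align*}
A &= U \Sigma V^{\top}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0) \in \mathbb{R}^{m \times n},
\quad \sigma_1 \ge \dots \ge \sigma_r > 0 \\
\operatorname{pinv}(\Sigma) &= \operatorname{diag}\!\left(\tfrac{1}{\sigma_1}, \dots, \tfrac{1}{\sigma_r}, 0, \dots, 0\right) \in \mathbb{R}^{n \times m} \\
\operatorname{pinv}(A) &= V \, \operatorname{pinv}(\Sigma) \, U^{\top} \\
A \, \operatorname{pinv}(A) \, A &= U \Sigma \, \operatorname{pinv}(\Sigma) \, \Sigma V^{\top} = U \Sigma V^{\top} = A
\end{align*}
```

Only the nonzero singular values are inverted, so the hard part of the computation is the SVD itself.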

Algorithms
- SVD method°
- Rank Decomposition QR Method
QR algorithm: computationally efficient (Hemkumar, "A systolic VLSI architecture for complex SVD", 1992)
Jacobi method: more accurate, parallelism (Luk, Park, "A proof of convergence for two parallel Jacobi SVD algorithms", 2002)
° Rahmati et al., “FPGA Based Singular Value Decomposition for Image Processing Applications”, 2008
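As a rough illustration of the Jacobi approach, the following Python sketch computes singular values with a sequential one-sided (Hestenes) Jacobi iteration. It is an assumption-laden software sketch, not the systolic or parallel orderings studied in the cited papers; the function name, the sweep limit and the tolerance are hypothetical choices.

```python
import math


def jacobi_singular_values(a, sweeps=30, tol=1e-12):
    """Singular values of matrix `a` (list of rows), in descending order.

    One-sided Jacobi: apply plane rotations to pairs of columns until all
    columns are mutually orthogonal; the column norms are then the
    singular values.
    """
    m, n = len(a), len(a[0])
    u = [row[:] for row in a]               # working copy, input is untouched
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(u[i][p] * u[i][p] for i in range(m))
                beta = sum(u[i][q] * u[i][q] for i in range(m))
                gamma = sum(u[i][p] * u[i][q] for i in range(m))
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(m):
                    up, uq = u[i][p], u[i][q]
                    u[i][p] = c * up - s * uq
                    u[i][q] = s * up + c * uq
        if not rotated:                      # converged: no rotation in a full sweep
            break
    values = [math.sqrt(sum(u[i][j] ** 2 for i in range(m))) for j in range(n)]
    return sorted(values, reverse=True)


if __name__ == "__main__":
    print(jacobi_singular_values([[1.0, 1.0], [0.0, 1.0]]))
```

Rotations on disjoint column pairs are independent of each other, which is where the parallelism noted on the slide comes from.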

Some results
«Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices», Ledesma-Carrillo et al., 2011

SVD computation of a 32x127 matrix: the table shows the singular values with the minimum and maximum estimation errors, and the elapsed time for the software and hardware implementations.

Singular Value   Matlab*     FPGA         % error
σ1               2.6603      2.7500       3.3718
σ2               2.3113      2.3125       0.0519
Elapsed Time     2.7141 s    24.3143 ms

* Matlab 7.3.0.267 utilizing 2.4GHz Intel Core Duo Processor
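The figures in the table can be cross-checked with a few lines of Python. This sketch assumes the "% error" column is the relative error of the FPGA value with respect to the Matlab value; the variable and function names are illustrative only.

```python
def percent_error(reference, measured):
    """Relative error of `measured` with respect to `reference`, in percent."""
    return abs(measured - reference) / reference * 100.0


# Values copied from the table (Ledesma-Carrillo et al., 2011).
matlab = {"sigma1": 2.6603, "sigma2": 2.3113}
fpga = {"sigma1": 2.7500, "sigma2": 2.3125}

errors = {name: percent_error(matlab[name], fpga[name]) for name in matlab}

# Elapsed times: 2.7141 s in software, 24.3143 ms in hardware.
speedup = 2.7141 / 24.3143e-3

if __name__ == "__main__":
    for name, err in errors.items():
        print(f"{name}: {err:.4f} %")
    print(f"speedup: {speedup:.1f}x")
```

Under that assumption the computed errors match the table, and the elapsed times correspond to a speedup of roughly two orders of magnitude.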

Resource utilization of the proposed FPGA-based SVD computation unit for the 32x127 case study matrix:

Resource             Xilinx Spartan 3 (3S1000ft256-4)   Altera Cyclone II (EP2C35F672C6)
Programmable Logic   78%                                14%
Memory               100%                               75%
Multipliers          100%                               39%
Max. Op. Freq.       57.981 MHz                         65.928 MHz

o Before this work:
• non-symmetric matrices up to 8x8
• larger symmetric matrices
o After this work:
• large m x n matrices…
• but up to 32x127

Our contribution
vs Matlab HDL Coder:
• Management of the whole interface
• No need to write HDL-friendly Matlab code (only the function)
vs HW matrix inversion:
• Applicative domain: fluid dynamics simulation of an oxygenator for ECC
• Management of larger matrices (up to 8000x8000) through
  (i) strong parallelism
  (ii) streaming in data transfer
  (iii) Xilinx Virtex 7 VC707
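To give a sense of scale for the 8000x8000 target, this short Python sketch computes the raw size of one such matrix. The element widths (4 bytes for single precision, 8 bytes for double precision) are assumptions; the slides do not state the number format used.

```python
def matrix_bytes(rows, cols, bytes_per_element):
    """Raw storage needed for a dense rows x cols matrix."""
    return rows * cols * bytes_per_element


N = 8000
elements = N * N                       # number of matrix entries
single_bytes = matrix_bytes(N, N, 4)   # assumed 32-bit elements
double_bytes = matrix_bytes(N, N, 8)   # assumed 64-bit elements

if __name__ == "__main__":
    print(f"{elements:,} elements")
    print(f"{single_bytes / 1e6:.0f} MB at 4 bytes per element")
    print(f"{double_bytes / 1e6:.0f} MB at 8 bytes per element")
```

At hundreds of megabytes per matrix, moving the data in a streaming fashion rather than in one block is a natural design choice.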

HAMS project
Contact us!
You can find us…
hams.necst@gmail.com
chiara1.gatti@mail.polimi.it
guido2.lanfranchi@mail.polimi.it
www.facebook.com/hams.project
https://twitter.com/HAMS_project
http://www.slideshare.net/HAMSproject
https://www.youtube.com/channel/UCaovqRpUc7D_Uf2WJHL0rvA
ANY QUESTIONS?

References
[1] Wang et al., “A CORDIC-Based Dynamically Reconfigurable FPGA Architecture for Signal Processing Algorithms”, 2008
[2] Burian et al., “A Fixed-Point Implementation of Matrix Inversion Using Cholesky Decomposition”, 2004
[3] Bigdeli et al., “A New Pipelined Systolic Array-Based Architecture for Matrix Inversion in FPGAs with Kalman Filter Case Study”, 2005
[4] Edmann et al., “A Scalable Pipelined Complex Valued Matrix Inversion Architecture”, 2005
[5] Garcia et al., “A Suitable FPGA Implementation of Floating-Point Matrix Inversion Based on Gauss-Jordan Elimination”, 2011
[6] Ahmedsaid et al., “Accelerating SVD on Reconfigurable Hardware for Image Denoising”, 2004
[7] Kumar et al., “An Approach to Design a Matrix Inversion HW Module using FPGA”, 2014
[8] Irturk et al., “An Efficient FPGA Implementation of Scalable Matrix Inversion Core using QR Decomposition”, 2009
[9] Norton et al., “An Evaluation of the Xilinx Virtex-4 FPGA for On-Board Processing in an Advanced Imaging System”, 2009
[10] Irturk et al., “An FPGA Design Space Exploration Tool for Matrix Inversion Architectures”, 2008
[11] Ma et al., “An FPGA-based Singular Value Decomposition Processor”, 2006
[12] Wu et al., “Approximate Matrix Inversion for High-Throughput Data Detection in the Large-Scale MIMO Uplink”, 2013
[13] Irturk et al., “Automatic Generation of Decomposition based Matrix Inversion Architectures”, 2008
[14] Szekowka et al., “CORDIC and SVD Implementation in Digital Hardware”, 2010
[15] Sergiyenko et al., “Error-Free Computation of Inverse Matrices in FPGA”, 2013
[16] Rahmati et al., “FPGA Based Singular Value Decomposition for Image Processing Applications”, 2008
[17] Grammenos et al., “FPGA Design of a Truncated SVD Based Receiver for the detection of SEFDM Signals”, 2011
[18] Karkooti et al., “FPGA Implementation of Matrix Inversion Using QRD-RLS Algorithm”, 2005
[19] Blace et al., “High level Prototyping and FPGA Implementation of the Orthogonal Matching Pursuit Algorithm”, 2012
[20] Ahmedsaid et al., “Improved SVD Systolic Array and Implementation on FPGA”, 2003
[21] S. Hu and Q. Yan, “Inversion of Vandermonde Matrices in FPGAs”, 2004
[22] Ohta et al., “Matrix Decomposition Suitable for FPGA Implementation of N-continuous OFDM”, 2014
[23] Chisty et al., “Matrix Inversion Using QR Decomposition by Parabolic Synthesis”, 2012
[24] Ma et al., “QR Decomposition-Based Matrix Inversion for High Embedded MIMO Receivers”, 2011
[25] Wernke et al., “Real-Time Data Processing for an Advanced Imaging System Using the Xilinx Virtex-5 FPGA”, 2009
[26] Ledesma-Carrillo et al., “Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices”, 2011
[27] Wang et al., “Singular Value Decomposition Hardware for MIMO - State of the Art and Custom Design”, 2010
