SlideShare a Scribd company logo
1 of 9
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
An Efficient Fault-Tolerance Design for Integer Parallel Matrix–Vector
Multiplications
Abstract:
Parallel matrix processing is a typical operation in many systems, and in particular
matrix–vector multiplication (MVM) is one of the most common operations in the
modern digital signal processing and digital communication systems. This paper proposes
a fault tolerant design for integer parallel MVMs. The scheme combines ideas from error
correction codes with the self-checking capability of MVM. Field-programmable gate
array evaluation shows that the proposed scheme can significantly reduce the overheads
compared to the protection of each MVM on its own. Therefore, the proposed technique
can be used to reduce the cost of providing fault tolerance in practical implementations.
Software Implementation:
 Modelsim
 Xilinx 14.2
Existing System:
Matrix–vector multiplication (MVM) is a typical computation in domain transformation,
projection, or feature extraction. With increasing demand for high performance or
throughput, parallel computing is becoming more important in high-performance
computing systems, cloud computing systems, and communication systems. For example,
parallel matrix processing with a common input vector (parallel MVMs) is performed for
pre coding in multiuser multi-in multi-out (MIMO) (especially large-scale MIMO) and
high-performance low density parity check decoders. Parallel processing may run on a
large number of processing units, e.g., GPUs or distributed systems, and experiments
show that soft errors can affect the outputs. In this ABFT design, the two input matrices
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
are changed to a “row checksum” matrix and a “column checksum” matrix, respectively,
and single errors in the resulting “full checksum” matrix can be detected and corrected
based on its row checksum and column checksum.
Since the row checksum does not exist for an input vector, such ABFT scheme can only
be used for error detection in MVM. In the ABFT schemes proposed , checksum
technique is used to locate the affected result element or blocks of result elements in
MVM, and partial re computation is used for error correction. Such schemes are suitable
for the cases in which the delay introduced by re computation is acceptable. But this is
not always the case, especially in high-throughput communication receivers where the
inputs arrive in real time from A/D converters. A general method to protect linear
dynamic systems was proposed. Partof that system model can be extracted as an MVM. I
n a method to protect parallel digital filters was proposed. The scheme is based on
considering each filter as a bit in an error correction code (ECC) so that parity check
filters are added and used to detect and correct errors. The schemes in share a similar
approach in that redundant rows are added to the processing matrix by combination (or
coding) of the original rows of the matrix so that the relations among the multiple outputs
can be used to locate and correct the erroneous elements. However, this approach only
works for single MVM. In the protection of parallel fast Fourier transform (FFT)
operations on different input vectors was studied. In that case, a self checking property of
the FFT was combined with the ECC method to reduce the overhead required for
protection. we proposed an idea based on ECC and MVM self-checking (checksum) for
error protection of parallel MVMs. We assume real-time applications in which integer
vectors come from ADCs with a fixed timing and processing is synchronous and
repetitive, so a re computation is not tolerable.
This paper is an extension of that two page conference paper, and provides new results
based on the FGPA implementation. To the best of the authors’ knowledge, no other
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
fault-tolerant design for parallel MVMs has been proposed. In the proposed structure, a
“detection matrix” D and a “sum matrix” A are added for fault tolerance. D is generated
by performing ECC to the checksum row of each matrix, and A is a combination of all
matrices under protection. By comparing the results of the original MVMs and that of the
check matrix, the fault can be located. Then, the erroneous output vector can be
recovered by subtracting the correct output vectors from the sum output vector.
Disadvantages:
 Overheads is not reduced
 Cost is high
 Errors are not identified easily
Proposed System:
Fault tolerance design for parallel matrix processing
A. Basic Structure The basic structure of the proposed fault-tolerant system for parallel
MVMs is shown in Fig. 1. From Fig. 1, we see that the original P matrices remain
unchanged, and the redundant part includes two matrices (regardless of the number of
parallel matrices under protection):
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
Fig. 1. Fault-tolerant structure for parallel MVMs
detection matrix and sum matrix . By comparing the P outputs
with the checking result the faulty vector
can be detected, and then the correct output vector , can be obtained by subtracting
the correct vectors among from the “sum output vector” . The
number of rows of D (Rp) is determined by the number of parallel matrices (P). When
considering the case that only one MVM may fail, a Hamming code can be used for
protection, and the relationship between P and Rp would be
On the other hand, in our proposed technique (Fig. 1), a branch is an entire MVM
processing.
Generation of Detection Matrix and SUM Matrix
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
The sum matrix is just the direct sum of the original matrix
For the detection matrix, let us assume P = 4 to facilitate the illustration of the basic idea,
and assuming only one branch output may contain errors. First, a checksum row for each
matrix can be generated as
Then, the rows of detection D can be generated ,Hamming code as
Then, the error detection matrix can be defined as matrix as
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
if there are no errors, we should have
where “sum(x)” performs addition of all items within vector x, and
In (12), the result of is an integer vector, and sum(
produces an integer scalar. In summary, if we denote
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
Fig. 2. Implementation structure of MVM (N = 20)
error detection of a single branch can also be done. After identifying the failing branch,
the corresponding output vector can be recovered by subtracting the right output vectors
from the sum output vector
Since the error detection is based on the sum of the output vectors, and the whole vector
could be recovered during the correction, the proposed scheme could deal with multiple
errors in a single output vector
Advantages:
 Overheads are reduced
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
 Cost is low
 Error can be easily identified by error detection method
References:
[1] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing. Englewood Cliffs, NJ, USA:
Prentice-Hall, 1989.
[2] J. S. Lim, Two-Dimensional Digital Signal Processing. Englewood Cliffs, NJ, USA: Prentice-Hall,
1989.
[3] A. Grama, A. Gupta, G. Karypis, and V. Kumar, Introduction to Parallel Computing, 2nd ed.
London, U.K.: Pearson, Jan. 2003.
[4] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. Hoboken, NJ,
USA: Wiley, 1999.
[5] S. Roger, C. Ramiro, A. Gonzalez, V. Almenar, and A. M. Vidal, “Fully parallel GPU
implementation of a fixed-complexity soft-output MIMO detector,” IEEE Trans. Veh. Technol., vol. 61,
no. 8, pp. 3796–3800, Oct. 2012.
[6] Q. Qi and C. Chakrabarti, “Parallel high throughput soft-output sphere decoder,” in Proc. IEEE
Workshop Signal Process. Syst., Oct. 2010, pp. 174–179.
[7] M. Wu, C. Dick, and J. R. Cavallaro, “Improving MIMO sphere detection through antenna detection
order scheduling,” in Proc. SDR Forum, 2011, pp. 280–284.
[8] G. Wang, M. Wu, Y. Sun, and J. R. Cavallaro, “A massively parallel implementation of QC-LDPC
decoder on GPU,” in Proc. IEEE 9th Symp. Appl. Specific Process. (SASP), Jun. 2011, pp. 82–85.
[9] G. Wang, H. Shen, B. Yin, M. Wu, Y. Sun, and J. R. Cavallaro, “Parallel nonbinary LDPC decoding
on GPU,” in Proc. ASILOMAR Conf., Nov. 2012, pp. 1277–1281.
[10] I. S. Haque and V. S. Pande, “Hard data on soft errors: A largescale assessment of real-world error
rates in GPGPU,” in Proc. 10th IEEE/ACM Int. Conf. Cluster, Cloud Grid Comput. (CCGrid), May
2010, pp. 691–696.
NXFEE INNOVATION
(SEMICONDUCTOR IP &PRODUCT DEVELOPMENT)
(ISO : 9001:2015Certified Company),
# 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam,
Pondicherry– 605004, India.
Buy Project on Online :www.nxfee.com | contact : +91 9789443203 |
email : nxfee.innovation@gmail.com
_________________________________________________________________
[11] P. Rech, C. Aguiar, C. Frost, and L. Carro, “An efficient and experimentally tuned software-based
hardening strategy for matrix multiplication on GPUs,” IEEE Trans. Nucl. Sci., vol. 60, no. 4, pp. 2797–
2804, Aug. 2013.
[12] K.-H. Huang and J. A. Abraham, “Algorithm-based fault tolerance for matrix operations,” IEEE
Trans. Comput., vol. C-33, no. 6, pp. 518–528, Jun. 1984.
[13] C. Ding, C. Karlsson, H. Liu, T. Davies, and Z. Chen, “Matrix multiplication on GPUs with on-line
fault tolerance,” in Proc. IEEE 9th Int. Symp. Parallel Distrib. Process. Appl., May 2011, pp. 311–317.

More Related Content

Similar to An efficient fault tolerance design for integer parallel matrix vector

Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...Nxfee Innovation
 
Approximate sum of-products designs based on distributed arithmetic
Approximate sum of-products designs based on distributed arithmeticApproximate sum of-products designs based on distributed arithmetic
Approximate sum of-products designs based on distributed arithmeticNxfee Innovation
 
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...Nxfee Innovation
 
A reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applicationsA reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applicationsNxfee Innovation
 
A high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolutionA high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolutionNxfee Innovation
 
A closed form expression for minimum operating voltage of cmos d flip-flop
A closed form expression for minimum operating voltage of cmos d flip-flopA closed form expression for minimum operating voltage of cmos d flip-flop
A closed form expression for minimum operating voltage of cmos d flip-flopNxfee Innovation
 
An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...Nxfee Innovation
 
Product defect detection based on convolutional autoencoder and one-class cla...
Product defect detection based on convolutional autoencoder and one-class cla...Product defect detection based on convolutional autoencoder and one-class cla...
Product defect detection based on convolutional autoencoder and one-class cla...IAESIJAI
 
NaveadKazi_resume
NaveadKazi_resumeNaveadKazi_resume
NaveadKazi_resumeNavead Kazi
 
Design and fpga implementation of a reconfigurable digital down converter for...
Design and fpga implementation of a reconfigurable digital down converter for...Design and fpga implementation of a reconfigurable digital down converter for...
Design and fpga implementation of a reconfigurable digital down converter for...Nxfee Innovation
 
Saravanan_QA_Automation & Manual Testing_2Years-Exp
Saravanan_QA_Automation & Manual Testing_2Years-ExpSaravanan_QA_Automation & Manual Testing_2Years-Exp
Saravanan_QA_Automation & Manual Testing_2Years-ExpSaravanan Sangapillai
 
Quality management-mc qs
Quality management-mc qsQuality management-mc qs
Quality management-mc qsVipul Kulkarni
 
CV Wasif Sayed 10June2016
CV Wasif Sayed 10June2016CV Wasif Sayed 10June2016
CV Wasif Sayed 10June2016Wasif Sayed
 
CV Wasif Sayed 08June2016
CV Wasif Sayed 08June2016CV Wasif Sayed 08June2016
CV Wasif Sayed 08June2016Wasif Sayed
 
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...IRJET Journal
 
Combating data leakage trojans in commercial and asic applications with time ...
Combating data leakage trojans in commercial and asic applications with time ...Combating data leakage trojans in commercial and asic applications with time ...
Combating data leakage trojans in commercial and asic applications with time ...Nxfee Innovation
 
Web Development Using Cloud Computing and Payment Gateway
Web Development Using Cloud Computing and Payment GatewayWeb Development Using Cloud Computing and Payment Gateway
Web Development Using Cloud Computing and Payment GatewayIRJET Journal
 
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching schemeA 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching schemeNxfee Innovation
 

Similar to An efficient fault tolerance design for integer parallel matrix vector (20)

Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...Vector processing aware advanced clock-gating techniques for low-power fused ...
Vector processing aware advanced clock-gating techniques for low-power fused ...
 
Approximate sum of-products designs based on distributed arithmetic
Approximate sum of-products designs based on distributed arithmeticApproximate sum of-products designs based on distributed arithmetic
Approximate sum of-products designs based on distributed arithmetic
 
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
Algorithm and vlsi architecture design of proportionate type lms adaptive fil...
 
A reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applicationsA reconfigurable ldpc decoder optimized applications
A reconfigurable ldpc decoder optimized applications
 
A high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolutionA high accuracy programmable pulse generator with a 10-ps timing resolution
A high accuracy programmable pulse generator with a 10-ps timing resolution
 
A closed form expression for minimum operating voltage of cmos d flip-flop
A closed form expression for minimum operating voltage of cmos d flip-flopA closed form expression for minimum operating voltage of cmos d flip-flop
A closed form expression for minimum operating voltage of cmos d flip-flop
 
An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...An energy efficient programmable many core accelerator for personalized biome...
An energy efficient programmable many core accelerator for personalized biome...
 
Product defect detection based on convolutional autoencoder and one-class cla...
Product defect detection based on convolutional autoencoder and one-class cla...Product defect detection based on convolutional autoencoder and one-class cla...
Product defect detection based on convolutional autoencoder and one-class cla...
 
NaveadKazi_resume
NaveadKazi_resumeNaveadKazi_resume
NaveadKazi_resume
 
Design and fpga implementation of a reconfigurable digital down converter for...
Design and fpga implementation of a reconfigurable digital down converter for...Design and fpga implementation of a reconfigurable digital down converter for...
Design and fpga implementation of a reconfigurable digital down converter for...
 
Saravanan_QA_Automation & Manual Testing_2Years-Exp
Saravanan_QA_Automation & Manual Testing_2Years-ExpSaravanan_QA_Automation & Manual Testing_2Years-Exp
Saravanan_QA_Automation & Manual Testing_2Years-Exp
 
Quality management-mc qs
Quality management-mc qsQuality management-mc qs
Quality management-mc qs
 
CV Wasif Sayed 10June2016
CV Wasif Sayed 10June2016CV Wasif Sayed 10June2016
CV Wasif Sayed 10June2016
 
CV Wasif Sayed 08June2016
CV Wasif Sayed 08June2016CV Wasif Sayed 08June2016
CV Wasif Sayed 08June2016
 
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
Machine Learning Approaches to Predict Customer Churn in Telecommunications I...
 
Combating data leakage trojans in commercial and asic applications with time ...
Combating data leakage trojans in commercial and asic applications with time ...Combating data leakage trojans in commercial and asic applications with time ...
Combating data leakage trojans in commercial and asic applications with time ...
 
Pravin Bio Data
Pravin Bio DataPravin Bio Data
Pravin Bio Data
 
Web Development Using Cloud Computing and Payment Gateway
Web Development Using Cloud Computing and Payment GatewayWeb Development Using Cloud Computing and Payment Gateway
Web Development Using Cloud Computing and Payment Gateway
 
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching schemeA 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
A 12 bit 40-ms s sar adc with a fast-binary-window dac switching scheme
 
CV_Rajiv_Sharma
CV_Rajiv_SharmaCV_Rajiv_Sharma
CV_Rajiv_Sharma
 

More from Nxfee Innovation

VLSI IEEE Transaction 2018 - IEEE Transaction
VLSI IEEE Transaction 2018 - IEEE Transaction VLSI IEEE Transaction 2018 - IEEE Transaction
VLSI IEEE Transaction 2018 - IEEE Transaction Nxfee Innovation
 
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...Nxfee Innovation
 
The implementation of the improved omp for aic reconstruction based on parall...
The implementation of the improved omp for aic reconstruction based on parall...The implementation of the improved omp for aic reconstruction based on parall...
The implementation of the improved omp for aic reconstruction based on parall...Nxfee Innovation
 
Securing the present block cipher against combined side channel analysis and ...
Securing the present block cipher against combined side channel analysis and ...Securing the present block cipher against combined side channel analysis and ...
Securing the present block cipher against combined side channel analysis and ...Nxfee Innovation
 
Low complexity methodology for complex square-root computation
Low complexity methodology for complex square-root computationLow complexity methodology for complex square-root computation
Low complexity methodology for complex square-root computationNxfee Innovation
 
Efficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft coresEfficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft coresNxfee Innovation
 
Analysis and design of cost effective, high-throughput ldpc decoders
Analysis and design of cost effective, high-throughput ldpc decodersAnalysis and design of cost effective, high-throughput ldpc decoders
Analysis and design of cost effective, high-throughput ldpc decodersNxfee Innovation
 
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...Nxfee Innovation
 
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...Nxfee Innovation
 

More from Nxfee Innovation (10)

VLSI IEEE Transaction 2018 - IEEE Transaction
VLSI IEEE Transaction 2018 - IEEE Transaction VLSI IEEE Transaction 2018 - IEEE Transaction
VLSI IEEE Transaction 2018 - IEEE Transaction
 
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
Noise insensitive pll using a gate-voltage-boosted source-follower regulator ...
 
The implementation of the improved omp for aic reconstruction based on parall...
The implementation of the improved omp for aic reconstruction based on parall...The implementation of the improved omp for aic reconstruction based on parall...
The implementation of the improved omp for aic reconstruction based on parall...
 
Securing the present block cipher against combined side channel analysis and ...
Securing the present block cipher against combined side channel analysis and ...Securing the present block cipher against combined side channel analysis and ...
Securing the present block cipher against combined side channel analysis and ...
 
Low complexity methodology for complex square-root computation
Low complexity methodology for complex square-root computationLow complexity methodology for complex square-root computation
Low complexity methodology for complex square-root computation
 
Efficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft coresEfficient fpga mapping of pipeline sdf fft cores
Efficient fpga mapping of pipeline sdf fft cores
 
Analysis and design of cost effective, high-throughput ldpc decoders
Analysis and design of cost effective, high-throughput ldpc decodersAnalysis and design of cost effective, high-throughput ldpc decoders
Analysis and design of cost effective, high-throughput ldpc decoders
 
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
A flexible wildcard pattern matching accelerator via simultaneous discrete fi...
 
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
A 128 tap highly tunable cmos if finite impulse response filter for pulsed ra...
 
Nxfee Innovation Brochure
Nxfee Innovation BrochureNxfee Innovation Brochure
Nxfee Innovation Brochure
 

Recently uploaded

Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docxPoojaSen20
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterMateoGardella
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 

Recently uploaded (20)

Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 

An efficient fault tolerance design for integer parallel matrix vector

  • 1. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ An Efficient Fault-Tolerance Design for Integer Parallel Matrix–Vector Multiplications Abstract: Parallel matrix processing is a typical operation in many systems, and in particular matrix–vector multiplication (MVM) is one of the most common operations in the modern digital signal processing and digital communication systems. This paper proposes a fault tolerant design for integer parallel MVMs. The scheme combines ideas from error correction codes with the self-checking capability of MVM. Field-programmable gate array evaluation shows that the proposed scheme can significantly reduce the overheads compared to the protection of each MVM on its own. Therefore, the proposed technique can be used to reduce the cost of providing fault tolerance in practical implementations. Software Implementation:  Modelsim  Xilinx 14.2 Existing System: Matrix–vector multiplication (MVM) is a typical computation in domain transformation, projection, or feature extraction. With increasing demand for high performance or throughput, parallel computing is becoming more important in high-performance computing systems, cloud computing systems, and communication systems. For example, parallel matrix processing with a common input vector (parallel MVMs) is performed for pre coding in multiuser multi-in multi-out (MIMO) (especially large-scale MIMO) and high-performance low density parity check decoders. Parallel processing may run on a large number of processing units, e.g., GPUs or distributed systems, and experiments show that soft errors can affect the outputs. In this ABFT design, the two input matrices
  • 2. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ are changed to a “row checksum” matrix and a “column checksum” matrix, respectively, and single errors in the resulting “full checksum” matrix can be detected and corrected based on its row checksum and column checksum. Since the row checksum does not exist for an input vector, such ABFT scheme can only be used for error detection in MVM. In the ABFT schemes proposed , checksum technique is used to locate the affected result element or blocks of result elements in MVM, and partial re computation is used for error correction. Such schemes are suitable for the cases in which the delay introduced by re computation is acceptable. But this is not always the case, especially in high-throughput communication receivers where the inputs arrive in real time from A/D converters. A general method to protect linear dynamic systems was proposed. Partof that system model can be extracted as an MVM. I n a method to protect parallel digital filters was proposed. The scheme is based on considering each filter as a bit in an error correction code (ECC) so that parity check filters are added and used to detect and correct errors. The schemes in share a similar approach in that redundant rows are added to the processing matrix by combination (or coding) of the original rows of the matrix so that the relations among the multiple outputs can be used to locate and correct the erroneous elements. However, this approach only works for single MVM. In the protection of parallel fast Fourier transform (FFT) operations on different input vectors was studied. In that case, a self checking property of the FFT was combined with the ECC method to reduce the overhead required for protection. we proposed an idea based on ECC and MVM self-checking (checksum) for error protection of parallel MVMs. We assume real-time applications in which integer vectors come from ADCs with a fixed timing and processing is synchronous and repetitive, so a re computation is not tolerable. This paper is an extension of that two page conference paper, and provides new results based on the FGPA implementation. To the best of the authors’ knowledge, no other
  • 3. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ fault-tolerant design for parallel MVMs has been proposed. In the proposed structure, a “detection matrix” D and a “sum matrix” A are added for fault tolerance. D is generated by performing ECC to the checksum row of each matrix, and A is a combination of all matrices under protection. By comparing the results of the original MVMs and that of the check matrix, the fault can be located. Then, the erroneous output vector can be recovered by subtracting the correct output vectors from the sum output vector. Disadvantages:  Overheads is not reduced  Cost is high  Errors are not identified easily Proposed System: Fault tolerance design for parallel matrix processing A. Basic Structure The basic structure of the proposed fault-tolerant system for parallel MVMs is shown in Fig. 1. From Fig. 1, we see that the original P matrices remain unchanged, and the redundant part includes two matrices (regardless of the number of parallel matrices under protection):
  • 4. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ Fig. 1. Fault-tolerant structure for parallel MVMs detection matrix and sum matrix . By comparing the P outputs with the checking result the faulty vector can be detected, and then the correct output vector , can be obtained by subtracting the correct vectors among from the “sum output vector” . The number of rows of D (Rp) is determined by the number of parallel matrices (P). When considering the case that only one MVM may fail, a Hamming code can be used for protection, and the relationship between P and Rp would be On the other hand, in our proposed technique (Fig. 1), a branch is an entire MVM processing. Generation of Detection Matrix and SUM Matrix
  • 5. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ The sum matrix is just the direct sum of the original matrix For the detection matrix, let us assume P = 4 to facilitate the illustration of the basic idea, and assuming only one branch output may contain errors. First, a checksum row for each matrix can be generated as Then, the rows of detection D can be generated ,Hamming code as Then, the error detection matrix can be defined as matrix as
  • 6. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ if there are no errors, we should have where “sum(x)” performs addition of all items within vector x, and In (12), the result of is an integer vector, and sum( produces an integer scalar. In summary, if we denote
  • 7. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ Fig. 2. Implementation structure of MVM (N = 20) error detection of a single branch can also be done. After identifying the failing branch, the corresponding output vector can be recovered by subtracting the right output vectors from the sum output vector Since the error detection is based on the sum of the output vectors, and the whole vector could be recovered during the correction, the proposed scheme could deal with multiple errors in a single output vector Advantages:  Overheads are reduced
  • 8. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________  Cost is low  Error can be easily identified by error detection method References: [1] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing. Englewood Cliffs, NJ, USA: Prentice-Hall, 1989. [2] J. S. Lim, Two-Dimensional Digital Signal Processing. Englewood Cliffs, NJ, USA: Prentice-Hall, 1989. [3] A. Grama, A. Gupta, G. Karypis, and V. Kumar, Introduction to Parallel Computing, 2nd ed. London, U.K.: Pearson, Jan. 2003. [4] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. Hoboken, NJ, USA: Wiley, 1999. [5] S. Roger, C. Ramiro, A. Gonzalez, V. Almenar, and A. M. Vidal, “Fully parallel GPU implementation of a fixed-complexity soft-output MIMO detector,” IEEE Trans. Veh. Technol., vol. 61, no. 8, pp. 3796–3800, Oct. 2012. [6] Q. Qi and C. Chakrabarti, “Parallel high throughput soft-output sphere decoder,” in Proc. IEEE Workshop Signal Process. Syst., Oct. 2010, pp. 174–179. [7] M. Wu, C. Dick, and J. R. Cavallaro, “Improving MIMO sphere detection through antenna detection order scheduling,” in Proc. SDR Forum, 2011, pp. 280–284. [8] G. Wang, M. Wu, Y. Sun, and J. R. Cavallaro, “A massively parallel implementation of QC-LDPC decoder on GPU,” in Proc. IEEE 9th Symp. Appl. Specific Process. (SASP), Jun. 2011, pp. 82–85. [9] G. Wang, H. Shen, B. Yin, M. Wu, Y. Sun, and J. R. Cavallaro, “Parallel nonbinary LDPC decoding on GPU,” in Proc. ASILOMAR Conf., Nov. 2012, pp. 1277–1281. [10] I. S. Haque and V. S. Pande, “Hard data on soft errors: A largescale assessment of real-world error rates in GPGPU,” in Proc. 10th IEEE/ACM Int. Conf. Cluster, Cloud Grid Comput. (CCGrid), May 2010, pp. 691–696.
  • 9. NXFEE INNOVATION (SEMICONDUCTOR IP &PRODUCT DEVELOPMENT) (ISO : 9001:2015Certified Company), # 45, Vivekanandar Street, Dhevan kandappa Mudaliar nagar, Nainarmandapam, Pondicherry– 605004, India. Buy Project on Online :www.nxfee.com | contact : +91 9789443203 | email : nxfee.innovation@gmail.com _________________________________________________________________ [11] P. Rech, C. Aguiar, C. Frost, and L. Carro, “An efficient and experimentally tuned software-based hardening strategy for matrix multiplication on GPUs,” IEEE Trans. Nucl. Sci., vol. 60, no. 4, pp. 2797– 2804, Aug. 2013. [12] K.-H. Huang and J. A. Abraham, “Algorithm-based fault tolerance for matrix operations,” IEEE Trans. Comput., vol. C-33, no. 6, pp. 518–528, Jun. 1984. [13] C. Ding, C. Karlsson, H. Liu, T. Davies, and Z. Chen, “Matrix multiplication on GPUs with on-line fault tolerance,” in Proc. IEEE 9th Int. Symp. Parallel Distrib. Process. Appl., May 2011, pp. 311–317.