A Novel and Efficient Design of Golay Encoder for
Ultra Deep Submicron Technologies
Chiranjeevi Sheelam, JVR Ravindra
Center for Advanced Computing Research Laboratory (C-ACRL)
Department of Electronics and Communication Engineering
Vardhaman College of Engineering, Shamshabad, Kacharam
Hyderabad, Telangana, India
email: chiranjeevi497@gmail.com, jayanthi@ieee.org
Abstract—This paper reviews two approaches for generating the
binary Golay code (23, 12): a linear feedback shift register
(LFSR) based CRC scheme and a CRC-based hardware architecture.
Both architectures have disadvantages. To overcome them, a new
architecture is proposed for binary Golay code (23, 12)
generation. The paper also presents an efficient hardware
architecture for generating the extended Golay code (24, 12). A
high-speed, low-latency, low-area and low-power architecture has
been designed and verified.
Index Terms—Binary Golay code (23, 12), linear feedback shift
register (LFSR), cyclic redundancy check (CRC), extended
Golay code (24, 12).
I. INTRODUCTION
Error detection and correction play an important role in
information theory. They enable reliable transmission of digital
data over a noisy communication channel [1]. In wireless
communication, when a message is transmitted from source to
destination, the data may be distorted or corrupted by noise.
Error detection identifies that errors have occurred, whereas
error correction restores the original message. Redundant bits,
called check bits, are added to the data bits at the time of
transmission. Forward error correction (FEC) codes are of two
types, namely block codes and convolutional codes. The block code
notation (n, k, d) describes a code with block length n, message
length k and minimum Hamming distance d between any two
codewords. Such a code can detect up to d − 1 errors and correct
up to ⌊(d − 1)/2⌋ errors.
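The relation between minimum distance and error capability can be checked numerically. The following is a minimal Python sketch; the function name `error_capability` is illustrative and not part of the paper. It evaluates d − 1 and ⌊(d − 1)/2⌋ for d = 7 and d = 8, the two distances that appear later in this paper.

```python
def error_capability(d: int) -> tuple[int, int]:
    """Return (max detectable errors, max correctable errors) for minimum distance d."""
    return d - 1, (d - 1) // 2  # integer division gives the floor

# Distances of the (23, 12, 7) and (24, 12, 8) codes discussed in this paper.
for d in (7, 8):
    detect, correct = error_capability(d)
    print(f"d = {d}: detects up to {detect} errors, corrects up to {correct}")
```

Both distances give a correction capability of three errors; the larger distance only adds one more detectable error.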
Many types of block codes are available. One of the most
powerful known block codes is the binary Golay code (23, 12, 7),
denoted G23. This code was discovered by M. J. E. Golay [2] in
1949 to address error correction. The Golay code is a perfect
linear error-correcting code and one of the non-trivial perfect
codes. It is the only known code which can correct up to three
errors or detect up to six errors in a block of 23 bits. Adding a
parity bit to the binary Golay code (G23) gives the rate-1/2,
self-dual extended Golay code (24, 12, 8), denoted G24. This code
has found numerous applications, such as the Voyager missions of
JPL-NASA, ultrasound imaging and coded excitation for a laser.
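That the (23, 12) code is perfect can be confirmed with the sphere-packing bound: the Hamming balls of radius 3 around the 2^12 codewords must exactly fill the space of 2^23 binary words. A small Python sketch of this check follows; the helper name `hamming_ball_size` is illustrative.

```python
from math import comb

def hamming_ball_size(n: int, t: int) -> int:
    """Number of binary words within Hamming distance t of a given n-bit word."""
    return sum(comb(n, i) for i in range(t + 1))

n, k, t = 23, 12, 3
ball = hamming_ball_size(n, t)            # 1 + 23 + 253 + 1771
is_perfect = (2 ** k) * ball == 2 ** n    # balls tile the whole space exactly
print(ball, is_perfect)
```

The ball size is 2048 = 2^11, so 2^12 disjoint balls cover all 2^23 words with nothing left over.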
The rest of the paper is structured as follows. Section II
surveys the literature. Section III presents the proposed
architectures for binary Golay code (G23) and extended Golay
code (G24) generation, together with simulation results.
Conclusions are given in Section IV.
II. LITERATURE SURVEY
In the extended Golay code, 12 data bits are encoded into a
24-bit codeword in such a way that up to 7 errors can be detected
or up to 3 errors can be corrected. Let $G = [I, B]$ be the
$12 \times 24$ matrix, where $I$ is the $12 \times 12$ identity
matrix and $B$ is a $12 \times 12$ matrix over GF(2). The linear
code C with this generator matrix G is called the extended Golay
code (G24). Properties of the extended Golay code G24 with
generator matrix G and B [2] are:
• The $12 \times 12$ matrix B satisfies $B^T = B$, $BB^T = I$ and
$B^2 = I$.
• Another generator matrix for G24 is $[B, I]$.
• G24 is a self-dual, three-error-correcting code.
• A parity check matrix for G24 is the $24 \times 12$ matrix
$H = [B, I]^T$ or $[I, B]^T$.
The binary Golay code can be obtained by puncturing the extended
Golay code (G24). Let B' be the 12x11 matrix obtained by deleting
the last column of the 12x12 matrix B. The linear code with the
12x23 generator matrix G = [I, B'] is called the binary Golay
code (G23). Various encoding methods are available. The
algorithms in [3]-[5] are not sufficient for hardware
implementation. LFSR-based methods [6]-[9] can be implemented in
hardware but are unattractive because of their high latency. In
the past few years many encoding architectures [10]-[19] have
been presented. Most recently, [20] proposed a hardware
implementation based on CRC. This paper briefly reviews two
important encoding methods.
A. LFSR Based Golay Encoder
Fig. 1 shows an encoder architecture for the (23, 12) Golay code
with generator polynomial
$$g(x) = x^{11} + x^{9} + x^{7} + x^{6} + x^{5} + x + 1 \quad (\text{AE3h})$$
The encoder consists of a polynomial division register,
n − k = 11 delay stages and n = 23 codeword stages. The delay
stages delay the information bits so that they enter the
codeword stages in exact synchronization.
2016 Intl. Conference on Advances in Computing, Communications and Informatics (ICACCI), Sept. 21-24, 2016, Jaipur, India
978-1-5090-2028-7/16/$31.00 @2016 IEEE 300
Fig. 1. LFSR based Golay encoder
• The information word is $i(x) = [i_{k-1}, i_{k-2}, \ldots, i_2, i_1, i_0]$.
The MSB $i_{k-1}$, i.e., $i_{11}$, is the first bit to enter the
register, followed by $i_{10}, i_9, \ldots, i_2, i_1, i_0$.
• After k = 12 shifts the information bits have entered the
register and the remainder stages contain the remainder of
$i(x)/g(x)$. A systematic codeword, however, requires the
remainder of $x^{n-k}\, i(x) / g(x)$. This is achieved by shifting
the register an additional n − k = 11 times. Therefore, after
k + (n − k) = 23 shifts, the register stages contain the check
bits.
• A further n − k = 11 shifts are required to shift the check
bits into the codeword stages. Two additional switches are used
to divert the bit stream when required.
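The shift sequence above can be modelled bit-serially in software. The sketch below is a behavioural Python model, not a description of the circuit in Fig. 1: it assumes an 11-stage register into which the 12 message bits (MSB first) and then 11 zeros are shifted, with the low 11 bits of AE3h as feedback taps. The names `lfsr_check_bits`, `encode_g23` and `weight` are illustrative.

```python
G = 0xAE3        # g(x) = x^11 + x^9 + x^7 + x^6 + x^5 + x + 1
N, K = 23, 12

def lfsr_check_bits(msg: int) -> int:
    """Remainder of x^(n-k) * i(x) divided by g(x), computed one bit per shift."""
    reg = 0
    taps = G & 0x7FF  # g(x) without its x^11 term
    # 12 message bits, MSB first, followed by n - k = 11 zeros: 23 shifts in total.
    stream = [(msg >> i) & 1 for i in range(K - 1, -1, -1)] + [0] * (N - K)
    for bit in stream:
        msb_out = (reg >> 10) & 1          # bit leaving the last stage
        reg = ((reg << 1) | bit) & 0x7FF   # shift the next bit in
        if msb_out:
            reg ^= taps                    # feedback: subtract g(x) modulo 2
    return reg

def encode_g23(msg: int) -> int:
    """Systematic codeword: 12 message bits followed by 11 check bits."""
    return (msg << (N - K)) | lfsr_check_bits(msg)

def weight(value: int) -> int:
    return bin(value).count("1")

print(f"{lfsr_check_bits(0xA52):011b}")
```

For the message A52h this prints 00010101000, the check bits 000 1010 1000 quoted in Section II-B. The 23 shifts needed before the check bits are available are the source of the latency discussed in this paper.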
B. Existing Architecture for Binary Golay Code Generation
Satyabrata Sarangi and Swapna Banerjee [20] proposed the
architecture shown in Fig. 2.
Fig. 2. Existing architecture for binary Golay code
Initially the 12-bit message is stored in R8 and the
characteristic polynomial AE3h in R2. The contents of registers
R1 and R2 are XORed bitwise and stored in register R3. A 12:4
priority encoder finds the number of zeros preceding the first 1
in register R3. A cyclic shifter left-shifts the contents of
register R3 by the priority encoder output, and the result is
stored in R4. A 2x1 multiplexer selects either the initial input
or R4, depending on the control signal P, which is the bitwise OR
of the priority encoder output o[3:0].
The iteration control unit contains a 2x1 multiplexer, which
selects either 11 (the number of zeros appended to the input) or
register R7. The other input to the subtractor is the priority
encoder output. After the last iteration, the output of the
subtractor is zero. When the control signal Ld becomes zero,
R6[22:11] is loaded with the contents of R8 and R6[10:0] with
R3[10:0]. The control signal is Ld = (!(R7[4]) & (|R7)), where
|R7 denotes the bitwise OR of the bits of R7. Finally, the
content of register R6 is the encoded Golay codeword. This
architecture has the following disadvantages.
1) When the input MSB is 0: The existing architecture does not
support messages whose MSB is 0. Consider the message 425h. As
there are no leading zeros in the residual result (425h XOR AE3h
= EC6h), the priority encoder output is zero. The control signal
P is therefore zero, so the 2x1 multiplexer selects the input
again instead of the intermediate result. This process repeats
without ever producing an output. Simulation results are shown in
Fig. 3.
Fig. 3. Simulation results when input MSB is ’0’
2) When the priority encoder output is greater than the
required number of shifts: The long division process continues
until all 11 appended zeros are used. Consider the message A52h.
The residual result obtained in the last stage of the long
division process is 0000 0001 0101 (R4). As the priority encoder
output is 7, the residual result is circularly shifted seven
times. Ld becomes zero and the check bits generated are 000 0110
0011. Simulation results are shown in Fig. 4. This result is
wrong: when the priority encoder output is greater than the
required number of shifts, the check bits should be taken from R4
rather than R3. As only three zeros are left on the right side of
the message, only 3 circular shifts should be applied to the
intermediate result, giving 0000 1010 1000. The MSB is ignored
and the resulting check bits are 000 1010 1000.
Fig. 4. Simulation results when priority encoder is greater than required shifts
3) The contents of register R6 are updated after the binary
Golay code has been generated: When Ld becomes zero, the value
stored in register R6 is the encoded binary Golay code. In the
existing architecture, however, the division process continues
because Ld remains zero for the next two to three iterations. The
contents of register R6 are therefore overwritten, which results
in a wrong codeword. Simulation results are shown in Fig. 5.
Fig. 5. Simulation results with existing architecture
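The arithmetic behind the second disadvantage (message A52h) can be reproduced in a few lines of Python. This is a sketch under one stated assumption: the wrong check bits are taken to arise from XORing the over-shifted residual with AE3h once more before the low 11 bits are captured. That reading reproduces the reported value 000 0110 0011, but it is an interpretation of the text, not a model of the circuit in [20]. All variable names are illustrative.

```python
G = 0xAE3       # characteristic polynomial held in R2
MASK12 = 0xFFF  # 12-bit register width

def leading_zeros12(value: int) -> int:
    """Priority encoder: number of zeros before the first 1 in a 12-bit word."""
    return 12 - value.bit_length()

residual = 0x015    # 0000 0001 0101, last-stage residual for message A52h
zeros_left = 3      # appended zeros not yet consumed
pe_out = leading_zeros12(residual)

# Unbounded shift: the residual moves 7 places although only 3 zeros remain.
# (No 1 bit wraps around here, so a plain left shift equals the cyclic shift.)
over_shifted = (residual << pe_out) & MASK12
wrong_check = (over_shifted ^ G) & 0x7FF

# Bounded shift: stop after the remaining 3 zeros and drop the MSB.
right_check = ((residual << min(pe_out, zeros_left)) & MASK12) & 0x7FF

print(pe_out, f"{wrong_check:011b}", f"{right_check:011b}")
```

The bounded shift yields 000 1010 1000, the value given above as the correct check bits.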
III. PROPOSED ARCHITECTURES FOR BINARY GOLAY
CODE(G23) AND EXTENDED GOLAY CODE(G24)
A. Proposed Architecture for Binary Golay Code(G23)
To overcome the disadvantages associated with the existing
architecture, a new architecture is proposed for the binary
Golay encoder, as shown in Fig. 6.
Fig. 6. Proposed architecture for binary Golay code(G23)
The input message and the characteristic polynomial are stored
in R8 and R2 respectively. A 2x1 multiplexer selects either
register R8 or the content of register R4. The control signal
'Rst' is initially High and remains Low throughout the polynomial
division process. In every step of the long division process, an
XOR operation performs the modulo-2 subtraction.
The 12:4 priority encoder block finds the number of 0's
preceding the first 1 in register R3. A condition block checks
whether the priority encoder output is greater than the value
stored in register R5. If the condition is satisfied, the
residual result is circularly left-shifted R5 times; otherwise it
is shifted by the priority encoder output. The result is stored
in register R4. As Rst is Low, register R1 is updated with the
contents of register R4. This process continues up to the last
iteration. The control signal for the other 2:1 multiplexer is
R7[4], which is High only when the priority encoder output is
greater than the required number of shifts, i.e., R5. (When a
larger number is subtracted from a smaller number, the sign bit
of the result is High, which indicates a negative number.)
The iteration control unit consists of a 2:1 multiplexer and a
subtractor. Initially reset is High, so register R5 contains 11.
Throughout the long division process reset is Low and register
R5 is updated with the contents of register R7. The priority
encoder output is the other input to the subtractor, and the
control signal is P. The signal Cs = ((Ld == 0) & (R7[4] == 1) &
(R5[4] == 0)) controls the loading of R8 and R9[10:0] into
register R6, where Ld is (!(R7[4]) & (|R7)). Register R6 is
loaded when Cs is High, which marks the end of the division
process: when Ld = 0, R7[4] = 1 and R5[4] = 0, the contents of R8
are loaded into R6[22:11] and R9[10:0] into R6[10:0]. Register R6
then contains the encoded binary Golay code.
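The control expressions can be evaluated directly. The Python sketch below assumes that R7 is the 5-bit two's-complement result of R5 minus the priority encoder output; the 5-bit width is inferred from the use of R7[4] and R5[4] as sign bits and is not stated explicitly in the text. The function name `control_signals` is illustrative, and the sketch models only the combinational expressions, not register timing.

```python
def control_signals(r5: int, pe_out: int) -> dict:
    """Evaluate R7, Ld and Cs for given R5 and priority encoder output."""
    r7 = (r5 - pe_out) & 0x1F          # assumed 5-bit two's-complement subtractor
    r7_sign = (r7 >> 4) & 1            # R7[4]
    r5_sign = (r5 >> 4) & 1            # R5[4]
    ld = int(r7_sign == 0 and r7 != 0)                   # Ld = !(R7[4]) & (|R7)
    cs = int(ld == 0 and r7_sign == 1 and r5_sign == 0)  # load enable for R6
    return {"R7": r7, "R7[4]": r7_sign, "Ld": ld, "Cs": cs}

print(control_signals(11, 0))  # early iteration: 11 shifts still required
print(control_signals(1, 4))   # priority encoder output exceeds R5
```

With R5 = 11 and a priority encoder output of 0, Ld is 1 and Cs is 0, so the division continues. With R5 = 1 and a priority encoder output of 4, the subtraction goes negative, R7[4] is 1 and Cs becomes 1.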
Consider the message 127h. Registers R8 and R2 contain 127h and
the characteristic polynomial AE3h respectively. Check bit
generation using the long division process is shown in Fig. 7.
Fig. 7. Example of check bits generation with proposed architecture
• Initially reset is High, so registers R1 and R5 contain 127h
and 11 (01011) respectively.
• At the first stage of the long division process, the binary XOR
operation occurs. The result does not contain any leading
zeros. Hence, the priority encoder output is zero.
• As the priority encoder output is less than the contents of
register R5, the condition fails and the residual result is
circularly left-shifted by the priority encoder output (zero).
• In the iteration control block, the subtractor output is
positive (which means the sign bit is Low). Register R4
contains the circularly left-shifted intermediate result. As
reset is Low throughout the process, register R1 is updated
with R4 and the process continues.
• At the last stage of the division process, the priority encoder
output is 4, which is greater than the contents of register
R5 (1), so the subtractor output is negative. As the condition
is satisfied, the residual result is circularly left-shifted R5
times.
• As the sign bit is High, R4 contains the intermediate result
and register R9 is updated with the contents of register R4.
• As Ld becomes zero, R7[4] = 1 and R5[4] = 0, R6[22:11] is
updated with the contents of R8 and R6[10:0] with R9[10:0].
This marks the end of the long division, and register R6
contains the encoded binary Golay codeword (G23).
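The bounded-shift long division can be sketched as a small behavioural model in Python. It is not the register-level design of Fig. 6. In particular, the model skips the leading zeros of the message before the first XOR, which is a software convenience assumed here, so its first stage differs from the walk-through above. The names `check_bits_clamped`, `encode_g23` and `zeros_left` are illustrative; `zeros_left` plays the role of R5.

```python
G = 0xAE3  # characteristic polynomial AE3h

def check_bits_clamped(msg: int) -> int:
    """Check bits by long division, skipping runs of leading zeros in one step."""
    reg = msg & 0xFFF
    zeros_left = 11                   # appended zeros still to be shifted in
    while True:
        if reg & 0x800:               # MSB set: modulo-2 subtraction of g(x)
            reg ^= G
        if zeros_left == 0:
            break
        pe_out = 12 - reg.bit_length()     # priority encoder output
        shift = min(pe_out, zeros_left)    # condition block: never exceed R5
        reg = (reg << shift) & 0xFFF
        zeros_left -= shift
    return reg & 0x7FF                # MSB ignored, 11 check bits remain

def encode_g23(msg: int) -> int:
    return ((msg & 0xFFF) << 11) | check_bits_clamped(msg)

for m in (0x127, 0x425, 0xA52):
    print(f"{m:03X}h -> check bits {check_bits_clamped(m):03X}h")
```

For 127h the model reaches a final stage with a priority encoder output of 4 and one shift remaining, matching the last stage described above, and returns the check bits 1EAh. It also handles 425h, whose MSB is 0, and returns 0A8h (000 1010 1000) for A52h.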
B. Proposed Architecture for Extended Golay Code(G24)
Fig. 8 shows an architecture that appends an additional parity
bit to the binary Golay code (G23) to generate the extended
Golay codeword (G24).
Fig. 8. Proposed architecture for extended Golay code
Register R6 contains the 23-bit binary Golay code generated by
the proposed architecture. The 12 most significant bits of R6 are
stored in register P, whereas register Q contains the 11 least
significant bits of R6 with a zero appended (as only the number
of 1s matters, the 0 is added to make Q a 12-bit register). The
blocks wt1 and wt2 calculate the weights of registers P and Q
respectively. The weight of the binary Golay code, i.e. the sum
of wt1 and wt2, is saved in register R10. R6' is R6 appended with
1, whereas R6'' is R6 appended with 0. The control signal for the
2x1 multiplexer is the least significant bit (LSB) of R10.
Register R11 contains the 24-bit extended Golay code (G24). The
weight measurement structure [20] for a 12-bit binary input is
shown in Fig. 9. It requires four full adders, two two-bit adders
and one three-bit adder.
Fig. 9. Weight Measurement structure for 12 Bit Input
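The parity extension can be sketched in Python as follows. The register names P, Q and R10 follow the text, while the function names `weight12` and `extend_golay` are illustrative. The adder tree of Fig. 9 is replaced by a plain bit count, and the sketch assumes the multiplexer selects the codeword with 1 appended when the LSB of R10 is 1, which makes the overall weight even.

```python
def weight12(value: int) -> int:
    """Number of 1s in a 12-bit word (stands in for the adder tree of Fig. 9)."""
    return bin(value & 0xFFF).count("1")

def extend_golay(codeword23: int) -> int:
    """Append an overall parity bit to a 23-bit Golay codeword."""
    p = (codeword23 >> 11) & 0xFFF      # register P: 12 MSBs of R6
    q = (codeword23 & 0x7FF) << 1       # register Q: 11 LSBs of R6 plus a 0
    r10 = weight12(p) + weight12(q)     # weight of the 23-bit codeword
    parity = r10 & 1                    # LSB of R10 drives the multiplexer
    return (codeword23 << 1) | parity   # R11: 24-bit extended codeword

print(f"{extend_golay(0x5290A8):06X}", f"{extend_golay(0x0939EA):06X}")
```

The two inputs are the 23-bit codewords for messages A52h (weight 8) and 127h (weight 11). The first receives a 0 and the second a 1, so both 24-bit results have even weight.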
C. Results of Binary Golay Code(G23) and Extended Golay
Code(G24) Architectures
The proposed hardware architectures for both the binary Golay
code (G23) and the extended Golay code (G24) have been verified
using the CADENCE NClaunch tool. The synthesis results obtained
from CADENCE Encounter(R) RTL Compiler RC10.1.304 - v10.10-s339_1
using tsmc90.0 technology show that the total dynamic power used
by both encoder architectures is 188.701 µW, the total memory
usage is 76952K and the total cell area is 5861 µm². The timing
slack is 994607 ps, while the rise slew is 55 ps and the fall
slew is 63 ps. Simulation results with the proposed architecture
are shown in Fig. 10. R6 and R11 give the 23-bit binary Golay
codeword and the 24-bit extended Golay codeword respectively.
Fig. 10. Simulation results with proposed architecture
TABLE I
COMPARISON OF THE VARIOUS ENCODERS
S.no | Encoder | Parameter | Advantages | Remarks
1. | Hamming | Security(!) | It can detect up to 2 bit errors or correct 1 bit error | It cannot detect uncorrected errors
2. | Reed-Solomon | Security(!) | It can detect and correct multiple symbol errors | It belongs to non-binary cyclic codes
3. | Reed-Muller | Security(!) | It can detect and correct multiple bit errors and it is easy to implement | Transmission rate is low
4. | Convolutional | Security(!) | Suitable for very large data streams and easy to implement using shift registers | Computational complexity increases with data length and decoding is very complex
5. | Turbo | Security(!) | It uses interleavers which reduce burst errors and performs extraordinarily well at low SNR | High decoding complexity, high latency and poor performance at very low BER
6. | LDPC | Security(!) | This code attains performance near the Shannon limit | Complexity is high
7. | 8b/10b | Security(!) | It maps 8-bit symbols to 10-bit symbols in order to achieve DC balance | It has been widely used in high speed serial communication standards
8. | Proposed | Security(!) | It is the only known code which can correct up to 3 errors in a 23-bit element | Proposed architecture outperforms existing in terms of dynamic power and cell area
Table I presents a comparison of various encoders. A Hamming
encoder generates a Hamming code of length N with message length
K. A Hamming code can detect up to 2 bit errors or correct a
single bit error. It cannot detect uncorrected errors.
Reed-Solomon codes are a subset of BCH codes and belong to the
class of non-binary cyclic codes. They can detect and correct
multiple symbol errors. Many encoding and decoding architectures
are available for RS codes, but their implementation complexity
is high.
Reed-Muller codes can detect and correct multiple bit errors,
but their transmission rate is low compared to other codes.
Reed-Muller codes are easy to implement and can correct more
than t errors. A convolutional encoder sequentially convolves
the sequence of information bits. Shift registers are used in
the encoder, and it is suitable for very large data streams.
Turbo codes are also called parallel concatenated convolutional
codes (PCCC). A turbo encoder has two convolutional encoders
arranged in parallel. The information bits are scrambled by an
interleaver before entering the second encoder. Turbo codes
perform extraordinarily well at low SNR, but their decoding
complexity is high.
LDPC codes, also called Gallager codes, are used in applications
such as deep space networks and satellite communications. An
LDPC code is constructed using a sparse bipartite graph. This
code attains performance near the Shannon limit. An 8b/10b
encoder encodes 8-bit symbols into 10-bit symbols to achieve DC
balance. 8b/10b supports continuous transmission with a balanced
number of ones and zeros, and it can also detect single-bit
errors. This encoder has been widely used in high speed serial
communication standards. The Golay code used in the proposed
encoder is the only known code which can correct up to 3 errors
in a 23-bit element. The binary Golay encoder encodes a 12-bit
message into a 23-bit codeword. A parity bit is added to form the
24-bit codeword called the extended Golay codeword. The code can
be generated using an LFSR based CRC generation scheme, but that
scheme is not suitable because of its high latency. A new
hardware architecture has been proposed and implemented in this
paper.
TABLE II
COMPARISON OF THE PROPOSED ARCHITECTURE WITH EXISTING
Attribute | Existing [20] | Proposed
1. When the input MSB is 0 | Not Possible | Possible
2. When the priority encoder output is greater than required number of shifts | Not Possible | Possible
3. Correct binary Golay code generation | Not Possible | Possible
Table II shows that the proposed architecture avoids the three
disadvantages associated with the existing hardware architecture.
In these three cases the proposed architecture clearly
outperforms the existing hardware architecture.
TABLE III
COMPARISON OF DYNAMIC POWER AND CELL AREA
Attribute | Existing [20] | Proposed
1. Dynamic Power (µW) | 160.82 | 188.70
2. Total number of cells | 214 | 333
3. Total Cell Area (µm²) | 4001.66 | 5861.12
Table III shows that the cell area and dynamic power of the
proposed encoder architecture are higher than those of the
existing architecture (about 17% more dynamic power and 46% more
cell area), but the proposed architecture overcomes all the
disadvantages associated with the existing architecture. The
maximum and minimum fanout of the proposed encoder architecture
are 30 and 0 respectively.
Table IV shows that the proposed encoder architecture has a
maximum latency of 12 clock cycles, like [20]. In addition, the
proposed architecture is clocked by the system clock alone,
unlike [6].
TABLE IV
COMPARISON OF LATENCY AND CLOCKING MECHANISM
Reference | Latency | Clocking Mechanism
[6] | 23 | System Clock + Clock Doubler
[20] | 12 (Maximum) | System Clock
Proposed | 12 (Maximum) | System Clock
IV. CONCLUSION
The proposed architectures for both the binary Golay code (G23)
and the extended Golay code (G24) avoid the disadvantages
associated with the existing architectures. These architectures
have been designed and verified using Cadence Verilog tools. The
proposed encoder architecture uses the system clock and has a
maximum latency of 12 clock cycles. The results obtained from
the CADENCE RTL compiler are better than the recent publications
in this field. These encoder architectures are a possible choice
for many applications such as ultrasonography, ultrasound
imaging and deep space missions.
REFERENCES
[1] Shu Lin and Daniel J. Costello, "Error Control Coding," Second
Edition, ISBN 978-81-317-3440-7, 2011.
[2] M. J. E. Golay, "Notes on digital coding," Proc. IRE, vol. 37, p. 657,
Jun. 1949.
[3] X. H. Peng and P. G. Farrell, "On construction of the (24, 12, 8) Golay
codes," IEEE Trans. on Information Theory, vol. 52, no. 8, pp. 3669-3675,
Aug. 2006.
[4] B. Honary and G. Markarian, "New simple encoder and trellis decoder
for Golay codes," Electron. Lett., vol. 29, no. 25, pp. 2170-2171, Dec. 1993.
[5] B. K. Classon, "Method, system, apparatus, and phone for error control
of Golay encoded data signals," U.S. Patent 6 199 189, Mar. 6, 2001.
[6] M. I. Weng and L. N. Lee, "Weighted erasure codec for the (24, 12)
extended Golay code," U.S. Patent 4 397 022, Aug. 2, 1983.
[7] M. Spachmann, "Automatic generation of parallel CRC circuits,"
IEEE Des. Test Computers, vol. 18, no. 3, pp. 108-114, May 2001.
[8] G. Campobello, G. Patane and M. Russo, "Parallel CRC Realization,"
IEEE Trans. Comput., vol. 52, no. 10, pp. 1312-1319, Oct. 2003.
[9] R. Nair, G. Ryan and F. Farzaneh, "A symbol based algorithm for
hardware implementation of cyclic redundancy check (CRC)," in Proc.
VHDL Int. Users Forum, Oct. 1997, pp. 82-87.
[10] Pedro Reviriego, Liyi Xiao, Shanshan Liu and Juan Antonio Maestro,
"An Efficient Single and Double-Adjacent Error Correcting Parallel
Decoder for the (24,12) Extended Golay Code," IEEE Transactions on
VLSI Systems, vol. 24, no. 4, April 2016.
[11] Shin-Yuan Su and Pai-Chi Li, "Photoacoustic Signal Generation with
Golay Coded Excitation," IEEE International Ultrasonics Symposium
Proceedings, pp. 2151-2154, 2010.
[12] Satyabrata Sarangi, Pradyut Kumar Sanki, Praful P. Pai and Swapna
Banerjee, "Comparative Analysis of Golay Code based Excitation and
Coherent Averaging for Non-Invasive Glucose Monitoring System,"
International Symposium on Computer-Based Medical Systems, IEEE,
pp. 485-486, 2014.
[13] David Romero-Laorden, Carlos Julian Martin-Arguedas, Oscar
Martinez-Graullera and Montserrat Parrilla-Romero, "Application of
Golay codes to improve SNR in coarray based synthetic aperture imaging
systems," SAM Signal Processing Workshop, IEEE, pp. 325-328, 2012.
[14] Aamir Hussain and Mohammad Bilal Malik, "Bi-carrier Golay Code
Based Ranging With Improved Range Resolution," ISVC, IEEE, pp. 1-4,
2010.
[15] Xiaochun Wang, Jianjun Ji, Sheng Zhou and Yanqun Wang,
"Ophthalmological Ultrasound Biometer Using Golay-Coded Pulse
Excitation," International Conference on BioMedical Engineering and
Informatics, IEEE, pp. 76-80, Oct. 2014.
[16] Sheng Zhou, Qingsheng Ye, Jun Yang, Xiaochun Wang, Jianjun Ji and
Yanqun Wang, "Medical High-frequency Ultrasound Imaging using
Golay-coded Pulse Excitation," International Conference on BioMedical
Engineering and Informatics, IEEE, pp. 71-75, Oct. 2012.
[17] Mohd Saiful Dzulkefly Zan, Ahmad Ashrif A. Bakar and Tsuneo
Horiguchi, "Improvement of Signal-to-Noise-Ratio by Combining Walsh
and Golay Codes in Modulating the Pump Light of Phase-Shift
Pulse BOTDA Fiber Sensor," International Conference on Sensing
Technology, IEEE, pp. 269-273, 2015.
[18] Fanxin Zeng, Zhenyu Zhang, Xiaoping Zeng and Guixin Xuan,
"16-QAM Golay Complementary Sequence Sets with Arbitrary Lengths,"
IEEE Communications Letters, vol. 17, no. 6, June 2013.
[19] Yi Hua Chen, Jue Hsuan Hsiao, Pang-Fu Liu and Jheng-Shyuan He,
"Golay (20, 8) C Code Simulation and Implementation in DSP Chip," IEEE,
pp. 4990-4994, Apr. 2011.
[20] Satyabrata Sarangi and Swapna Banerjee, "Efficient Hardware
Implementation of Encoder and Decoder for Golay Code," IEEE
Transactions on VLSI, vol. 23, no. 9, September 2015.