HIGH-SPEED LOW-POWER VITERBI DECODER DESIGN FOR
TCM DECODERS
ABSTRACT
Viterbi algorithm (VA) decoders are very widely used. They are currently
used in about one billion cell phones, which is probably the largest
number in any single application. However, the largest current consumer of VA
processor cycles is probably digital video broadcasting. A recent estimate at
Qualcomm is that approximately 10^15 bits per second are now being decoded
by the VA in digital TV sets around the world, every second of every day.
INTRODUCTION
• In general, the power consumption of Viterbi decoders (VDs) can be reduced by
reducing the number of states (for example, reduced-state sequence
decoding (RSSD), the M-algorithm, and the T-algorithm) or by over-scaling the
supply voltage.
• RSSD is in general not as efficient as the M-algorithm, and the T-algorithm is
more commonly used than the M-algorithm in practical applications, because
the M-algorithm requires a sorting process in a feedback loop, while the
T-algorithm only searches for the optimal path metric (PM), that is, the
minimum or the maximum value of all PMs.
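The difference between the two pruning rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the example PM values, and the threshold are hypothetical, and the minimum PM is assumed to be the optimal one (a distance-type metric).

```python
def t_algorithm_survivors(path_metrics, threshold):
    """T-algorithm: keep states whose PM lies within `threshold` of the optimum."""
    pm_opt = min(path_metrics)  # one search for the optimal PM, no sorting
    return [s for s, pm in enumerate(path_metrics) if pm - pm_opt <= threshold]


def m_algorithm_survivors(path_metrics, m):
    """M-algorithm: keep the m states with the smallest PMs (needs an ordering)."""
    ranked = sorted(range(len(path_metrics)), key=lambda s: path_metrics[s])
    return sorted(ranked[:m])


# Hypothetical PMs for an 8-state trellis.
pms = [12, 3, 7, 15, 4, 9, 3, 20]
print(t_algorithm_survivors(pms, threshold=4))  # [1, 2, 4, 6]
print(m_algorithm_survivors(pms, m=3))          # [1, 4, 6]
```

In hardware, the sort behind the M-algorithm is the costly part of the feedback loop, whereas the T-algorithm needs only the single minimum search.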
INTRODUCTION
• The T-algorithm has been shown to be very efficient in reducing power
consumption. However, searching for the optimal PM in the feedback loop
still reduces the decoding speed.
• To overcome this drawback, two variations of the T-algorithm have been
proposed: the relaxed adaptive VD, which uses an estimated
optimal PM instead of finding the real one each cycle, and the
limited-search parallel state VD based on scarce state transition (SST).
OBJECTIVE
• The main aim of this Viterbi decoder design is to reduce power consumption
without degrading performance.
• To achieve low power consumption, I propose a pre-computation
architecture incorporating the T-algorithm for VDs, which can
effectively reduce power consumption without degrading the decoding
speed much.
VITERBI DECODER
Functional diagram of a Viterbi decoder.
• BMU: Branch metrics (BMs) are calculated in the BM unit (BMU) from the
received symbols. In a TCM decoder, this module is replaced by a transition
metrics unit (TMU), which is more complex than the BMU.
• ACSU: The BMs are fed into the add-compare-select unit (ACSU), which
recursively computes the path metrics (PMs) and outputs decision bits for
each possible state transition.
• SMU: The decision bits are stored in and retrieved from the survivor-path
memory unit (SMU) in order to decode the source bits along the final
survivor path.
• PMU: The PMs of the current iteration are stored in the PM unit (PMU).
The T-algorithm requires extra computation in the ACSU loop to calculate
the optimal PM.
Functionality:
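The roles of the four units can be sketched as a small software model. This is a minimal, full-trellis, hard-decision decoder for a hypothetical rate-1/2, 4-state convolutional code with generators (7, 5) in octal. The code, the names, and the Hamming-distance metric are assumptions chosen for brevity, not the TCM decoder of the design, and no T-algorithm pruning is applied.

```python
G = (0b111, 0b101)  # assumed generator polynomials (7, 5) octal
N_STATES = 4        # state = the two most recent input bits


def parity(x):
    return bin(x).count("1") & 1


def step(state, bit):
    """Return (next_state, output_symbol) for one input bit."""
    reg = (bit << 2) | state
    return reg >> 1, tuple(parity(reg & g) for g in G)


def encode(bits):
    state, symbols = 0, []
    for b in bits:
        state, out = step(state, b)
        symbols.append(out)
    return symbols


def branch_metric(received, expected):
    """BMU: Hamming distance between received and expected symbol."""
    return sum(r != e for r, e in zip(received, expected))


def viterbi_decode(symbols):
    inf = float("inf")
    pms = [0] + [inf] * (N_STATES - 1)  # PMU: encoder starts in state 0
    decisions = []                       # SMU: one decision table per step
    for rx in symbols:
        new_pms = [inf] * N_STATES
        dec = [None] * N_STATES
        for s in range(N_STATES):
            if pms[s] == inf:
                continue
            for bit in (0, 1):
                nxt, out = step(s, bit)
                cand = pms[s] + branch_metric(rx, out)  # ACSU: add
                if cand < new_pms[nxt]:                 # ACSU: compare, select
                    new_pms[nxt] = cand
                    dec[nxt] = (s, bit)
        pms = new_pms
        decisions.append(dec)
    # SMU traceback along the survivor path that ends in the best state.
    state = min(range(N_STATES), key=lambda s: pms[s])
    bits = []
    for dec in reversed(decisions):
        state, bit = dec[state]
        bits.append(bit)
    return bits[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
received = encode(message)
received[3] = (1 - received[3][0], received[3][1])  # inject one bit error
decoded = viterbi_decode(received)
```

A T-algorithm decoder would additionally find the optimal PM in every iteration and skip states whose PM exceeds it by more than the threshold, which is the extra computation in the ACSU loop noted above.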
IMPLEMENTATION
• The full-trellis VD, the VD with the 2-step pre-computation architecture, and one
with the conventional T-algorithm are modeled in Verilog HDL.
• This is because the former decoder has a much longer critical path, and the
synthesis tool took extra measures to improve the clock speed (e.g., using many
standard cells with larger driving strength, or duplicating logic and registers to
reduce fan-out and load capacitance).
• It is clear that the conventional T-algorithm is not suitable for high-speed
applications. If the target throughput is moderately high, the proposed architecture
can operate at a lower supply voltage thanks to its much shorter critical path,
which leads to a quadratic power reduction compared with the conventional
scheme. Thus, I next focus on the power comparison between the full-trellis VD and
the proposed scheme.
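The quadratic dependence on supply voltage follows from the standard first-order model of CMOS dynamic power. The model and the symbols below (activity factor, switched capacitance, supply voltage, clock frequency) are textbook assumptions rather than quantities given in these slides, and the voltages in the example are hypothetical.

```latex
% First-order dynamic power model (assumed, not taken from the slides)
P_{\mathrm{dyn}} \approx \alpha \, C_L \, V_{DD}^{2} \, f_{\mathrm{clk}}

% At a fixed target throughput (same f_clk, alpha, and C_L),
% lowering the supply from V_1 to V_2 scales the power by
\frac{P_2}{P_1} = \left( \frac{V_2}{V_1} \right)^{2}

% Hypothetical example: V_1 = 1.2 V, V_2 = 1.0 V
\frac{P_2}{P_1} = \left( \frac{1.0}{1.2} \right)^{2} \approx 0.69
```

In this hypothetical case the dynamic power drops by about 31% at the same throughput. A shorter critical path is what leaves the timing slack needed to lower the supply voltage.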
LITERATURE REVIEW
• I review the three most relevant works on low-power Viterbi decoder design.
Seki, Kubota, Mizoguchi and Kato suggested a scarce state transition (SST)
scheme to reduce the switching activity of a Viterbi decoder. The input is
pre-decoded by a simple, and hence power-efficient, decoder. The pre-decoded
sequence, which is not optimal under a noisy channel, is reprocessed by a
Viterbi decoder to improve performance. The authors showed that the
pre-decoded sequence reduces the switching activity of the Viterbi decoder,
thereby reducing power dissipation.
(cont’d)
• Kang and Wilson suggested applying existing low-power design
methodologies at different levels. At the architectural level, they suggested
partitioning major blocks and memory modules to reduce power dissipation.
They considered Gray coding for memory addressing, which incurs less switching
than binary coding.
• Garrett and Stan suggested a low-power architecture for the soft-output Viterbi
decoder for turbo codes. They proposed an orthogonal access memory structure,
which enables parallel access to sequentially received data. Such a memory
structure reduces the switching activity for reading and writing survivor-path
information.
• All of the above works aim to reduce the switching activity of Viterbi decoders,
which is an effective approach to power reduction.
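The switching advantage of Gray-coded memory addressing can be checked with a short count of bit toggles over a sequential address sweep. The 4-bit address width and the function names are assumptions made for illustration.

```python
def gray(n):
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def toggles(sequence):
    """Total number of bit flips between consecutive values."""
    return sum(bin(a ^ b).count("1") for a, b in zip(sequence, sequence[1:]))


addresses = list(range(16))  # hypothetical 4-bit sequential address sweep
binary_toggles = toggles(addresses)
gray_toggles = toggles([gray(a) for a in addresses])
print(binary_toggles, gray_toggles)  # 26 15
```

Consecutive Gray codes differ in exactly one bit, so the sweep costs one toggle per access, while binary counting flips several address bits at every carry.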
Advantages:
• Using the Viterbi algorithm is found to be advantageous because of its cost
effectiveness: the cost is minimized while, in some situations, the functional
performance is maintained at the original cost.
The emerging linear functioning of the linear pulse distance is due to a convenient
source sequence.
CONCLUSION
• The pre-computation architecture that incorporates the T-algorithm efficiently
reduces the power consumption of VDs without reducing the decoding
speed appreciably. I have also analyzed the pre-computation algorithm.
• The algorithm is suitable for TCM systems, which always employ high-rate
convolutional codes. Finally, I presented a design case. Both the ACSU and
SMU are modified to correctly decode the signal. ASIC synthesis and
power estimation results
REFERENCES
1. J. He, Z. Wang, and H. Liu, “An efficient 4-D 8PSK TCM decoder architecture,”
IEEE Trans. VLSI Syst., vol. 18, no. 5, pp. 808-817, May 2010.
2. J. He, H. Liu, and Z. Wang, “A fast ACSU architecture for Viterbi decoder using
T-algorithm,” in Proc. 43rd IEEE Asilomar Conf. on Signals, Systems
and Computers, pp. 231-235, Nov. 2009.
3. R. A. Abdallah and N. R. Shanbhag, “Error-resilient low-power Viterbi decoder
architectures,” IEEE Trans. Sig. Proc., vol. 57, no. 12, pp. 4906-4917, Dec. 2009.
4. J. Jin and C.-Y. Tsui, “Low-power limited-search parallel state Viterbi decoder
implementation based on scarce state transition,” IEEE Trans. VLSI Syst., vol.
15, no. 10, pp. 1172-1176, Oct. 2007.
5. F. Sun and T. Zhang, “Low power state-parallel relaxed adaptive Viterbi decoder
design and implementation,” in Proc. IEEE ISCAS, pp. 4811-4814, May 2006.
