Published on

Published in: Education, Technology, Business
1 Like
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide


  2. 2. OBJECTIVE  The main aim of the this viterbi decoder is to reduce the power consumption without degrading the performance.  For the purpose of the low power consumption, We propose a precomputation architecture incorporated with T-algorithm for VD, which can effectively reduce the power consumption without degrading the decoding speed much.
  3. 3. INTRODUCTION  General solutions for Power reduction in VDs could be achieved by reducing the number of states (for example, reduced-state sequence decoding (RSSD) M-algorithm and T-algorithm ) or by over-scaling the supply voltage.  RSSD is in general not as efficient as the M-algorithm and T -algorithm is more commonly used than M-algorithm in practical applications, because the M-algorithm requires a sorting process in a feedback loop while T -algorithm only searches for the optimal path metric (PM), that is, the minimum value or the maximum value of all PMs.
  4. 4. Cont’d  T -algorithm has been shown to be very efficient in reducing the power consumption. However, searching for the optimal PM in the feedback loop still reduces the decoding speed.  To overcome this drawback, two variations of the T -algorithm have been proposed: the relaxed adaptive VD , which suggests using an estimated optimal PM, instead of finding the real one each cycle and the limited-search parallel state VD based on scarce state transition (SST).
  5. 5. VITERBI DECODER Functional diagram of a viterbi decoder.
  6. 6. Functionality:  BMU: branch metrics (BMs) are calculated in the BM unit (BMU) the received symbols. In a TCM decoder this module is replaced by transition metrics unit (TMU), which is more complex than the BMU.  ACSU:BMs are fed into the ACSU that recursively computes the path metrics (PMs) and outputs decision bits for each possible state transition.  SMU: The decision bits are stored in and retrieved from the survivor-path memory unit (SMU) in order to decode the source bits along the final survivor path.  PMU: The PMs of the current iteration are stored in the PM unit (PMU). T-algorithm requires extra computation in the ACSU loop for calculating the optimal PM
  7. 7. PRECOMPUTATION ARCHITECTURE A topology of pre-computation pipelining.
  8. 8. OPERATION:
  9. 9. REVIEW OF PREVIOUS WORK ON LOW-POWER VITERBI DECODER DESIGN  I review three most relevant works for low-power Viterbi decoder designs. Seki, Kubota, Mizoguchi and Kato suggested a scarce state transition (SST) scheme to reduce the switching activity of a Viterbi decoder. The input is pre-decoded by a simple and hence, a power efficient decoder. The pre-decoded sequence, which is not optimal under a noisy channel, is reprocessed by a Viterbi decoder to improve performance. The authors showed that the pre-decoded sequence reduces the switching activity of the Viterbi decoder thereby reducing power dissipation.
  10. 10. (cont’d)  Kang and Wilson suggested application of existing low-power design methodologies at different levels. At the architectural level, they suggested partition of major blocks and memory modules to reduce the power dissipation. They considered Grey coding for memory addressing, which incurs less switching compared to binary coding.  Garrett and Stan suggested a low-power architecture of the soft- output Viterbi decoder for turbo codes. They proposed an orthogonal access memory structure, which enables parallel access of sequentially received data. Use of such a memory structure reduces the switching activity for read and write of survivor path information.  All the above works aim to reduce the switching activities of Viterbi decoders, which is an effective scheme for power reduction.
  11. 11. VITERBI DECODER DESIGN VD with 2-step pre-computation T-algorithm.
  12. 12. Cont’d  The minimum value of each BM group (BMG) can be calculated in BMU or TMU and then passed to the “Threshold Generator” unit (TGU) to calculate (PMopt + T). (PMopt + T) and the new PMs are then compared in the “Purge Unit” (PU).  The “MIN 16” unit for finding the minimum value in each cluster is constructed with 2 stages of 4-input comparators. This architecture has been optimized to meet the iteration bound.  Compared with the conventional T-algorithm, the computational overhead of this architecture is 12 addition operations and a comparison.
  13. 13. Architecture of TGU:
  14. 14. IMPLEMENTATION  The full-trellis VD, the VD with the 2-step pre-computation architecture and one with the conventional T-algorithm are modeled with Verilog HDL code.  This is because the former decoder has a much longer critical path and the synthesis tool took extra measures to improve the clock speed (e.g., using many standard cells with larger driving strength, duplicating logic and registers to reduce fan-out and load capacitance, etc.).  It is clear that the conventional T-algorithm is not suitable for high-speed applications. If the target throughput is moderately high, the proposed architecture can operate at a lower supply voltage, which will lead to quadratic power reduction compared to the conventional scheme (due to much shorter critical path). Thus i next focus on the power comparison between the full trellis VD and the proposed scheme.
  15. 15. Advantages:  The usage of this Viterbi algorithm is found to be advantageous due to its cost effectiveness in modulated minimize at the same time the functional performance in some situation would modulate in maintaining the original cost. Emerging linear functioning of linear pulse distance is due to convenient source sequence.
  16. 16. CONCLUSION  The pre-computation architecture that incorporates T-algorithm efficiently reduces the power consumption of VDs without reducing the decoding speed appreciably. I have also analyzed the pre-computation algorithm.  Algorithm is suitable for TCM systems which always employ high-rate convolutional codes. Finally, I presented a design case. Both the ACSU and SMU are modified to correctly decode the signal. ASIC synthesis and power estimation results
  17. 17. REFERENCES 1. J. He, Z. Wang and H. Liu, “An efficient 4-D 8PSK TCM decoder architecture”, IEEE Trans. VLSI Syst., vol. 18, no. 5, pp. 808-817, May 2010. 2. J. He, H. Liu, Z. Wang, "A fast ACSU architecture for Viterbi decoder using Talgorithm," in Proc. 43rd IEEE Asilomar Conf. on Signals, Systems and Computers, pp. 231-235, Nov. 2009. 3. R. A. Abdallah, and N. R. Shanbhag, “Error-resilient low-power Viterbi decoder architectures,” IEEE Trans. Sig. Proc., vol. 57, No. 12, pp. 4906-4917, Dec. 2009. 4. J. Jin, and C.-Y. Tsui, “Low-power limited-search parallel state Viterbi decoder implementation based on scarece state transition,” IEEE Trans. VLSI Syst., vol. 15, no. 10, pp.1172-1176, Oct. 2007. 5. F. Sun and T. Zhang, “Low power state-parallel relaxed adaptive Viterbi decoder design and implementation,” in Proc. IEEE ISCAS, pp. 4811-4814, May, 2006.
  18. 18. Cont’d 6. “Bandwidth-Efficient Modulations”, CCSDS 401(3.3.6) Green Book, April 2003. 7. Francois Chan and David Haccoun, “Adaptive Viterbi decoding of convolutional codes over memoryless channels,” IEEE Trans. Commun., vol. 45, no. 11, pp. 1389-1400, Nov. 1997. 8. J. B. Anderson and E. Offer, “Reduced-state sequence detection with convolutional codes,” IEEE Trans. Inf. Theory, vol. 40, no. 3, pp. 965-972, May 1994. 9. S. J. Simmons, “Breadth-first trellis decoding with adaptive Effort,” IEEE Trans.Commun., vol. 38, no. 1, pp. 3-12, Jan. 1990. 10. Jinjin He, Huaping Liu, Zhonhfeng Wang, Xinming Huang, and Kai Zhang, “HIGHSPEED LOW POWER VITERBI DECODERNDESIGN FOR TCM DECODERS,”IEEE Transaction on VLSI,vol.20,NO 4,APRIL 2012.