Digital Voice Coding:  Vocoding & Techniques (DVC)
© Copyright 2001 Global Wireless Education Consortium. All rights reserved. This module, comprising presentation slides with notes, exercises, projects and Instructor Guide, may not be duplicated in any way without the express written permission of the Global Wireless Education Consortium. The information contained herein is for the personal use of the reader and may not be incorporated in any commercial training materials or for-profit education programs, books, databases, or any kind of software without the written permission of the Global Wireless Education Consortium. Making copies of this module, or any portion, for any purpose other than your own, is a violation of United States copyright laws. Trademarked names appear throughout this module. All trademarked names have been used with the permission of their owners.
Partial support for this curriculum material was provided by the National Science Foundation's Course, Curriculum, and Laboratory Improvement Program under grant DUE-9972380 and Advanced Technological Education Program under grant DUE-9950039. GWEC EDUCATION PARTNERS: This material is subject to the legal License Agreement signed by your institution. Please refer to this License Agreement for restrictions of use.
Table of Contents: Overview; Learning Objectives; Pulse Code Modulation; Linear Predictive Coders: Code Excited Linear Predictive (CELP), Vector Sum Excited Linear Predictive (VSELP), Algebraic Code Excited Linear Predictive (ACELP); Summary; Contributors
Overview In this module, you will learn about Digital Voice Coding (DVC) techniques that make possible more efficient use of the allotted frequency spectrum for wireless telecommunication.
Learning Objectives After completing this module and its activities, you will be able to: List the steps in Analog-to-Digital and Digital-to-Analog conversion (A/D, D/A). Describe the methodology, application, and codebooks associated with Code Excited Linear Prediction (CELP). Describe the methodology, application, and codebooks associated with Vector Sum Excited Linear Prediction (VSELP). Describe the methodology, application, and codebooks associated with Algebraic Code Excited Linear Prediction (ACELP).
Introduction
Introduction [Figure: Ma & Pa Cellular Company network, showing cell sites 1 to 6, the Mobile Switching Center (MSC), and the PSTN, with a call to 972 555 1212]
Pulse Code Modulation  (PCM)
Analog Signals [Figures 1 and 2: analog waveforms, amplitude from -V to +V plotted against time]
Digitization Nyquist’s theorem is basic to the concept of digital sampling: we must sample at a rate of at least twice the highest frequency present in the signal being digitized (here the baseband voice signal, not a carrier). Cartoon caption: "Yeah, Jim, Harry Nyquist here. Yeah, I think I’m onto something. Yeah, well, I think you have to sample at twice the highest frequency. Yeah, I’m pretty sure of it. Yeah, I know your way is cheaper, but it’s distorted. Yeah, distorted. D-I-S-T-O… (sigh)" Reply: "You won’t eat (pop) lunch in this town again, Nyquist (crackle)"
Sampling Voice Signals Sampling rate = 8,000 samples/sec, which gives one sample every 125 µs. [Figure: voice waveform between -V and +V, sampled every 125 µs]
A-D Conversion Steps in the voice digitization process: (1) Sampling follows the Nyquist theorem and gives us PAM (pulse amplitude modulation). (2) Quantization requires sufficient levels for acceptable noise and gives us PCM. (3) Companding (A-law or µ-law) improves efficiency for voice transmission. At this point, depending on the transmission medium, the bitstreams can be time-division (or otherwise) multiplexed, fed through a vocoder to further reduce the bandwidth, and combined with error-recovery codes for robustness. [Figure: PAM samples taken every 125 µs and the corresponding PCM code words 0001, 0010, -0001, -0010]
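The sampling and quantization steps can be sketched in a few lines. This is a minimal illustration, not part of the original module: the function names, the 1 kHz test tone, and the choice of a uniform quantizer with 127 levels per polarity are all assumptions made for the example. Only the 8,000 samples/sec rate and the 125 µs interval come from the slides.

```python
import math

SAMPLE_RATE_HZ = 8000                      # from the slides: 8,000 samples/sec
SAMPLE_INTERVAL_S = 1.0 / SAMPLE_RATE_HZ   # 125 microseconds between samples

def sample_tone(freq_hz, amplitude, count):
    """PAM step: read the amplitude of a test tone at each sampling instant."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * k * SAMPLE_INTERVAL_S)
            for k in range(count)]

def quantize(sample, full_scale=1.0, bits=8):
    """PCM step: map a sample to the nearest signed integer level (uniform steps).

    With 8 bits this sketch uses levels -127..+127 (assumed layout).
    """
    max_level = 2 ** (bits - 1) - 1
    level = round(sample / full_scale * max_level)
    return max(-max_level, min(max_level, level))   # clip overloads

pam = sample_tone(freq_hz=1000, amplitude=1.0, count=8)   # hypothetical 1 kHz tone
pcm = [quantize(s) for s in pam]
bit_rate = SAMPLE_RATE_HZ * 8    # 8 bits per sample gives 64,000 bit/s
```

A 1 kHz tone is well below half the sampling rate, so it satisfies the Nyquist condition; the resulting 64 kb/s stream is what a vocoder then compresses further.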
Encoding PAM to PCM Steps in encoding PAM to PCM: (1) Quantify the amplitude level of the specific pulse. (2) Create an eight-bit word that represents that value; in this example the first bit is the sign (1 = positive, 0 = negative) and the remaining seven bits are the magnitude. Where: 42V = 10101010; 5V = 10000101; 36V = 10100100; -36V = 00100100; -5V = 00000101; -42V = 00101010. [Figure: PAM pulses of 5V, 42V, 36V, -5V, -36V, and -42V taken every 125 µs and their PCM code words]
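The example code words are consistent with a sign-magnitude layout: one sign bit (1 for positive) followed by a seven-bit magnitude. Under that reading, which is an inference from the listed values rather than something the slide states, the encoding can be sketched as follows. The function names are hypothetical.

```python
def encode_sign_magnitude(level):
    """Encode an integer level in -127..127 as sign bit + 7-bit magnitude."""
    if not -127 <= level <= 127:
        raise ValueError("level does not fit in a 7-bit magnitude")
    sign_bit = 1 if level >= 0 else 0      # 1 marks a positive pulse
    return f"{sign_bit}{abs(level):07b}"

def decode_sign_magnitude(word):
    """Inverse of encode_sign_magnitude for an 8-character bit string."""
    magnitude = int(word[1:], 2)
    return magnitude if word[0] == "1" else -magnitude

for level in (42, 5, 36, -36, -5, -42):
    print(level, encode_sign_magnitude(level))
```

With this layout, a positive and a negative pulse of the same size differ only in the first bit, so -36 encodes as 00100100.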
Signal-to-Quantizing-Noise Ratio Where: n = number of bits in the coded word; A = amplitude of the signal at that point in time; A_max = maximum amplitude of the signal
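The formula that these symbols belong to did not survive in the slide text. A standard textbook result that uses the same three quantities is the signal-to-quantizing-noise ratio of a uniform n-bit quantizer driven by a sine wave of amplitude A. It is offered here as a likely match under that assumption, not as a confirmed reproduction of the slide. The step size q and the noise power q²/12 are the usual modelling assumptions for uniform quantization.

```latex
q = \frac{2A_{\max}}{2^{n}}, \qquad
\text{noise power} = \frac{q^{2}}{12}, \qquad
\text{signal power} = \frac{A^{2}}{2}

\mathrm{SQR}
  = \frac{A^{2}/2}{q^{2}/12}
  = \frac{3}{2}\, 2^{2n} \left(\frac{A}{A_{\max}}\right)^{2}

\mathrm{SQR}_{\mathrm{dB}}
  = 1.76 + 6.02\,n + 20\log_{10}\!\left(\frac{A}{A_{\max}}\right)
```

Each extra bit adds about 6 dB, and a signal that uses only part of the quantizer range (A below A_max) loses quality in proportion, which is the motivation for companding.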
Companding Companding compresses larger signals in the coding process so that the quantization interval increases as the signal increases, while small signals keep fine resolution. On the transmit side, compression is followed by a linear PCM encoder (A/D) that produces compressed digital code words; on the receive side, a linear PCM decoder (D/A) is followed by expansion. [Figure: section of the µ-law compression curve, compressed signal versus linear signal, with segment slopes of 1, 1/2, 1/4, and 1/8]
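The compression and expansion stages can be illustrated with the continuous µ-law characteristic. This is a sketch under stated assumptions: it uses the smooth logarithmic formula with µ = 255 rather than the piecewise-linear segments (slopes 1, 1/2, 1/4, 1/8) that the slide's curve shows, and the function names are hypothetical.

```python
import math

MU = 255.0   # assumed value; commonly used for mu-law telephony

def mu_compress(x):
    """Map a linear sample in [-1, 1] to a compressed value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress: recover the linear sample."""
    return math.copysign(((1.0 + MU) ** abs(y) - 1.0) / MU, y)

# Equal-sized input steps near zero and near full scale:
small_step = mu_compress(0.02) - mu_compress(0.01)
large_step = mu_compress(1.00) - mu_compress(0.99)
print(small_step, large_step)   # the small-signal step is much larger after compression
```

Because a fixed input step occupies far more of the compressed range at low amplitudes, a uniform quantizer applied after compression gives quiet speech finer effective resolution than loud speech.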
Linear Predictive Coders (CELP) (VSELP) (ACELP)
Codebooks and Vocoders [Figure: code-excited decoder block diagram with Codebook 1, Codebook 2, the long-term filter state, gain multipliers, the summed excitation ex(n), the A(z) synthesis filter, and a spectral postfilter producing the output speech]
Vocoders The basic goal of a vocoder is to encode only the perceptually important aspects of speech, using fewer bits than PCM. Types of vocoders include channel vocoders, formant vocoders, and linear predictive coders. In wireless communications, we are primarily interested in the last class.
VSELP Vocoder IS-136 systems use a type of code-excited, linear predictive vocoder called a VSELP vocoder. [Figure: VSELP decoder block diagram with Codebook 1, Codebook 2, the long-term filter state, gain multipliers, the summed excitation ex(n), the A(z) synthesis filter, and a spectral postfilter producing the output speech]
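The "vector sum" in the name refers to how each codebook entry is formed: instead of storing every excitation vector, the coder stores a small set of basis vectors and builds each entry as a sum of them with signs of +1 or -1 chosen by the bits of the codeword. The sketch below illustrates that idea and the summed excitation ex(n) from the block diagram. The sizes (7 basis vectors of 40 samples), the random basis values, and the gain names beta, gamma1, and gamma2 are illustrative assumptions, not values taken from this module.

```python
import random

def vector_sum_codevector(index, basis):
    """Build one excitation vector as a +/-1 weighted sum of the basis vectors."""
    length = len(basis[0])
    out = [0.0] * length
    for m, vec in enumerate(basis):
        sign = 1.0 if (index >> m) & 1 else -1.0   # bit m of the codeword picks the sign
        for n in range(length):
            out[n] += sign * vec[n]
    return out

def combined_excitation(long_term, code1, code2, beta, gamma1, gamma2):
    """ex(n): gain-scaled long-term (pitch) vector plus two codebook vectors."""
    return [beta * p + gamma1 * a + gamma2 * b
            for p, a, b in zip(long_term, code1, code2)]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
M, N = 7, 40                    # assumed: 7 basis vectors, 40 samples each
basis1 = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(M)]
basis2 = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(M)]
long_term = [rng.uniform(-1, 1) for _ in range(N)]   # stand-in for the filter state output

c1 = vector_sum_codevector(0b0000101, basis1)
c2 = vector_sum_codevector(0b1100000, basis2)
ex = combined_excitation(long_term, c1, c2, beta=0.8, gamma1=0.5, gamma2=0.3)
```

With M basis vectors the codebook has 2^M entries but needs storage for only M of them, and complementing every bit of a codeword simply negates the vector, which halves the search effort.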
VSELP Vocoder The basic data rate of the speech codec is 7950 bps: there are 159 bits per speech frame (20 msec). [Table: allocation of the 159 bits among the codec parameters, not reproduced in this text]
ACELP Vocoder [Figure: ACELP encoder block diagram. Input speech passes through LPC analysis, quantization, and interpolation; perceptual weighting; and open-loop pitch analysis. An adaptive codebook (gain g_p, driven by past excitation) and an algebraic codebook (gain g_c) feed the synthesis filter, followed by perceptual weighting, an MSE search, and gain VQ. A multiplexer forms the digital output from LPC info (19+1), pitch lag (8+5+1), index (16+16), and gains (7+7).]
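IS-136 Channel Coding In IS-136, Class 1 bits are convolutionally coded, interleaved with Class 2 bits, and transmitted over two time slots. The speech coder produces 77 Class 1 bits and 82 Class 2 bits per frame. A 7-bit CRC is computed over the 12 most perceptually significant Class 1 bits. Rate 1/2 convolutional coding then yields 178 coded Class 1 bits, which together with the 82 Class 2 bits make 260 bits. These 260 bits pass through the voice cipher and the 2-slot interleaver, so that one slot carries bits from speech frames x and y and the next carries bits from frames y and z.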
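The bit counts on this slide and on the Convolutional Encoding slide fit together arithmetically, and working through them shows where the 178 and 260 come from. The sketch below only restates numbers given in the module (77, 82, 7, 89, 20 msec); the number of tail bits is derived from them rather than quoted, and the variable names are hypothetical.

```python
FRAME_MS = 20                 # speech frame duration from the module
CLASS1_BITS = 77
CLASS2_BITS = 82
CRC_BITS = 7
ENCODER_INPUT_BITS = 89       # "Class 1 bits plus CRC and tail bits"

# Tail bits are whatever remains of the 89 encoder input bits (derived).
TAIL_BITS = ENCODER_INPUT_BITS - CLASS1_BITS - CRC_BITS

# Rate 1/2 coding: two output bits for every input bit.
CODED_CLASS1_BITS = 2 * ENCODER_INPUT_BITS

SOURCE_BITS = CLASS1_BITS + CLASS2_BITS             # speech codec output per frame
CHANNEL_BITS = CODED_CLASS1_BITS + CLASS2_BITS      # bits sent per frame

def bits_per_second(bits_per_frame, frame_ms=FRAME_MS):
    """Convert a per-frame bit count to a bit rate."""
    return bits_per_frame * 1000 // frame_ms

print(TAIL_BITS, CODED_CLASS1_BITS, CHANNEL_BITS)
print(bits_per_second(SOURCE_BITS), bits_per_second(CHANNEL_BITS))
```

The 159 source bits per 20 msec frame give the 7950 bps codec rate, and the 260 channel bits per frame correspond to 13,000 bps after error protection.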
VSELP Bit Class Assignment The bits of the various parameters produced by the VSELP vocoder are classified as either Class 1 or Class 2. [Table: class assignment for each parameter, not reproduced in this text] Class 1 bits have convolutional coding applied. Twelve of the Class 1 bits are designated the most perceptually significant and have a 7-bit CRC computed for them. Class 2 bits are transmitted without any error protection.
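A 7-bit CRC over 12 bits can be computed by polynomial division over GF(2). The sketch below shows the mechanism only. The generator polynomial used here (x^7 + x^5 + x^4 + x^2 + x + 1) is an assumption chosen for illustration; the polynomial actually required should be taken from the IS-136 specification. The function names are hypothetical.

```python
# Assumed degree-7 generator, coefficients from x^7 down to x^0.
GENERATOR = [1, 0, 1, 1, 0, 1, 1, 1]
CRC_LEN = len(GENERATOR) - 1   # 7 check bits

def poly_mod(bits):
    """Remainder of the bit sequence (as a GF(2) polynomial) divided by GENERATOR."""
    work = list(bits)
    for i in range(len(work) - CRC_LEN):
        if work[i]:                              # cancel the leading 1 by XORing the generator
            for j, g in enumerate(GENERATOR):
                work[i + j] ^= g
    return work[-CRC_LEN:]

def crc7(message_bits):
    """Check bits for the message: remainder of message * x^7."""
    return poly_mod(list(message_bits) + [0] * CRC_LEN)

def crc_ok(codeword_bits):
    """A received codeword passes when its remainder is all zeros."""
    return not any(poly_mod(codeword_bits))

significant = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]   # 12 example bits (made up)
codeword = significant + crc7(significant)
```

At the receiver, a failed check on these twelve bits signals that the frame's most important parameters are unreliable, so the decoder can discard or conceal the frame instead of playing corrupted speech.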
Convolutional Encoding The convolutional coding process protects the Class 1 bits from errors. The convolutional code has 32 states, each with a defined output for a given input, and two output bits are produced for each input bit. The CRC is also included in this calculation. The steps are: (1) The 77 Class 1 bits plus the CRC and tail bits go in (89 bits total). (2) The bits are ordered according to the specification. (3) The generator polynomials are applied. (4) 178 convolutionally coded bits come out.
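A 32-state, rate 1/2 encoder is a five-stage shift register with two parity taps. The following sketch shows that structure turning 89 input bits into 178 output bits. The two generator polynomials and the input bit pattern are assumptions made for illustration; the real generators and bit ordering are defined in the specification.

```python
# Assumed tap patterns over the current bit and five delayed bits.
G0 = (1, 1, 0, 1, 0, 1)   # 1 + D + D^3 + D^5
G1 = (1, 0, 1, 1, 1, 1)   # 1 + D^2 + D^3 + D^4 + D^5

def conv_encode(bits):
    """Rate 1/2 encoder with 5 memory cells (2^5 = 32 states).

    Returns the coded bits and the final register state.
    """
    state = [0, 0, 0, 0, 0]
    out = []
    for b in bits:
        window = [b] + state                               # current bit, then past bits
        out.append(sum(g & w for g, w in zip(G0, window)) % 2)
        out.append(sum(g & w for g, w in zip(G1, window)) % 2)
        state = window[:5]                                 # shift the register
    return out, state

payload = [(i * 7 + 3) % 2 for i in range(84)]   # 77 Class 1 + 7 CRC bits (made-up pattern)
tail = [0] * 5                                   # zeros flush the register
coded, final_state = conv_encode(payload + tail)
```

The five zero tail bits return the register to the all-zero state, which gives the decoder a known end point for each frame.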
Interleaving In the interleaving process, each transmitted slot carries bits from two speech frames, so every frame is spread over two slots to guard against Rayleigh fading and interference. Class 2 bits are mixed with coded Class 1 bits. [Figure: interleaving array, in which x is the previous speech frame and y is the current speech frame]
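The effect of two-slot interleaving can be shown with a simplified scheme. This sketch is not the IS-136 interleaving array, whose layout is defined in the specification; it assumes a plain even/odd split so that the principle is visible. All function names are hypothetical.

```python
def build_slot(prev_frame, cur_frame):
    """One slot: odd-indexed bits of the previous frame alternated with
    even-indexed bits of the current frame (simplified, assumed layout)."""
    prev_half = prev_frame[1::2]
    cur_half = cur_frame[0::2]
    slot = []
    for p, c in zip(prev_half, cur_half):
        slot.extend([p, c])
    return slot

def rebuild_frame(first_slot, second_slot):
    """Recover a frame from the two consecutive slots that carry its halves."""
    even_bits = first_slot[1::2]    # sent as the "current" half of the first slot
    odd_bits = second_slot[0::2]    # sent as the "previous" half of the second slot
    frame = [None] * (len(even_bits) + len(odd_bits))
    frame[0::2] = even_bits
    frame[1::2] = odd_bits
    return frame

# Three 260-bit frames, labelled so that each bit is traceable.
x = [("x", i) for i in range(260)]
y = [("y", i) for i in range(260)]
z = [("z", i) for i in range(260)]

slot_xy = build_slot(x, y)
slot_yz = build_slot(y, z)
recovered_y = rebuild_frame(slot_xy, slot_yz)
```

If a deep fade wipes out one slot entirely, each affected frame loses only every other bit rather than a contiguous run, which leaves the errors in a pattern the convolutional decoder can handle better.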
Other Vocoder Types IS-136 specifies the use of two vocoders and has “hooks” for incorporation of others. VSELP: a “codebook” vocoder operating at 8 kb/s. ACELP: an improvement on VSELP, designed to perform better in the presence of impairments. IS-95 uses QCELP, a “variable rate” codebook vocoder that ranges from 13 kb/s to 1 kb/s (average 4 kb/s) and takes advantage of natural pauses in speech. Research continues into specification of “half rate” vocoders. [Figure: QCELP vocoder on a DSP, in which a 20 ms sample is compared in a feedback loop against the output of the codebook, pitch filter, and formant filter to produce the coded result]
Codebook Development A book of “voices”: speech patterns were sampled (English: more than 60%; Spanish: approximately 38%; other languages). These patterns were cataloged in a codebook of 2 volumes with 156-bit line items.
Codebook Content Each codebook is made up of two volumes (Volume 1 and Volume 2). Each volume has multiple pages. Each page contains multiple line items. Each line item is a bit pattern based on random samples of speech patterns.
Digital Format Elements Each VSELP or ACELP encoded signal consists of two basic elements: the conversation data itself (7.4 kb/s) and the overhead information needed to process the signal (8.8 kb/s).
Summary
Summary [Figure: Ma & Pa Cellular Company network, showing cell sites 1 to 6, the Mobile Switching Center (MSC), and the PSTN, with a call to 972 555 1212]
Industry Contributors The following organizations provided materials and resource support for this module: AT&T Wireless (http://www.attwireless.com); Ericsson (http://www.ericsson.com); Nortel Networks (http://www.nortelnetworks.com); Telcordia Technologies, Inc. (http://www.telcordia.com)
Individual Contributors The following individuals and their organization or institution provided materials, resources, and development input for this module: Mr. Jeff Cobb, Verizon Wireless (http://www.verizonwireless.com); Dr. Philip DiPiazza, Florida Institute of Technology (http://www.fit.edu/); Dr. Kaveh Heidary, Alabama A&M University (http://aamu.edu); Mr. John Wakeman, Award Solutions, Inc. (http://www.awardsolutions.com)

Dvc

  • 1.
    Digital Voice Coding: Vocoding & Techniques (DVC)
  • 2.
    DVC © Copyright2001 Global Wireless Education Consortium All rights reserved. This module, comprising presentation slides with notes, exercises, projects and Instructor Guide, may not be duplicated in any way without the express written permission of the Global Wireless Education Consortium. The information contained herein is for the personal use of the reader and may not be incorporated in any commercial training materials or for-profit education programs, books, databases, or any kind of software without the written permission of the Global Wireless Education Consortium. Making copies of this module, or any portion, for any purpose other than your own, is a violation of United States copyright laws. Trademarked names appear throughout this module. All trademarked names have been used with the permission of their owners .
  • 3.
    DVC Partial supportfor this curriculum material was provided by the National Science Foundation's Course, Curriculum, and Laboratory Improvement Program under grant DUE-9972380 and Advanced Technological Education Program under grant DUE‑9950039. GWEC EDUCATION PARTNERS: This material is subject to the legal License Agreement signed by your institution. Please refer to this License Agreement for restrictions of use.
  • 4.
    Table of ContentsOverview 5 Learning Objectives 7 Pulse Code Modulation 10 Linear Predictive Coders 18 Code Excited Linear Predictive (CELP) Vector Sum Excited Linear Predictive (VSELP) Algebraic Code Excited Linear Predictive (ACELP) Summary 32 Contributors 34
  • 5.
    Overview In thismodule, you will learn about Digital Voice Coding (DVC) techniques that make possible more efficient use of the allotted frequency spectrum for wireless telecommunication.
  • 6.
    Overview (cont.)In this module, you will learn about about Digital Voice Coding (DVC) techniques that makes possible more efficient use of the allotted frequency spectrum for wireless telecommunication.
  • 7.
    Learning Objectives Aftercompleting this module and its activities, you will be able to: List the process for Analog-to-Digital and Digital-to-Analog conversion (A/D, D/A). Describe the methodology, application, and codebooks associated with Coded Excited Linear Prediction (CELP). Describe the methodology, application, and codebooks associated with Vectored Sum Excited Linear Prediction (VSELP). Describe the methodology, application, and codebooks associated with Algebraic Coded Excited Linear Prediction (ACELP).
  • 8.
  • 9.
    Introduction Ma &Pa Cellular Company Mobile Switching Center (MSC) PSTN 972 555 1212 Cell Site 1 2 3 4 5 6
  • 10.
  • 11.
    Analog Signals 0 Figure 1 Figure 2 Time Time +V -V 0 +V -V
  • 12.
    Digitization Nyquist’s theoremis basic to the concept of digital sampling. We must sample at twice the highest frequency of the fundamental , or carrier signal Yeah, Jim, Harry Nyquist here. Yeah, I think I’m onto something. Yeah, well, I think you have to sample at twice the highest frequency . Yeah, I’m pretty sure of it. Yeah, I know your way is cheaper, but it’s distorted. Yeah, distorted. D-I-S-T-O… (sigh) You won’t eat (pop) lunch in this town again, Nyquist (crackle)
  • 13.
    Sampling Voice Signals125 usec. +V -V Sampling Rate = 8,000 sample/sec 0
  • 14.
    A-D Conversion 125usec. Sampling gives us PAM 0 Steps in the voice digitization process: Sampling follows Nyquist theorem Quantization requires sufficient levels for acceptable noise Companding improves efficiency for voice transmission A-law; µ-law At this point, depending on the transmission medium, the bitstreams can be time division, or otherwise, multiplexed; fed through a vocoder to further reduce the bandwidth; and combined with error-recovery codes for robustness. 125 usec. 0001 -0010 Quantization gives us PCM 0 -0001 0010
  • 15.
    Encoding PAM toPCM 125 usec. Sampling gives us PAM 0 Steps in the encoding PAM to PCM: Quantify the amplitude level of the specific pulse. Create an eight bit word that represents that value. Where: 42V = 10101010 5V = 10000101 36V = 10100100 -36V = 01000100 -5V = 00000101 -42V = 00101010 125 usec. 5V -42V Quantization gives us PCM 0 -5V 42V 36V -36V
  • 16.
    Signal to NoiseQuantizing Ratio Where: n = number of bits in the coded word A = amplitude of the signal and that point in time A max = maximum amplitude of the signal
  • 17.
    Companding Companding involvescompressing larger signals in the coding process so that the quantization interval increases as the signal increases. A section of the µ-law compression curve illustrates this process at left. The compressed signal has increasingly greater quantization intervals as the input signal gets stronger. A/D D/A Compression Linear PCM encoder Linear PCM decoder Expansion Compressed digital code words Slope = 1/8 Slope = 1/4 Slope = 1/2 Slope = 1 Linear signal Compressed signal
  • 18.
    Linear Predictive Coders(CELP) (VSELP) (ACELP)
  • 19.
    Codebooks and VocodersXxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
    [Figure: block diagram showing Codebook 1 and Codebook 2 excitation, the long-term filter state, the A(z) synthesis filter, and the spectral postfilter that produces the output speech.]
  • 20.
    Vocoders The basic goal of a vocoder is to encode only the perceptually important aspects of speech, using fewer bits than PCM. Types of vocoders include channel vocoders, formant vocoders, and linear predictive coders. In wireless communications, we are primarily interested in the last class.
  • 21.
    VSELP Vocoder IS-136 systems use a type of code-excited linear predictive vocoder called a VSELP vocoder. [Figure: VSELP decoder block diagram showing Codebook 1 and Codebook 2 excitation, the long-term filter state, the combined excitation ex(n), the A(z) synthesis filter, and the spectral postfilter that produces the output speech.]
  • 22.
    VSELP Vocoder The basic data rate of the speech codec is 7950 bps. There are 159 bits per speech frame (20 ms) for the speech codec, and 159 bits / 0.020 s = 7950 bps. These 159 bits are allocated as shown above.
  • 23.
    ACELP Vocoder [Figure: ACELP encoder block diagram. Blocks: input speech; LPC analysis, quantization, and interpolation; perceptual weighting; open-loop pitch analysis; adaptive codebook; algebraic codebook; gain VQ; MSE search; synthesis filter; MTPX to digital output. Parameter labels as shown: LPC info (19+1), pitch lag (8+5+1), index (16+16), gains (7+7).]
  • 24.
    IS-136 Channel Coding In IS-136, Class 1 bits are convolutionally coded, interleaved with Class 2 bits, and transmitted over two time slots. [Figure: the speech coder output is split into 77 Class 1 bits and 82 Class 2 bits; a 7-bit CRC is computed over the 12 most perceptually significant bits; rate 1/2 convolutional coding yields 178 coded Class 1 bits; the resulting 260 bits pass through the voice cipher and the 2-slot interleaver, which combines speech frames x and y, then y and z.]
  • 25.
    VSELP Bit Class Assignment For the various parameters produced by the VSELP vocoder, bits are classified as Class 1 or Class 2 according to the table at left. Class 1 bits will have convolutional coding applied. Twelve of the Class 1 bits are designated the most perceptually significant and will have a 7-bit CRC computed for them. Class 2 bits are transmitted without any error protection.
  • 26.
    Convolutional Encoding The convolutional coding process takes the Class 1 bits and processes them in order to protect them from error. There are 32 states in the convolutional code, each with a defined output for a given input. For each input Class 1 bit, two output bits are produced. The bits are ordered according to information in the spec. The CRC is also included in this calculation. Processing steps: 77 Class 1 bits plus CRC and tail bits in (89 total); ordering of bits; application of generator polynomials; 178 convolutionally coded bits out.
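The bit counts on this slide can be checked with a small rate 1/2 encoder. The following is a minimal sketch, not the encoder defined in the specification: the generator taps `G0` and `G1` are illustrative assumptions, and the real polynomials and bit ordering must be taken from the spec. Five memory elements give the 32 states mentioned above.

```python
# Illustrative rate 1/2 convolutional encoder with 5 memory bits (32 states).
# G0 and G1 are ASSUMED tap patterns for demonstration only; the actual
# generator polynomials and bit ordering are defined in the specification.
G0 = (1, 1, 0, 1, 0, 1)
G1 = (1, 0, 1, 1, 1, 1)
MEMORY = len(G0) - 1  # 5 delay elements -> 2**5 = 32 states


def conv_encode(bits):
    """Return two output bits for every input bit."""
    state = [0] * MEMORY
    out = []
    for b in bits:
        window = [b] + state  # current bit followed by the stored bits
        out.append(sum(g * w for g, w in zip(G0, window)) % 2)
        out.append(sum(g * w for g, w in zip(G1, window)) % 2)
        state = window[:-1]  # shift register: drop the oldest bit
    return out


CLASS1_BITS, CRC_BITS, TAIL_BITS, CLASS2_BITS = 77, 7, 5, 82
encoder_input = [0] * (CLASS1_BITS + CRC_BITS + TAIL_BITS)  # 89 bits in
coded = conv_encode(encoder_input)                          # 178 bits out
bits_per_frame_on_air = len(coded) + CLASS2_BITS            # 260 bits
```

The five tail bits equal the encoder memory, which is what lets the encoder return to the all-zero state at the end of each frame.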
  • 27.
    Interleaving In the interleaving process, the bits are spread across two speech frames to guard against instances of Rayleigh fading and interference. Class 2 bits are mixed with coded Class 1 bits. In the interleaving array at left, x is the previous speech frame and y is the current speech frame.
  • 28.
    Other Vocoder Types IS-136 specifies the use of two vocoders and has “hooks” for the incorporation of others. VSELP: a “codebook” vocoder operating at 8 kb/s. ACELP: an improvement on VSELP, designed to perform better in the presence of impairments. IS-95 uses QCELP, a “variable rate” codebook vocoder that ranges from 13 kb/s to 1 kb/s (average 4 kb/s) and takes advantage of natural pauses in speech. Research continues into the specification of “half rate” vocoders. [Figure: QCELP vocoder in a DSP: a 20 ms sample is processed through a codebook, pitch filter, and formant filter with a feedback loop to produce the coded result.]
  • 29.
    Codebook Development A Book of “Voices”: Speech patterns were sampled (English: more than 60%; Spanish: approximately 38%; other languages). These patterns were cataloged into a codebook of 2 volumes with 156-bit line items.
  • 30.
    Codebook Content Each codebook is made up of two volumes (Volume 1 and Volume 2). Each volume has multiple pages. Each page contains multiple line items. Each line item is a bit pattern based on random samples of speech patterns.
  • 31.
    Digital Format Elements Each VSELP or ACELP encoded signal consists of two basic elements: the conversation data itself (7.4 kb/s) and the overhead information needed to process the signal (8.8 kb/s).
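The two rates on this slide can be added to give the per-user channel rate. The sketch below is only arithmetic on the slide's figures; the assumption that three full-rate users share one carrier is taken from the notes for this module, and the total is consistent with the commonly quoted 48.6 kb/s gross rate of an IS-136 carrier.

```python
# Per-user and per-carrier bit rates, using the figures on this slide.
# USERS_PER_CARRIER = 3 is an assumption (three full-rate users per carrier).
SPEECH_BPS = 7400      # conversation data
OVERHEAD_BPS = 8800    # overhead needed to process the signal
USERS_PER_CARRIER = 3

per_user_bps = SPEECH_BPS + OVERHEAD_BPS           # 16200 bits/s
carrier_bps = per_user_bps * USERS_PER_CARRIER     # 48600 bits/s
overhead_share = OVERHEAD_BPS / per_user_bps       # a little over half
```

More than half of each user's bits are overhead rather than speech, which illustrates how much of the channel is spent on protection and control.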
  • 32.
  • 33.
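    Summary [Figure: complete wireless telephone system: landline telephone (972 555 1212), PSTN, Mobile Switching Center (MSC) of the Ma & Pa Cellular Company, and Cell Site, with the connections numbered 1 through 6.]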
  • 34.
    Industry Contributors The following organizations provided materials and resource support for this module: AT&T Wireless ( http://www.attwireless.com ), Ericsson ( http://www.ericsson.com ), Nortel Networks ( http://www.nortelnetworks.com ), Telcordia Technologies, Inc. ( http://www.telcordia.com ).
  • 35.
    Individual Contributors The following individuals and their organization or institution provided materials, resources, and development input for this module: Mr. Jeff Cobb, Verizon Wireless, http://www.verizonwireless.com; Dr. Philip DiPiazza, Florida Institute of Technology, http://www.fit.edu/; Dr. Kaveh Heidary, Alabama A&M University, http://aamu.edu; Mr. John Wakeman, Award Solutions, Inc., http://www.awardsolutions.com

Editor's Notes

  • #6 As the world now knows, digital cellular is not synonymous with excellent voice quality. Properly implemented, it can deliver “very good voice quality,” but improving voice quality has never been the reason for switching to digital. There are many valid reasons for converting to a digital cellular system. However, the standards for digital audio and for digital cellular are very different. Digital audio is designed to be recovered from a nearly perfect transmission medium and to faithfully reproduce sounds up to 20 kHz in frequency. Since most cellular calls are going to be carried over a telephone line with a frequency response that cuts off just above 3.5 kHz, there is no need to support 20 kHz transmission. Digital wireless systems have the following advantages: Digital systems allow much more efficient use of radio spectrum. Digital processing techniques allow multiple conversations to share the channel bandwidth that a single analog conversation would occupy; these techniques cannot be employed with analog signals. A digital system is inherently more secure than an analog one. Conventional analog scanners cannot recover digital transmissions, and further encryption is easily effected by scrambling the bits. Digital transmission facilitates the introduction of new features into the cellular system via higher-capacity control channels. There is a great deal of redundancy in speech, just as there is in the written word. Clearly, you could eliminate many of the letters in this paragraph, and even many words, and still convey the same meaning. Converting to digital format allows us to do the computation necessary to remove the redundancies in a speech signal, thereby allowing speech to be sent with fewer bits. This is very difficult, if not impossible, to do with an analog signal. Removing redundancy is the essence of compression.
  • #7 The real issue is that when we remove redundancy, the signal becomes less robust. Since there are fewer bits, each bit means more. If the environment (noise, radio interference, etc.) causes a bit to be lost, the effect on speech quality is much greater than if the speech were uncompressed. Many techniques can be employed to minimize the effects of the environment but, in general, they require adding bits (which is the same as increasing bandwidth). The bottom line is that system designers strive to optimize performance against two conflicting needs: good speech quality and minimum bandwidth. There is an inherent “tradeoff” between speech quality and bandwidth. A number of techniques can be used to improve spectrum efficiency. This module focuses on one class of techniques, namely Digital Voice (or Speech) Coding, to improve spectrum efficiency (or bandwidth utilization). While speech coding is a vast subject and there are many different techniques, this module focuses on the few that are employed in modern digital cellular systems. Pulse Code Modulation (PCM) is one of the earliest voice coding techniques used in “toll” or long distance calling, which strove to put as many calls on a “wire” as possible. Although PCM is not employed in cellular systems for transmitting voice on a radio link, it is a worldwide standard for transmission in wired communications and provides an excellent basis for discussing speech compression techniques. So we will discuss PCM first and then introduce a class of speech coders called Linear Predictive Coders (LPC) that allow much greater spectrum efficiency.
  • #8 After completing this module, you will be able to: List the process for Analog-to-Digital and Digital-to-Analog conversion (A/D, D/A). Describe the methodology, application, and codebooks associated with Code-Excited Linear Prediction (CELP). Describe the methodology, application, and codebooks associated with Vector-Sum Excited Linear Prediction (VSELP). Describe the methodology, application, and codebooks associated with Algebraic Code-Excited Linear Prediction (ACELP).
  • #10 The slide above shows a complete wireless telephone system. Calls can be made from the landline telephone to the cell phone, from the cell phone to the landline phone, or from cell phone to cell phone. The major components of this system are: Landline telephone. Public Switched Telephone Network (PSTN): in this scenario it is the local telephone switch, but it can also represent the worldwide telephone system. The interconnection between the PSTN and the Mobile Switching Center could be either a T-1 (copper cable) or a microwave link. Mobile Switching Center (MSC): the telephone switch that provides the interconnectivity between the PSTN and cellular telephones. Cell Site: the location of the radios and antennas that provide cellular coverage for a specific area. Cellular telephone. The connection we are interested in for this module is from the MSC to the Cell Site. The type of speech encoding employed on this connection depends on the type of cell phone being linked to (analog or digital). In this module we will discuss two types of speech coders: Wave Coders: Wave coders work to reproduce the time waveform of the speech signal as faithfully as possible. They are designed to be source independent and are therefore able to code a wide variety of signals. The wave coder discussed in this module is Pulse Code Modulation (PCM). Vocoders: Voice Coders (Vocoders) achieve a very high level of spectrum efficiency and are based on a priori knowledge about the signal being coded; therefore they are very signal specific. The vocoders discussed in this module are in the Linear Predictive Coder family.
  • #12 Figure 1 is a plot of one cycle of a “sine wave,” a function everyone learned about in high school trigonometry. Here we are depicting the fundamental component of which all sounds (and much more) are composed. Speech, music, noise, etc. are all made up of “sine waves,” or sinusoids. The sinusoid in the picture is just one example of the many different sinusoids that make up any sound. A sine wave can be expressed mathematically as A*sin(2*pi*f*t). In this expression, A is the amplitude, which defines the value of the peaks, and t is time in seconds. The term pi is the constant used to calculate the circumference of a circle. The term f is the frequency of the sine wave: the number of complete cycles that occur in one second. So to depict a sinusoid of 1,000 cycles per second, we would plot a thousand copies of the signal in Figure 1 across a horizontal axis span of 1 second. By convention, engineers abbreviate “cycles per second” as Hertz (in honor of the great scientist), or simply Hz. So a signal of 1,000 cycles per second has a frequency of 1,000 Hz. Human speech is made up of many sinusoids, each having a different amplitude and frequency. Typically, human speech contains component frequencies as low as 300 Hz and may contain frequencies as high as 15 kHz (the k in kHz stands for “kilo,” which means thousand, so this means 15 thousand Hertz). The second figure is an actual sample of human speech. Note how random the variations are. A careful analysis of this signal would list all of the sine waves that are present and that make the signal look the way it does. Each pitch in the speech pattern produces a sinusoid of a different frequency, while the intensity, or volume, of the pitch determines the amplitude of the sinusoid. An analog signal, at an audio frequency, works fine for local telephone service. It is what is used on the wire between a landline telephone and the local telephone office. However, it has proved to be too inefficient for use between telephone offices, where there is a need to send many conversations over a single connection. Digitizing the signal provided the solution.
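The expression A*sin(2*pi*f*t) can be evaluated directly to see how amplitude and frequency shape a waveform. The following is a minimal sketch; the function name `sine_samples` and the choice of evaluating the wave at 8000 points per second are illustrative assumptions, not part of the module.

```python
import math


def sine_samples(amplitude, freq_hz, points_per_second, count):
    """Evaluate A*sin(2*pi*f*t) at t = 0, 1/pps, 2/pps, ..."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / points_per_second)
        for n in range(count)
    ]


# A 1000 Hz tone evaluated 8000 times per second repeats every 8 points.
tone = sine_samples(1.0, 1000, 8000, 16)

# A crude "speech-like" signal: the sum of sinusoids with different
# amplitudes and frequencies (values chosen only for illustration).
low = sine_samples(0.6, 300, 8000, 16)
high = sine_samples(0.3, 2500, 8000, 16)
mixture = [a + b for a, b in zip(low, high)]
```

With these values the tone reaches its positive peak at the third point (one quarter of a cycle) and starts its second cycle at the ninth.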
  • #13 The idea of creating a digital signal, consisting of highs and lows, 1s or 0s, did not originate with just one person. Many attempts had been made, with some degree of success, to digitize the human voice. Most of these attempts did not result in a significantly better quality signal and were quickly abandoned. Digital transmission had the potential to be more noise free than analog because the receiver only has to decide whether a “1” or a “0” was sent. In 1928, Harry Nyquist presented a theory that turned out to be not only valid, but the basis of all digitized voice today. The whole DS-1 format used on T-1 lines is based around Nyquist’s theory. However, as with many innovations, Nyquist’s theory was not readily accepted. Very simply, it says we can sample the analog signal, digitize it at the transmitting end, send the digitized signal, then reassemble it at the receiving end with enough fidelity that it sounds like the original voice. By employing Nyquist’s concept, multiple signals can be sent over a single line that previously would have had to be dedicated to a single caller. The sending of multiple signals over a single line is known as multiplexing.
  • #14 Nyquist derived the minimum sampling frequency required to extract all the information in a time-varying waveform. Nyquist’s theory specifies that to properly code an analog signal of bandwidth “W” with basic Pulse Code Modulation (PCM) techniques, we need to sample the signal at a rate that is at least twice the highest frequency it contains. In telecommunications, we limit the voice transmission to a bandwidth of 250 Hz to 4000 Hz. For voice, band-limited to a nominal 4000 Hz as the highest frequency, we need 8000 samples per second. For an audio waveform sampled at 8000 Hz, the time interval between samples is 125 microseconds (µs). This simply means that regardless of the specific frequency of the moment, we will take a sample every 125 microseconds. Because of this, a sample will usually not fall at the exact beginning or ending of a particular cycle.
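The sampling numbers in this note follow from two short formulas. The sketch below is illustrative (the helper names are assumptions); it also shows why the input must be band-limited: a 5000 Hz tone sampled at 8000 Hz produces exactly the same samples as an inverted 3000 Hz tone, so the two cannot be told apart after sampling.

```python
import math


def nyquist_rate_hz(highest_freq_hz):
    """Minimum sampling rate: twice the highest frequency present."""
    return 2 * highest_freq_hz


def sample_interval_us(sample_rate_hz):
    """Time between successive samples, in microseconds."""
    return 1_000_000 / sample_rate_hz


rate = nyquist_rate_hz(4000)          # 8000 samples per second
interval = sample_interval_us(rate)   # 125.0 microseconds


def sampled_tone(freq_hz, sample_rate_hz, count):
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]


# Aliasing: 5000 Hz is above rate/2 = 4000 Hz, so it folds down to 3000 Hz.
too_high = sampled_tone(5000, rate, 16)
alias = [-s for s in sampled_tone(3000, rate, 16)]
```

This is the reason a band-limiting filter always precedes the sampler in a practical system.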
  • #15 Quantizing The sampling process results in a Pulse Amplitude Modulated (PAM) signal. Note that the resultant waveform appears as a train of amplitude-modulated pulses. PAM is a type of pulse modulation in which the pulse amplitude is proportional to the modulating signal’s amplitude. Each sample is then encoded into a binary value, and the binary digital word is transmitted. This process is known as quantizing. At the receive end, a decoder evaluates each binary value and reconstructs the audio waveform. It’s not a whole lot different from the connect-the-dots game. The device that performs this process is known as a Codec (or sometimes as a quantizer). The name Codec is derived from its function: coder (analog to PCM) and decoder (PCM to analog). The CCITT (Consultative Committee on International Telephone and Telegraph) Recommendation G.711 established two “laws” (recommended standards) for this process. The North American standard for quantizing and decoding a signal's amplitude is µ-Law, while in Europe the A-Law is used. These laws define how many quantizing levels are used and how they are arranged. For our discussion, we will focus on the µ-Law of North America. More information on both laws can be found in Lee, W.C.Y., Mobile Communications Design Fundamentals, New York, NY: Wiley-Interscience.
  • #16 As mentioned earlier in this module, the name codec is derived from its function: coder (analog to PCM) and decoder (PCM to analog). The process is as follows. Coder: The analog signal is sampled 8000 times in one second. These samples result in pulses, spaced 125 microseconds apart, referred to as a PAM signal. An amplitude is identified for each pulse, and an 8-bit word is created to represent each pulse. Example: one pulse = 42 V. 42 V as an eight-bit binary word = 10101010, where the 1st digit on the right has weight 1 (also called the Least Significant Bit, LSB), the 2nd from the right = 2, 3rd = 4, 4th = 8, 5th = 16, 6th = 32, 7th = 64, and the 8th is the Most Significant Bit (MSB), which is used to denote polarity (1 = positive, 0 = negative). Therefore: 10101010 = positive polarity, 32 + 8 + 2 = 42. The Decoder works in the same manner, but in reverse. NOTE: Some texts may show the bits in a layout reversed from what is shown here (i.e., MSB on the right, LSB on the left). This is usually done when tracking the flow direction of bits.
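The 42 V example can be reproduced with a sign-plus-magnitude word. This is a minimal sketch of the uniform 8-bit layout described in the note (one polarity bit plus seven magnitude bits); the function names are illustrative, and real µ-law codewords use a different, companded layout.

```python
def encode_sample(value):
    """Encode an integer amplitude as a polarity bit plus 7 magnitude bits."""
    if not -127 <= value <= 127:
        raise ValueError("magnitude must fit in 7 bits")
    polarity = 1 if value >= 0 else 0        # 1 = positive, 0 = negative
    return format((polarity << 7) | abs(value), "08b")


def decode_word(word):
    """Reverse the encoding: read the polarity bit, then the magnitude."""
    raw = int(word, 2)
    magnitude = raw & 0x7F                   # lower seven bits
    return magnitude if raw & 0x80 else -magnitude


word = encode_sample(42)      # '10101010'  (polarity 1, 32 + 8 + 2)
restored = decode_word(word)  # 42
```

Flipping only the leftmost bit, giving `00101010`, decodes to -42, which shows that the MSB carries polarity and nothing else.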
  • #17 In a uniform PCM system the size of every quantization interval (the resolution) is determined by the Signal to Noise Quantizing Ratio (SNQR) requirement of the lowest signal level to be encoded. Because the interval size is the same at every level, the SNQR increases with the signal amplitude A, as the equation on the slide shows. This produces unneeded quality for large signals and, in general, makes uniform PCM inefficient. A more efficient method is seen in the next slide.
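The dependence on amplitude can be made concrete with the standard textbook result for a sine wave of peak amplitude A and a uniform step size q: signal power A²/2 divided by quantizing noise power q²/12, or about 7.78 + 20·log10(A/q) dB. This sketch assumes that result is the relationship the slide refers to; the function name is illustrative.

```python
import math


def uniform_snqr_db(amplitude, step):
    """Signal-to-quantizing-noise ratio in dB for a sine wave of peak
    amplitude `amplitude` quantized with a uniform step size `step`.

    Assumes signal power A^2/2 and uniformly distributed quantizing
    error with power q^2/12.
    """
    signal_power = amplitude ** 2 / 2
    noise_power = step ** 2 / 12
    return 10 * math.log10(signal_power / noise_power)


small = uniform_snqr_db(1.0, 1.0)     # about 7.78 dB
large = uniform_snqr_db(128.0, 1.0)   # about 49.9 dB
gain_per_doubling = uniform_snqr_db(2.0, 1.0) - small  # about 6.02 dB
```

Every doubling of amplitude adds roughly 6 dB, so a step size chosen to satisfy the quietest talker gives loud signals far more quality than they need.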
  • #18 A more efficient coding procedure is achieved if the quantization intervals are not uniform but are allowed to increase with the sample value. This method is called companding. Companding (compression and expansion) is a PCM technique that compresses the signal using a mathematical algorithm before encoding and expands it again after decoding. With this technique, fewer bits per sample provide a specified SNQR for small signals and an adequate dynamic range for large signals. To implement the technique, successively larger input signal intervals are compressed into constant-length quantization intervals; the larger the sample value, the more it is compressed before encoding. The µ-law companding algorithm is used in North America and Japan, and the A-law algorithm everywhere else. The idea of companding is shown above. The graph indicates the compression that occurs, increasing for larger signals. The curve is approximated by chords of progressively smaller slope, one for each equal quantization interval. Notice on the vertical axis that each interval is of equal value. This is the compressed signal. On the horizontal axis, the second interval is greater than the first by a factor of 2, the third by a factor of 4, and the fourth by a factor of 8. Relative to the first chord (slope = 1), the slopes of the succeeding chords are therefore 1/2, 1/4, and 1/8.
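The compression curve can be sketched with the continuous µ-law formula, F(x) = sgn(x)·ln(1 + µ|x|)/ln(1 + µ) with µ = 255. This is a minimal sketch of the smooth curve for inputs normalized to the range -1 to 1; practical codecs use the segmented (chord) approximation described above rather than evaluating logarithms.

```python
import math

MU = 255  # value used with 8-bit mu-law in North America and Japan


def compress(x):
    """Continuous mu-law compression of a sample in the range [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress(): recover the original sample value."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


quiet = compress(0.01)   # about 0.23: small inputs get a large share of the range
loud = compress(0.5)     # about 0.88
restored = expand(compress(0.3))
```

An input at 1% of full scale lands at roughly 23% of the compressed range, which is how small signals obtain fine resolution without extra bits.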
  • #19 Linear Predictive Coding (LPC) is a particular form of redundancy removal in which a "linear predictor" is created to remove the redundancy due to vocal tract (or other signal generation) effects. LPC is commonly used in speech coders because it allows good tracking of the redundancy introduced by the human vocal tract. While there are various types of Linear Predictive Coders, this discussion will center on the following three: CELP (Code-Excited Linear Predictive coding), VSELP (Vector-Sum Excited Linear Predictive coding), and ACELP (Algebraic Code-Excited Linear Predictive coding). For a more complete analysis of Linear Predictive Coding, see Speech Compression at www.data-compression.com/speech.html.
  • #20 The PCM model works very well for wireline telecommunication. It even works very well for analog cellular. Each 64 kb/s signal per channel of the T-1 was converted back to an analog signal at the cell site via a D/A converter and transmitted as an FM signal on a 30 kHz radio channel. However, this soon proved to be very limiting for cellular in terms of the ability to provide service to a rapidly growing customer base. Only so much of the frequency spectrum could be set aside for cellular and PCS, and it was being consumed at a high rate. Everyone knew that digital was the answer, but the question was which digital methodology was best. After much research and many failures, it was decided to use a method centered on code books, vocoders, and, specifically, linear predictive coders. In the 800 MHz cellular spectrum, this allowed the standard 30 kHz radio channel, which could accommodate only one analog conversation, to be used for three separate digital conversations, a much more efficient use of the spectrum. The linear predictive coder model is based on the ability to remove various redundancies of speech and on the use of code books, as shown above. Consider this analogy: if you wanted to convey information contained on specific pages of a particular document, you could take the time to read those pages aloud to the recipient, or you could reference the location and let the recipient read those pages for himself. The latter would take much less time than the former. This is basically how code books are used. Sending the “addresses” of the speech patterns takes much less spectrum than sending the actual speech patterns. This is discussed in more detail in the following pages.
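The “send the address, not the pages” idea can be sketched as a nearest-match codebook lookup. This is a toy illustration only: the four-entry `CODEBOOK`, the function names, and the squared-error match are assumptions chosen for clarity, and real vocoder codebooks are far larger and are searched through a synthesis filter.

```python
# Toy codebook shared by the sender and the receiver (values are made up).
CODEBOOK = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 0.5, -0.5, -1.0),
    (-1.0, -0.5, 0.5, 1.0),
    (0.5, 1.0, 0.5, 0.0),
]


def best_index(segment):
    """Return the address of the codebook entry closest to the segment."""
    def squared_error(i):
        return sum((c - s) ** 2 for c, s in zip(CODEBOOK[i], segment))
    return min(range(len(CODEBOOK)), key=squared_error)


def bits_for_index(codebook_size):
    """Bits needed to send one codebook address."""
    return (codebook_size - 1).bit_length()


segment = (0.9, 0.4, -0.6, -0.8)       # four speech samples to convey
address = best_index(segment)          # sender transmits only this number
received = CODEBOOK[address]           # receiver looks the pattern up

pcm_bits = len(segment) * 8            # 32 bits as 8-bit PCM samples
address_bits = bits_for_index(len(CODEBOOK))  # 2 bits as an address
```

The receiver recovers an approximation of the segment rather than the exact samples, which is the price paid for sending 2 bits instead of 32.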
  • #21 Channel vocoders were developed beginning in the late 1920s. In the channel vocoder, a bank of bandpass filters separates speech energy into subbands that are full-wave rectified and filtered to determine relative power levels. The resulting levels are encoded and transmitted to the destination. “Voiced or unvoiced” speech excitation and pitch are also measured and are used to synthesize speech in the decoder via a frequency domain model of the vocal tract transfer function. Formant vocoders depend on the fact that speech energy tends to be concentrated at three or four peaks called formants. The formant vocoder determines the location and amplitude of the formants and transmits that information instead of the entire spectrum envelope. Thus a lower bit rate can be used. Linear predictive coders extract perceptually significant features of speech directly from a time waveform rather than from a frequency spectrum, as the previous two types do. A time-varying model of the vocal tract excitation and transfer function is created, and a synthesizer in the receiving terminal recreates the speech by passing the specified excitation through a mathematical model of the vocal tract. It is the linear predictive coder that we will focus on in this module.
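How a linear predictor is obtained from a time waveform can be sketched with the autocorrelation method and the Levinson-Durbin recursion, a common textbook approach. This is a minimal sketch, not the analysis procedure of any particular standard; the function names and the test signal are illustrative assumptions.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] such that
    x_hat[n] = a[1]*x[n-1] + ... + a[order]*x[n-order].

    Returns (coefficients, residual_energy).
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error                      # reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1 - k * k)                 # energy left after prediction
    return a[1:], error


# Test signal: each sample is 0.9 times the previous one, so a first-order
# predictor should recover a coefficient close to 0.9.
signal = [0.9 ** n for n in range(200)]
r = autocorrelation(signal, 1)
coeffs, residual = levinson_durbin(r, 1)
```

The residual energy is much smaller than the signal energy `r[0]`; it is this low-energy residual (the excitation) that codebook-based coders then represent.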
  • #22 The speech coding algorithm described in IS-136, the most current standard in use, is a member of a class of speech codecs known as Code Excited Linear Predictive coding (CELP). These coders use “codebooks” to vector quantize the excitation (residual) signal. The speech coding algorithm is a variation on CELP called Vector-Sum Excited Linear Predictive coding (VSELP). VSELP uses a codebook with a predefined structure so that the computations required for the codebook search process can be reduced. The VSELP speech coder, which is defined in detail in IS-136A.2, has a bit rate of 7950 bps. The coder breaks speech into frames; each frame is 20 ms long and contains 160 samples. Each frame is further divided into subframes 40 samples (5 ms) long. At the base station end, speech is already digitized, coming from the Public Switched Telephone Network (PSTN). At the mobile station, however, analog speech must be converted to uniform PCM. Preceding the speech coder are these voice processing stages: level adjustment, bandpass filtering, and analog-to-digital (A/D) conversion. Note in the diagram above that the first part of the decoder generates pulse excitation and the second part synthesizes speech waveforms. The various parameters shown are transmitted for each 20 ms frame. Recall that the whole reason for this approach is that transmitting the parameters requires less bandwidth than transmitting PCM.
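The frame figures in this note are tied together by simple arithmetic. The sketch below uses the 8000 Hz sampling rate and the 159 bits per 20 ms frame given on the VSELP slides; the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 8000      # PCM sampling rate
FRAME_MS = 20              # speech frame duration
SUBFRAME_SAMPLES = 40      # samples per subframe
BITS_PER_FRAME = 159       # VSELP bits per frame (from the VSELP slide)

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000           # 160
subframes_per_frame = samples_per_frame // SUBFRAME_SAMPLES     # 4
subframe_ms = SUBFRAME_SAMPLES * 1000 / SAMPLE_RATE_HZ          # 5.0
vselp_bps = BITS_PER_FRAME * 1000 // FRAME_MS                   # 7950

pcm_bps = SAMPLE_RATE_HZ * 8                                    # 64000
compression_ratio = pcm_bps / vselp_bps                         # about 8
bits_per_sample = BITS_PER_FRAME / samples_per_frame            # about 1
```

VSELP therefore spends just under one bit per speech sample, compared with eight bits per sample for 64 kb/s PCM.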
  • #23 Delays inherent in the air interface specification may exceed 100 ms, so echo cancellation is necessary in the design of IS-136 systems. The function of the bandpass filter mentioned previously is to avoid aliasing distortion of the input signal. The attenuation of the filter complies with the ITU Red Book G.714 specifications. The A/D function produces digitized audio in one of two ways: (1) by direct conversion from analog to a uniform PCM format with a minimum resolution of 13 bits, or (2) by converting analog to an 8-bit µ-law format, followed by a µ-law-to-uniform code conversion. In the second case, the A/D conversion is based on the standard 8-bit µ-law codec specified in ITU Red Book G.711. The mobile station echo return loss must have a minimum value of 45 dB. This requirement must be met by all types of mobile stations at their nominal volume setting.
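The µ-law-to-uniform conversion mentioned above can be sketched as follows. This is a minimal illustration of the commonly published G.711 µ-law expansion (bias of 132, output on a 16-bit scale). The function name is hypothetical, and how the result is scaled to the speech coder's input is an assumption not covered in these notes.

```python
BIAS = 0x84  # 132, the bias used in the commonly published mu-law expansion


def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit mu-law code word to a uniform (linear) PCM sample."""
    code = ~code & 0xFF              # mu-law code words are transmitted inverted
    sign = code & 0x80               # top bit: sign
    exponent = (code >> 4) & 0x07    # next three bits: segment number
    mantissa = code & 0x0F           # low four bits: step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    for word in (0x00, 0x7F, 0x80, 0xFF):
        print(hex(word), ulaw_to_linear(word))
```

The step size doubles from one segment to the next, which is how 8 bits can span roughly the dynamic range of 13 to 14 bit uniform PCM.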
  • #24 The VSELP algorithm has served North American cellular for several years, but it suffers from some deficiencies. In the presence of interference, artifacts occur that distort the reproduced speech. Designers of IS-54 (the forerunner of IS-136) found that the C/I ratio required for adequate speech is at least that of AMPS, which is to say 18 dB. IS-136 includes provision for the phones and base stations to recognize the existence of other vocoder types. The first of these is the one specified in IS-641, the ACELP (Algebraic Code-Excited Linear Predictive) vocoder. While still a nominally 8 kb/s vocoder, like VSELP, ACELP was designed to be more robust in the presence of co-channel interference. Tests demonstrated at recent CTIA meetings have shown considerable improvement with the ACELP implementation, and new phones are incorporating it in their designs. IS-641 is a North American standard for encoding 64 kb/s µ-law/A-law sampled speech signals down to 7.4 kb/s for transmission over IS-136A digital cellular channels. IS-641 provides approximately 4 kHz of speech bandwidth and has an algorithmic delay of 25 ms. It encodes each frame of 160 linear PCM samples into 148 bits, which are packed into ten 16-bit code words. A good description of an ACELP implementation is contained in a paper by Salami et al., “A Toll Quality 8 kb/s Speech Codec for the Personal Communications System (PCS),” in the August 1994 IEEE Transactions on Vehicular Technology.
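The IS-641 figures above imply that 148 coded bits every 20 ms give 7.4 kb/s, and that ten 16-bit words leave 12 spare bit positions. The sketch below illustrates one way such packing could work. The bit ordering (most significant bit first) and the zero padding at the end are assumptions made for this example only, and `pack_frame` is a hypothetical name; the actual layout is defined by the standard.

```python
FRAME_BITS = 148   # coded bits per 20 ms frame
WORD_BITS = 16     # bits per code word
N_WORDS = 10       # code words per frame


def pack_frame(bits):
    """Pack a list of 0/1 values into ten 16-bit words.

    Assumed layout for illustration: MSB first, unused trailing bits set to 0.
    """
    capacity = WORD_BITS * N_WORDS
    if len(bits) > capacity:
        raise ValueError("too many bits for one frame")
    padded = list(bits) + [0] * (capacity - len(bits))
    words = []
    for w in range(N_WORDS):
        value = 0
        for bit in padded[w * WORD_BITS:(w + 1) * WORD_BITS]:
            value = (value << 1) | bit
        words.append(value)
    return words


bit_rate_bps = FRAME_BITS * 1000 // 20              # coded bits per second
spare_bits = WORD_BITS * N_WORDS - FRAME_BITS       # unused positions per frame

if __name__ == "__main__":
    print(bit_rate_bps, spare_bits)
    print([hex(w) for w in pack_frame([1] * FRAME_BITS)])
```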
  • #25 IS-136 uses three schemes to protect VSELP speech data against channel errors. The first uses a rate one-half convolutional code to protect the more vulnerable bits of the speech codec data stream. The second interleaves the transmitted data for each speech codec frame over two time slots to protect against Rayleigh fading. The third applies a cyclic redundancy check (CRC) over some of the most perceptually significant bits of the speech codec output. After error correction is applied at the receiver, these CRC bits are checked to see whether the most perceptually significant bits were received properly.
  • #26 The first step in the error correction process is the separation of the 159-bit speech codec frame’s information into class 1 and class 2 bits. There are 77 class 1 bits and 82 class 2 bits in the 159-bit speech codec frame. Convolutional coding is applied to the 77 class 1 bits. A 7-bit CRC is used for error detection purposes and is computed over the 12 most perceptually significant bits of the class 1 bits for each frame. Class 2 bits are transmitted without any error protection.
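A 7-bit CRC over 12 bits can be sketched as a polynomial division over GF(2). The notes do not state the generator polynomial, so the one used here (1 + X + X^2 + X^4 + X^5 + X^7) is an assumption to be checked against the specification, and the function names are hypothetical.

```python
# Assumed generator polynomial, coefficients listed from X^7 down to X^0.
ASSUMED_CRC_POLY = [1, 0, 1, 1, 0, 1, 1, 1]


def poly_mod(bits, poly):
    """Remainder of the bit sequence (highest power first) divided by poly."""
    reg = list(bits)
    degree = len(poly) - 1
    for i in range(len(reg) - degree):
        if reg[i]:
            for j, coeff in enumerate(poly):
                reg[i + j] ^= coeff
    return reg[-degree:]


def crc7(message_bits, poly=ASSUMED_CRC_POLY):
    """CRC bits for the message: remainder of message * X^7 divided by poly."""
    return poly_mod(list(message_bits) + [0] * (len(poly) - 1), poly)


def crc_ok(message_bits, crc_bits, poly=ASSUMED_CRC_POLY):
    """Receiver check: message followed by its CRC must divide evenly."""
    return not any(poly_mod(list(message_bits) + list(crc_bits), poly))


if __name__ == "__main__":
    significant = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]  # 12 example bits
    check = crc7(significant)
    print(check, crc_ok(significant, check))
```

If any of the 12 protected bits is flipped after error correction, the division leaves a nonzero remainder and the receiver knows the frame is suspect.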
  • #27 Of the 89 bits input to the convolutional coder, 77 are class 1 bits. The remaining 12 bits are the 7 CRC bits for the frame and 5 tail bits, which are set to zero. The bits are arranged in an array as defined in the specification. The convolutional encoding uses a rate 1/2, memory order 5 code (R = 1/2, m = 5). The code has five memory elements and therefore 32 states. The notation for the generator polynomials g_0(D) and g_1(D) that follow is defined by Shu Lin and Daniel Costello in Error Control Coding: Fundamentals and Applications, Prentice-Hall, April 1983, page 330. The polynomials are: g_0(D) = 1 + D + D^3 + D^5 and g_1(D) = 1 + D^2 + D^3 + D^4 + D^5. The output from the convolutional coder alternates between these two polynomials, with the g_0(D) output first in each time slot. The free coefficient in the above equations is the most significant bit (MSB). Initially the encoder’s memory elements are cleared. The encoder starts at state 0, and the bits in the class 1 buffer are read in order from 0 through 88. The outputs from g_0(D) and g_1(D) are referred to as cc_0[i] and cc_1[i], respectively. For each input bit CL1[i], the two output bits cc_0[i] and cc_1[i] are produced. The order i in which the bits are placed into CL1[i] is given in a table in the specification.
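The rate 1/2, memory order 5 encoder can be sketched as a shift register with two sets of taps. This is an illustrative sketch, not the specification's reference code: the function name is hypothetical, the taps are read from g_0(D) = 1 + D + D^3 + D^5 and g_1(D) = 1 + D^2 + D^3 + D^4 + D^5 with D^k meaning "the input delayed by k bits", and the ordering of bits within the 89-bit input buffer is left to the specification.

```python
# Tap vectors: entry k is the coefficient of D^k (input delayed by k bits).
G0 = (1, 1, 0, 1, 0, 1)   # 1 + D + D^3 + D^5
G1 = (1, 0, 1, 1, 1, 1)   # 1 + D^2 + D^3 + D^4 + D^5


def conv_encode(bits):
    """Encode a bit list; each input bit yields cc0 then cc1."""
    state = [0] * 5                      # five memory elements, cleared at start
    out = []
    for b in bits:
        window = [b] + state             # window[k] is the input delayed by k
        cc0 = sum(g & w for g, w in zip(G0, window)) % 2
        cc1 = sum(g & w for g, w in zip(G1, window)) % 2
        out.extend((cc0, cc1))           # g0 output first, then g1
        state = window[:5]               # shift the register by one position
    return out


if __name__ == "__main__":
    # An impulse reads the tap vectors back out, interleaved as cc0, cc1.
    print(conv_encode([1, 0, 0, 0, 0, 0]))
    # 89 input bits produce 178 coded bits.
    print(len(conv_encode([0] * 89)))
```

The 178 coded bits plus the 82 unprotected class 2 bits make 260 bits per speech frame, which matches the locations 0 through 259 used by the interleaving array.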
  • #28 Before transmission, VSELP-encoded speech data is interleaved over two time slots with the speech data from adjacent speech frames. Each time slot thus contains information from two speech codec frames. The speech data is placed into a rectangular interleaving array as shown above, entered column-wise. The two speech frames are referred to as x and y, where x is the previous speech frame and y is the present (most recent) speech frame. The data, which may be encrypted or plain text, is placed into the interleaving array in a way that intermixes the class 2 bits from the speech codec with the convolutionally coded class 1 bits. The class 2 bits are placed sequentially into the array and occupy the following numbered locations: 0, 26, 52, 78; 93 through 129; 130, 156, 182, 208; and 223 through 259. The class 1 bits occupy the rest of the interleaving array and are also placed sequentially.
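The location list above can be checked and applied with a short sketch. It only models where the class 1 and class 2 bits of one frame land among the 260 numbered locations. The function names are hypothetical, and the way frames x and y share the array across two time slots is not modeled here.

```python
ARRAY_SIZE = 260   # locations 0 through 259


def class2_positions():
    """Locations listed in the notes for the class 2 bits, in placement order."""
    positions = []
    for base in (0, 130):
        positions += [base + 26 * k for k in range(4)]   # 0, 26, 52, 78 (+130)
        positions += list(range(base + 93, base + 130))  # 93..129 (+130)
    return positions


def fill_array(class1_coded, class2):
    """Place class 2 bits at their listed locations and class 1 bits in the rest."""
    c2_pos = class2_positions()
    c2_set = set(c2_pos)
    c1_pos = [p for p in range(ARRAY_SIZE) if p not in c2_set]
    if len(class1_coded) != len(c1_pos) or len(class2) != len(c2_pos):
        raise ValueError("expected 178 coded class 1 bits and 82 class 2 bits")
    array = [None] * ARRAY_SIZE
    for pos, bit in zip(c2_pos, class2):
        array[pos] = bit
    for pos, bit in zip(c1_pos, class1_coded):
        array[pos] = bit
    return array


if __name__ == "__main__":
    print(len(class2_positions()))   # number of class 2 locations
```

The list yields exactly 82 locations, one per class 2 bit, and leaves 178 locations for the convolutionally coded class 1 bits.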
  • #29 Summarizing, a codebook vocoder stores a collection of arbitrary waveform segments (a set of digitized vocal patterns) in digital form. QCELP (Qualcomm Code Excited Linear Predictive), shown above, is technically not part of this module; however, it is another vocoder that uses basically the same codebook approach as the others. It is the one used by CDMA, so it is worth mentioning here. For VSELP, within each 20 ms frame, the vocoder uses previous samples to find the code representation that most closely approximates the sampled signal. The pitch filter models the periodic pulse train coming from the vocal cords during voiced speech. A formant filter models the characteristics of the vocal tract; its resonant frequencies are near those imposed on the original speech by the vocal tract. The vocoder sends instructions concerning lookup of patterns to the receiver, causing the remote vocoder to recreate the speech.
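The decoder chain described above (codebook excitation, then pitch filter, then formant filter) can be sketched with two simple recursive filters. This is a generic illustration of the idea, not the VSELP or QCELP algorithm: the function name, the single-tap pitch filter, the sign convention of the predictor coefficients, and all parameter values are assumptions for the example.

```python
def synthesize(excitation, pitch_lag, pitch_gain, lpc):
    """Pass an excitation sequence through a pitch filter and a formant filter.

    pitch filter:   p[n] = e[n] + pitch_gain * p[n - pitch_lag]
    formant filter: s[n] = p[n] + sum_k lpc[k-1] * s[n - k]
    """
    # Long-term (pitch) filter: re-injects the signal one pitch period later.
    pitched = []
    for n, e in enumerate(excitation):
        past = pitched[n - pitch_lag] if n >= pitch_lag else 0.0
        pitched.append(e + pitch_gain * past)

    # Short-term (formant) filter: all-pole model of the vocal tract.
    speech = []
    for n, p in enumerate(pitched):
        acc = p
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * speech[n - k]
        speech.append(acc)
    return speech


if __name__ == "__main__":
    # A single pulse repeated every 3 samples with decaying amplitude.
    print(synthesize([1, 0, 0, 0, 0, 0, 0], pitch_lag=3, pitch_gain=0.5, lpc=[]))
```

Only the excitation index, the pitch lag and gain, and the filter coefficients need to be sent, which is why the parameter stream is so much smaller than the PCM waveform.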
  • #30 The codebook is a real, two-volume book that contains samplings of speech patterns. These patterns have been cataloged and are used in the coding and decoding (VSELP, ACELP) of the PCM signal. Random samples were made by interviewing and recording thousands of people from different locations saying the same phrases. The idea is based on the concept that all speech is made up of sound bites such as Th, St, Ku, Qu, Wa, To, and so on. There are literally thousands upon thousands of different sounds that are used to make words. If these sounds could be captured and cataloged, then all we would have to do is reference the catalog and recreate the same sequence of sounds at the other end, and the same words should appear.
  • #31 Each volume of the codebook contains hundreds of pages, and each page contains hundreds of line items. This works much like an instructor referencing a textbook that every student has. The instructor can tell the students to go to chapter 5, page 10, paragraph 4, sentence 3. Assuming they all have the same edition of the book, they will all be reading exactly the same thing. At the transmitting end, the vocoder follows this process: (1) Strip off the first 156 bits of the PCM signal as it enters the vocoder. (2) Identify the location, in the codebook, of the bit pattern that most closely resembles the stripped-off bits. (3) Encode the address of each line in eight-bit words using VSELP or ACELP. (4) Send via the chosen modulation method. Each 64 kb/s channel is now represented by 7.4 kb/s. At the receiving end, the vocoder follows this process: (1) Receive the 7.4 kb/s encoded data. (2) Read the addresses. (3) Go to the correct volume and locate the identified line item. (4) Reassemble a 64 kb/s PCM bit pattern. (5) Convert to an analog signal for the human ear.
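The lookup idea in the steps above (send the address of the closest entry, not the waveform itself) can be sketched as a nearest-match search. The four-entry codebook, the squared-error distance, and the function names are all invented for this illustration; real VSELP and ACELP codebooks and search procedures are far more elaborate.

```python
# A toy codebook shared by transmitter and receiver (hypothetical entries).
CODEBOOK = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 1.0, 1.0, 1.0),
    (1.0, -1.0, 1.0, -1.0),
    (2.0, 0.0, -2.0, 0.0),
]


def encode_segment(segment, codebook=CODEBOOK):
    """Transmitter: return the address of the entry closest to the segment."""
    def squared_error(index):
        return sum((c - s) ** 2 for c, s in zip(codebook[index], segment))
    return min(range(len(codebook)), key=squared_error)


def decode_address(address, codebook=CODEBOOK):
    """Receiver: look the address up in its own copy of the codebook."""
    return codebook[address]


if __name__ == "__main__":
    address = encode_segment((0.9, 1.2, 1.0, 0.8))
    print(address, decode_address(address))
```

Four samples are replaced by one small address, at the cost of reproducing an approximation rather than the exact input.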
  • #32 At this point, there is still one question remaining. If we can represent one 64 kb/s conversation with a 7.4 kb/s signal, then why can we only get three conversations in the 64 kb/s channel? To explain this, we must first go back to the analog transmitted signal. In analog, it is not necessary for the human ear to hear every single piece of the signal because our minds will fill in the blanks. In cellular we make use of this phenomenon to send control information back and forth between the cell phone and the cell site. We use a method called blank and burst. We literally blank out the conversation, extremely briefly, and burst in the control messages. This period is so short that, in most cases, it is never detected by the human ear. However, digital is much more sensitive and cannot handle even these momentary lapses. Digital also requires a great deal more control time than analog. Therefore, the control, or overhead, messages are incorporated into the signal itself. This overhead needs approximately 8.8 kb/s. When added to the 7.4 kb/s of coded speech, this gives 16.2 kb/s per conversation. That means that in a 64 kb/s channel we can fit three conversations, but not quite four.
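The capacity argument above is simple integer arithmetic. The sketch below works in bits per second to avoid floating-point rounding; the variable names are made up for this example and the figures are the approximate ones quoted in the notes.

```python
PCM_CHANNEL_BPS = 64_000   # one PCM channel
SPEECH_BPS = 7_400         # coded speech per conversation
OVERHEAD_BPS = 8_800       # approximate control/overhead per conversation

per_conversation_bps = SPEECH_BPS + OVERHEAD_BPS
conversations = PCM_CHANNEL_BPS // per_conversation_bps          # whole conversations that fit
unused_bps = PCM_CHANNEL_BPS - conversations * per_conversation_bps

if __name__ == "__main__":
    print(per_conversation_bps, conversations, unused_bps)
```

The leftover capacity is a little short of what a fourth conversation would need, which is the "not quite four" in the text.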
  • #34 In the slide above, this path is followed: (1) The link from the telephone to the Public Switched Telephone Network (PSTN) is analog. (2) The PSTN converts the analog voice to Pulse Code Modulation (PCM) using an A/D converter. (3) The PSTN sends the 64 kb/s PCM signal to the MSC on a T-1 (microwave could also be used). Each T-1 channel carries one conversation, and there are 24 channels per T-1. (4) At the MSC, the 64 kb/s signal is encoded using the VSELP or ACELP vocoder and sent to the cell site. Now there are three conversations in each T-1 channel. (5) At the cell site, the encoded message is transmitted to the appropriate cell phone. (6) At the cell phone, the message is decoded back to PCM and then converted back to analog. From the cell phone back to the land phone, the process is simply reversed.