[Slide 16: Signal-to-Noise Quantizing Ratio equation, where n = number of bits in the coded word, A = amplitude of the signal at that point in time, and Amax = maximum amplitude of the signal.]
21. VSELP Vocoder. IS-136 systems use a type of code-excited, linear predictive vocoder called a VSELP vocoder. [Block diagram: the outputs of Codebook 1, Codebook 2, and the long-term filter state are each scaled by a gain and summed to form the excitation ex(n), which drives the A(z) synthesis filter and a spectral postfilter to produce the output speech.]
22. VSELP Vocoder. The basic data rate of the speech codec is 7950 bps. There are 159 bits per speech frame (20 ms) for the speech codec. These 159 bits are allocated as shown above.
23. ACELP Vocoder. [Block diagram: input speech passes through LPC analysis, quantization, and interpolation, plus perceptual weighting; open-loop pitch analysis drives an adaptive codebook search, and an algebraic codebook and gain VQ are searched by minimizing the perceptually weighted mean-square error through the synthesis filter. Transmitted parameters per frame: LPC info (19+1 bits), pitch lag (8+5+1 bits), codebook index (16+16 bits), and gains (7+7 bits).]
Digital Voice Coding Student Notes. Global Wireless Education Consortium. Partial support for this curriculum material was provided by the National Science Foundation's Course, Curriculum, and Laboratory Improvement Program under grant DUE-9972380 and Advanced Technological Education Program under grant DUE-9950039. GWEC EDUCATION PARTNERS: This material is subject to the legal License Agreement signed by your institution. Please refer to this License Agreement for restrictions on use.
As the world now knows, digital cellular is not synonymous with excellent voice quality. Properly implemented, it can be described as having "very good voice quality," but the reasons for switching to digital were never about improving voice quality. There are many valid reasons for converting to a digital cellular system. However, the standards for digital audio and for digital cellular are very different. Digital audio is designed to be recovered from a nearly perfect transmission medium and to faithfully reproduce sounds up to 20 kHz in frequency. Since most cellular calls are going to be packed into a telephone line with a frequency response that cuts off just above 3.5 kHz, there is no need to support 20 kHz transmission. Digital wireless systems have the following advantages: Digital systems allow much greater efficiency in the use of radio spectrum. Many digital processing techniques can be used to allow multiple conversations within the same channel bandwidth that a single analog conversation would occupy; these techniques cannot be employed with analog signals. A digital system is inherently more secure than an analog one. Scanners cannot recover digital transmissions, and further encryption is easily effected by scrambling the bits. Digital transmission facilitates the introduction of new features into the cellular system via higher-capacity control channels. There is a great deal of redundancy in speech, just as there is in the written word. Clearly you could eliminate many of the letters in this paragraph, and even many words, and still convey the same meaning. Converting to digital format allows us to do the computation necessary to remove the redundancies in a speech signal, thereby allowing speech to be sent with fewer bits. This is very difficult, if not impossible, to do with an analog signal. Removing redundancy is the same as compression.
The real issue is that when we remove redundancy, the signal is less robust. Since there are fewer bits, each bit means more. So if the environment (noise, radio interference, etc.) causes a bit to be lost, it has a much greater effect on speech quality than if the speech were uncompressed. There are many techniques that can be employed to minimize the effects of the environment, but in general they require adding bits (which is the same as increasing bandwidth). The bottom line is that system designers strive to optimize performance against two conflicting needs: good speech quality and minimum bandwidth. There is an inherent "tradeoff" between speech quality and bandwidth. There are a number of techniques that can be used to improve spectrum efficiency. This module will focus on one class of techniques, namely Digital Voice (or Speech) Coding, to improve spectrum efficiency (or bandwidth utilization). While speech coding is a vast subject and there are many different techniques, this module will focus on the few that are employed in modern digital cellular systems. Pulse Code Modulation (PCM) is one of the earliest voice coding techniques, used in "toll" or long distance calling, where the goal was to put as many calls on a "wire" as possible. Although PCM is not employed in cellular systems for transmitting voice on a radio link, it is a worldwide standard for transmission in wired communications and provides an excellent basis for discussing speech compression techniques. So we will discuss PCM first and then introduce a class of speech coders called Linear Predictive Coders (LPC) that allow much greater spectrum efficiency.
After completing this module, you will be able to: List the process for Analog-to-Digital and Digital-to-Analog conversion (A/D, D/A). Describe the methodology, application, and codebooks associated with Code-Excited Linear Prediction (CELP). Describe the methodology, application, and codebooks associated with Vector-Sum Excited Linear Prediction (VSELP). Describe the methodology, application, and codebooks associated with Algebraic Code-Excited Linear Prediction (ACELP).
The slide above shows a complete wireless telephone system. Calls can be made from the landline telephone to the cell phone, from the cell phone to the landline phone, or from cell phone to cell phone. The major components of this system are: Landline telephone. Public Switched Telephone Network (PSTN) – in this scenario the local telephone switch, but it can also represent the worldwide telephone system. The interconnection between the PSTN and the Mobile Switching Center, which could be either a T-1 (copper cable) or a microwave link. Mobile Switching Center (MSC) – the telephone switch that provides the interconnectivity between the PSTN and cellular telephones. Cell site – the location of the radios and antennas that provide cellular coverage for a specific area. Cellular telephone. The connection we are interested in for this module is from the MSC to the cell site. The type of speech encoding employed on this connection depends on the type of cell phone being linked to (analog or digital). In this module we will be discussing two types of speech coders: Wave coders – wave coders work to reproduce the time waveform of the speech signal as faithfully as possible; they are designed to be source independent and are therefore able to code a wide variety of signals. The wave coder discussed in this module will be Pulse Code Modulation (PCM). Vocoders – voice coders (vocoders) achieve a very high level of spectrum efficiency and are based on prior knowledge about the signal being coded; they are therefore very signal specific. The vocoders discussed in this module will be in the linear predictive coder family.
Figure 1 is a plot of one cycle of a "sine wave," a function everyone learned about in high school trigonometry. Here we are depicting the fundamental component of which all sounds (and much more) are composed. Speech, music, noise, etc. are all made up of "sine waves," or sinusoids. The sinusoid in the picture is just one example of the many different sinusoids that make up any sound. A sine wave can be expressed mathematically as A*sin(2*pi*f*t). In this expression, "A" is the amplitude, which defines the value of the peaks. The term pi, of course, is the constant you used when calculating the circumference of a circle in middle school. The term f is the frequency of the sine wave: specifically, the number of complete cycles that occur in one second. So to depict a sinusoid of 1,000 cycles per second, we would plot a thousand of the cycles in Figure 1 over a horizontal axis value of 1 second. By convention, engineers abbreviate "cycles per second" with the term Hertz (in honor of the great scientist), or simply Hz. So a frequency of 1,000 cycles per second is stated as 1,000 Hz. Human speech is made up of many sinusoids, each having a different amplitude and frequency. Typically, human speech contains component frequencies as low as 300 Hz and may contain frequencies as high as 15 kHz (the k in kHz stands for "kilo," which means thousand, so this means 15 thousand Hertz). The second figure is an actual sample of human speech. Note how random the variations are. If one performed a careful analysis of this signal, one could list all of the sine waves that combine to make the signal look the way it does. Each pitch in the speech pattern produces a sinusoid of a different frequency, while the intensity, or volume, of the pitch determines the amplitude of the sinusoid. An analog signal, at an audio frequency, works fine for local telephone service.
It is what is used on the wire between a landline telephone and the local telephone office. However, it has proved to be too inefficient for use between telephone offices, where there is a need to send many conversations over a single connection. Digitizing the signal provided the solution.
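The expression A*sin(2*pi*f*t) discussed above can be evaluated numerically; a minimal sketch, with arbitrary values chosen for A and f:

```python
import math

A = 1.0      # amplitude: the value of the peaks
f = 1000.0   # frequency in Hz (cycles per second)

def sine_sample(t):
    """Value of A*sin(2*pi*f*t) at time t seconds."""
    return A * math.sin(2 * math.pi * f * t)

# One full cycle of a 1000 Hz tone lasts 1/f = 1 ms; the waveform
# starts at zero and reaches its peak a quarter-cycle later.
print(sine_sample(0.0))
print(sine_sample(1 / (4 * f)))
```

Summing many such sinusoids with different amplitudes and frequencies is exactly how a complex sound like speech is built up.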
The idea of creating a digital signal, consisting of highs and lows, 1s or 0s, did not originate with just one person. Many attempts had been made, with some degree of success, to digitize the human voice. Most of these attempts did not result in a significantly better quality signal and were quickly abandoned. Digital promised to be more noise-immune than analog, because the receiver only needs to decide whether a "1" or a "0" was sent. In 1933, Harry Nyquist presented a theory that turned out to be not only valid, but the basis of all digitized voice today. The whole DS-1 format used on T-1 lines is based around Nyquist's theory. However, like the work of many innovators, Nyquist's theory was not readily accepted. Very simply, he said we could sample the analog signal, digitize it at the transmitting end, send the digitized signal, then reassemble it at the receiving end with enough fidelity that it would sound like the original voice. By employing Nyquist's concept, multiple signals could be sent over a single line that previously would have had to be dedicated to a single caller. The sending of multiple signals over a single line is known as multiplexing.
Nyquist derived the minimum sampling frequency required to extract all the information in a time-varying waveform. Nyquist's theory specifies that to properly code an analog signal of bandwidth "W" with basic Pulse Code Modulation (PCM) techniques, we need to sample the signal at a rate that is at least twice the highest frequency present. In telecommunications, we limit the voice transmission to a bandwidth of 250 Hz to 4000 Hz. For voice, band-limited to a nominal 4000 Hz as the highest frequency, we need 8000 samples per second. For an audio waveform sampled at 8000 Hz, the time interval between samples is 125 microseconds (µs). This simply means that regardless of the specific frequency of the moment, we will take a sample every 125 microseconds. Because of this, there is a strong possibility that a sample will not fall at the exact beginning or end of a particular cycle.
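The sampling arithmetic above can be verified in a couple of lines:

```python
BANDWIDTH_HZ = 4000              # nominal highest voice frequency
sample_rate = 2 * BANDWIDTH_HZ   # Nyquist: at least twice the highest frequency
interval_us = 1_000_000 / sample_rate  # spacing between samples, in microseconds

print(sample_rate)   # samples per second
print(interval_us)   # microseconds between samples
```

This yields the 8000 samples per second and 125 µs spacing quoted in the text.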
Quantizing. The sampling process results in a Pulse Amplitude Modulated (PAM) signal. Note that the resultant waveform appears as a train of amplitude-modulated pulses. PAM is a type of pulse modulation in which the pulse amplitude is proportional to the modulating signal's amplitude. Each sample is then encoded into a binary value, and the binary digital word is transmitted. This process is known as quantizing. At the receive end, a decoder evaluates each binary value and reconstructs the audio waveform; it is not much different from a connect-the-dots game. The device that performs this process is known as a codec (or sometimes as a quantizer). The name codec is derived from its function: coder (analog to PCM), decoder (PCM to analog). The CCITT (Consultative Committee on International Telephone and Telegraph) Recommendation G.711 established two "laws" (recommended standards) for this process. The North American standard for quantizing and decoding a signal's amplitude is µ-law, while in Europe A-law is used. These laws define how many quantizing levels are used and how they are arranged. For our discussion, we will focus on the µ-law of North America. More information on both laws can be found in Lee, W.C.Y., Mobile Communications Design Fundamentals, New York, NY: Wiley-Interscience.
As mentioned earlier in this module, the name codec is derived from its function: coder (analog to PCM), decoder (PCM to analog). The process is as follows. Coder: The analog signal is sampled 8000 times per second. These samples result in pulses, spaced at 125 microseconds, referred to as a PAM signal. An amplitude is identified for each pulse and an 8-bit word is created to represent each pulse. Example: one pulse = 42 V. 42 V in an 8-bit binary word = 10101010, where the 1st digit on the right = 1 (also called the Least Significant Bit, LSB), 2nd from the right = 2, 3rd = 4, 4th = 8, 5th = 16, 6th = 32, 7th = 64, and the 8th is the Most Significant Bit (MSB), used to denote polarity (1 = positive, 0 = negative). Therefore: 10101010 = 32 + 8 + 2 = 42. The decoder works in the same manner, but in reverse. NOTE: Some texts may show the bits in a layout reversed from what is shown here (i.e., MSB on the right, LSB on the left), usually when tracking the flow direction of bits.
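The sign-magnitude convention in the example above can be sketched in a few lines of Python. This illustrates the slide's 8-bit example only, not the actual G.711 µ-law codeword format:

```python
def encode_sample(value):
    """Encode an integer sample as a sign-magnitude 8-bit word, per the
    slide's convention: MSB = polarity (1 = positive, 0 = negative),
    lower 7 bits = magnitude with bit weights 1, 2, 4, ..., 64."""
    sign = 1 if value >= 0 else 0
    magnitude = abs(value) & 0x7F   # magnitude limited to 7 bits (0-127)
    return (sign << 7) | magnitude

# The slide's example: +42 -> 10101010 (MSB=1 for positive, 32+8+2 = 42)
print(format(encode_sample(42), '08b'))
```

Running this prints `10101010`, matching the worked example in the text.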
In a uniform PCM system the size of every quantization interval (the resolution) is determined by the Signal to Noise Quantizing Ratio (SNQR) requirement at the lowest signal level to be encoded. As the SNQR equation shows, SNQR increases with the signal amplitude A, producing unneeded quality for large signals; this in general makes uniform PCM inefficient. A more efficient method is seen on the next slide.
A more efficient coding procedure is achieved if the quantization intervals are not uniform but are allowed to increase with the sample value. This method is called companding. Companding (Compression – Expansion) is a PCM technique that compresses the signal using a mathematical algorithm before encoding. With this technique, fewer bits per sample provide a specified SNQR for small signals and an adequate dynamic range for large signals. To implement the technique, successively larger input signal intervals are compressed into constant-length quantization intervals; the larger the sample value, the more it is compressed before encoding. The process of first compressing and then expanding a signal is referred to as companding. The µ-law companding algorithm is used in North America and Japan, and the A-law algorithm everywhere else. The idea of companding is shown above. The graph indicates the compression that occurs, increasing for larger signals. The curve is approximated by chords of increasingly smaller slope for each equal quantization interval. Notice on the vertical axis that each interval is of equal value; this is the compressed signal. On the horizontal axis, the second interval is greater than the first by a factor of 2, the third by a factor of 4, and the fourth by a factor of 8. The slopes of the chords are identified relative to the first interval (slope = 1), falling to 1/8 (a ratio of 1 to 8) for the last interval.
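A minimal sketch of the µ-law curve itself (with the standard parameter µ = 255), assuming normalized samples in [-1, 1]. This is the continuous companding characteristic, not the segmented 8-bit G.711 codeword encoding:

```python
import math

MU = 255  # standard µ-law parameter (North America / Japan)

def mu_law_compress(x):
    """Map a normalized sample x in [-1, 1] through the µ-law compression curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse curve applied by the decoder (expansion)."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# Small signals are boosted before uniform quantization: a sample at only
# 1% of full scale lands at roughly 23% of the compressed range, so small
# signals get finer effective resolution after uniform quantization.
print(round(mu_law_compress(0.01), 3))
print(round(mu_law_expand(mu_law_compress(0.01)), 3))
```

Compressing, quantizing uniformly, then expanding is exactly the compress-then-expand cycle the text describes.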
Linear Predictive Coding (LPC) is a particular form of redundancy removal, in which a "linear predictor" is created to remove the redundancy due to vocal tract (or other signal-generation) effects. LPC is commonly used in speech coders because it tracks the redundancy introduced by the human vocal tract well. While there are various types of linear predictive coders, this discussion will center on the following three: CELP – Code-Excited Linear Predictive coding. VSELP – Vector-Sum Excited Linear Predictive coding. ACELP – Algebraic Code-Excited Linear Predictive coding. For a more complete analysis of Linear Predictive Coding, see Speech Compression at www.data-compression.com/speech.html.
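The idea of a linear predictor can be sketched numerically. The toy example below (a synthetic signal and a least-squares fit, rather than the autocorrelation/Levinson-Durbin method a real vocoder would use) shows the key property: the prediction residual carries much less energy than the signal itself, so it is cheaper to transmit:

```python
import numpy as np

# Synthetic "speech-like" signal: a sinusoid plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(400)
signal = np.sin(2 * np.pi * 0.03 * t) + 0.05 * rng.standard_normal(400)

p = 4  # predictor order: estimate each sample from the previous 4 samples
# Build the least-squares system: signal[n] ~ sum_k a[k] * signal[n-1-k]
X = np.column_stack([signal[p - 1 - k : -1 - k] for k in range(p)])
y = signal[p:]
a, *_ = np.linalg.lstsq(X, y, rcond=None)

residual = y - X @ a  # what the coder would actually have to transmit
print(residual.var() < 0.1 * y.var())  # residual energy is far smaller
```

In a real LPC vocoder the predictor coefficients are recomputed every frame and transmitted along with a compact description of the residual (the codebook index, in CELP-family coders).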
The PCM model works extremely well for wireline telecommunication. It even works very well for analog cellular: each channel's 64 kb/s signal on the T-1 was converted back to an analog signal at the cell site via a D/A converter and transmitted nicely as an FM signal on a 30 kHz radio channel. However, this soon proved to be very limiting for cellular in terms of the ability to serve a rapidly growing customer base. There is only so much of the frequency spectrum that could be set aside for cellular and PCS, and it was being consumed at a high rate. Everyone knew that digital was the answer, but the question was which digital methodology was the best one. After much research and many failures, it was decided to use a method centered around the use of code books, vocoders, and, specifically, linear predictive coders. In the 800 MHz cellular spectrum, this allowed taking the standard 64 kb/s channel that could only accommodate one analog conversation and using it for three separate digital conversations, which is a much more efficient use of the spectrum. The linear predictive coder model is based on the ability to remove various redundancies of speech and on the use of code books, as shown above. Consider this analogy: if you wanted to convey information that is contained on specific pages of a particular document, you could take the time to read those pages aloud to the designated recipient, or you could reference the location and let the recipient read those pages for himself. The latter would take much less time than the former. This is basically how code books are used. Sending the "addresses" for the speech patterns takes much less spectrum than sending the actual speech pattern. This is discussed in more detail in the following pages.
Channel vocoders were developed beginning in the late 1920s. In the channel vocoder, a bank of bandpass filters separates speech energy into subbands that are full-wave rectified and filtered to determine relative power levels. The resulting levels are encoded and transmitted to the destination. "Voiced or unvoiced" speech excitation and pitch are also measured and are used to synthesize speech in the decoder via a frequency-domain model of the vocal tract transfer function. Formant vocoders depend on the fact that speech energy tends to be concentrated at three or four peaks called formants. The formant vocoder determines the location and amplitude of the formants and transmits that information instead of the entire spectrum envelope. Thus a lower bit rate can be utilized. Linear predictive coders extract perceptually significant features of speech directly from a time waveform rather than from a frequency spectrum as the previous two types do. A time-varying model of the vocal tract excitation and transfer function is created, and a synthesizer in the receiving terminal recreates the speech by passing the specified excitation through a mathematical model of the vocal tract. It is the linear predictive coder that we will focus on in this module.
The speech coding algorithm described in IS-136, the most current standard in use, is a member of a class of speech codecs known as Code-Excited Linear Predictive coding (CELP). These coders use "codebooks" to vector quantize the excitation (residual) signal. The speech coding algorithm is a variation on CELP called Vector-Sum Excited Linear Predictive coding (VSELP). VSELP uses a codebook with a predefined structure such that the computations required for the codebook search process can be reduced. The VSELP speech coder, which is defined in detail in IS-136A.2, has a bit rate of 7950 bps. The coder breaks speech into frames; each frame is 20 ms long and contains 160 samples. Each frame is further divided into subframes 40 samples (5 ms) long. At the base station end, speech is already digitized, coming from the Public Switched Telephone Network (PSTN). At the mobile station, however, analog speech must be converted to uniform PCM. Preceding the speech coder are voice processing stages: level adjustment, bandpass filtering, and analog-to-digital (A/D) conversion. Note in the diagram above that the first part of the decoder generates pulse excitation and the second part synthesizes speech waveforms. The various parameters shown are transmitted for each 20 ms frame. Recall that the whole reason for this approach is that the parameter transmission requires less bandwidth than if PCM were transmitted.
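The frame arithmetic above is easy to verify (an 8000 Hz sampling rate for the uniform PCM input is assumed, per the Nyquist discussion earlier):

```python
SAMPLE_RATE = 8000   # Hz, uniform PCM input to the coder
FRAME_MS = 20        # frame length
SUBFRAME_MS = 5      # subframe length
CODEC_RATE = 7950    # bps, VSELP basic data rate (IS-136)

samples_per_frame = SAMPLE_RATE * FRAME_MS // 1000        # samples per 20 ms frame
samples_per_subframe = SAMPLE_RATE * SUBFRAME_MS // 1000  # samples per 5 ms subframe
subframes_per_frame = FRAME_MS // SUBFRAME_MS
bits_per_frame = CODEC_RATE * FRAME_MS // 1000            # coded bits per frame

print(samples_per_frame, samples_per_subframe, subframes_per_frame, bits_per_frame)
```

This reproduces the 160 samples, 40-sample subframes, and 159 coded bits per frame quoted for the VSELP codec.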
Delays inherent in the air interface specification may exceed 100 ms, so echo cancellation is necessary in the design of IS-136 systems. The function of the bandpass filter mentioned previously is to avoid aliasing distortion of the input signal. The attenuation of the filter complies with ITU Red Book G.714 specifications. The A/D function produces digitized audio either by direct conversion of analog to a uniform PCM format with a minimum resolution of 13 bits, or by converting analog to an 8-bit µ-law format followed by a µ-law-to-uniform code conversion. The A/D conversion is based on the standard 8-bit µ-law codec specified in ITU Red Book G.711. The mobile station echo return loss must have a minimum value of 45 dB. This requirement must be met by all types of mobile stations at their nominal volume setting.
The VSELP algorithm has served North American cellular for several years, but it suffers from some deficiencies. In the presence of interference, artifacts occur that distort the reproduced speech. IS-54 (the forerunner of IS-136) designers found that the C/I ratio required for adequate speech is at least that of AMPS, which is to say, 18 dB of separation. IS-136 includes provision for the phones and base stations to recognize the existence of other vocoder types. The first of these will be that specified in IS-641, the ACELP vocoder (Algebraic Code-Excited Linear Prediction). While still a nominally 8 kb/s vocoder like VSELP, ACELP was designed to be more robust in the presence of co-channel interference. Tests demonstrated at recent CTIA meetings have shown considerable improvement with ACELP implementation, and new phones are incorporating it in their designs. IS-641 is a North American standard for encoding 64 kb/s µ-law/A-law sampled speech signals down to 7.4 kb/s for transmission over IS-136A digital cellular channels. IS-641 provides approximately 4 kHz of speech bandwidth and has an algorithmic delay of 25 ms. It encodes frames of 160 linear PCM samples into 148 bits, which are packed into ten 16-bit code words. A good description of an ACELP implementation is contained in a paper by Salami et al., "A Toll Quality 8 kb/s Speech Codec for the Personal Communications System (PCS)," in the August 1994 IEEE Transactions on Vehicular Technology.
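The IS-641 numbers above can be cross-checked with simple arithmetic:

```python
FRAME_BITS = 148   # encoded bits per 20 ms speech frame (IS-641)
FRAME_MS = 20
WORDS = 10         # 16-bit code words used to carry each frame

bit_rate = FRAME_BITS * 1000 // FRAME_MS  # bits per second
packed_bits = WORDS * 16                  # transport bits per frame (>= 148)

print(bit_rate, packed_bits)
```

148 bits every 20 ms works out to the quoted 7.4 kb/s, and ten 16-bit words provide 160 bits of transport per frame, enough to hold the 148 coded bits.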
The channel error control for VSELP in IS-136 employs three schemes to protect against channel errors. The first is to use a rate one-half convolutional code to protect the more vulnerable bits of the speech codec data stream. The second technique interleaves the transmitted data for each speech codec frame over two time slots to protect against Rayleigh fading. The third technique employs the use of a cyclic redundancy check (CRC) over some of the most perceptually significant bits of the speech codec output. After the error correction is applied at the receiver, these cyclic redundancy bits are checked to see if the most perceptually significant bits were received properly.
The first step in the error correction process is the separation of the 159-bit speech codec frame's information into class 1 and class 2 bits. There are 77 class 1 bits and 82 class 2 bits in the 159-bit speech codec frame. Convolutional coding is applied to the 77 class 1 bits. A 7-bit CRC is used for error detection purposes and is computed over the 12 most perceptually significant bits of the class 1 bits for each frame. Class 2 bits are transmitted without any error protection.
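The bit accounting above can be checked with simple arithmetic. The 89-bit coder input and 260-bit interleaved total below are derived from the figures given in the text (77 class 1 bits, 7 CRC bits, 5 tail bits, rate-1/2 coding, 82 unprotected class 2 bits):

```python
CLASS1_BITS = 77
CLASS2_BITS = 82
CRC_BITS = 7
TAIL_BITS = 5

frame_bits = CLASS1_BITS + CLASS2_BITS            # speech codec frame size
coder_input = CLASS1_BITS + CRC_BITS + TAIL_BITS  # bits into the rate-1/2 coder
coded_bits = 2 * coder_input                      # bits out of the coder
total_bits = coded_bits + CLASS2_BITS             # protected + unprotected bits

print(frame_bits, coder_input, coded_bits, total_bits)
```

The 89-bit coder input matches the figure given on the next page, and the 260-bit total is consistent with the interleaving array positions (0 through 259) described later.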
Of the 89 bits input to the convolutional coder, 77 are class 1 bits. The remaining bits are the 7-bit CRC for the frame and 5 tail bits filled with zeros. The bits are rearranged in a specific array as defined in the specification. The convolutional encoding uses a rate 1/2, memory order 5 code (R = 1/2, m = 5). With five memory elements, there are 32 states in this code. The notation for the generator polynomials, g0(D) and g1(D), that follow is defined by Shu Lin and Daniel Costello in Error Control Coding: Fundamentals and Applications, Prentice-Hall, April 1983, on page 330. The polynomials are defined as: g0(D) = 1 + D + D^3 + D^5 and g1(D) = 1 + D^2 + D^3 + D^4 + D^5. The output from the convolutional coder alternates between these two polynomials, starting with g0(D) in each time slot. The free coefficient in the above equations is the most significant bit (MSB). Initially the encoder's memory elements are cleared. The encoder starts at state 0, and the bits in the class 1 buffer are read in starting at 0 through 88. The outputs from g0(D) and g1(D) are referred to as cc0[i] and cc1[i], respectively. For each input bit, CL1[i], the two output bits, cc0[i] and cc1[i], are produced. The order i in which the bits are placed into CL1[i] is indicated in a table in the specification.
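The encoder described above can be sketched directly from the generator polynomials. This is a minimal illustration; the actual ordering of bits within the class 1 buffer follows the standard's tables, which are not reproduced here:

```python
def convolutional_encode(bits):
    """Rate-1/2, memory-order-5 convolutional encoder using the generator
    polynomials g0(D) = 1 + D + D^3 + D^5 and g1(D) = 1 + D^2 + D^3 + D^4 + D^5.
    The encoder memory starts cleared (state 0); for each input bit the
    g0 output is emitted first, then the g1 output."""
    g0_taps = (0, 1, 3, 5)      # powers of D with nonzero coefficients in g0
    g1_taps = (0, 2, 3, 4, 5)   # powers of D with nonzero coefficients in g1
    memory = [0] * 5            # memory[k-1] holds the input delayed by k
    out = []
    for u in bits:
        window = [u] + memory   # window[d] = input delayed by d samples
        cc0 = 0
        for d in g0_taps:
            cc0 ^= window[d]
        cc1 = 0
        for d in g1_taps:
            cc1 ^= window[d]
        out.extend([cc0, cc1])
        memory = [u] + memory[:-1]   # shift the register
    return out

# Feeding a single 1 followed by zeros reads out the polynomial coefficients:
# the cc0 stream is 1,1,0,1,0,1 (g0) and the cc1 stream is 1,0,1,1,1,1 (g1).
print(convolutional_encode([1, 0, 0, 0, 0, 0]))
```

Encoding the full 89-bit buffer produces 178 output bits, consistent with the rate-1/2 code.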
Before transmission, VSELP encoded speech data is interleaved over two time slots with the speech data from adjacent speech frames. Each time slot thus contains information from two speech codec frames. The speech data is placed into a rectangular interleaving array as shown above, entered into the array column-wise. The two speech frames are referred to as x and y, where x is the previous speech frame and y is the present or most recent speech frame. The data, which can have encryption applied or be plain text, is placed into the interleaving array in a way that intermixes the class 2 bits from the speech codec with the convolutionally coded class 1 bits. The class 2 bits are sequentially placed into the array and occupy the following numbered locations in the interleaving array: 0, 26, 52, 78; 93 through 129; 130, 156, 182, 208; and 223 through 259. The class 1 bits occupy the rest of the interleaving array and are also sequentially placed into the array.
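The column-wise fill pattern can be sketched with a generic rectangular interleaver. The dimensions below are illustrative only; the actual IS-136 array size and the class 1/class 2 placement follow the standard's tables:

```python
def interleave(bits, rows, cols):
    """Generic rectangular interleaver: write the bits down the columns,
    then read them out across the rows. Adjacent input bits end up
    separated by `cols` positions in the output, which spreads a burst
    of channel errors across the frame."""
    assert len(bits) == rows * cols
    array = [[None] * cols for _ in range(rows)]
    for i, b in enumerate(bits):
        array[i % rows][i // rows] = b   # fill column-wise
    return [b for row in array for b in row]  # read row-wise

# Tiny example: 6 bits in a 3x2 array.
print(interleave([0, 1, 2, 3, 4, 5], 3, 2))
```

Because consecutive bits are transmitted far apart, a fade that wipes out a run of transmitted bits damages only scattered bits of each speech frame, which the convolutional code can then correct.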
Summarizing, a codebook vocoder stores a collection of arbitrary waveform segments (a set of digitized vocal patterns) in digital form. The QCELP (Qualcomm Code-Excited Linear Predictive) vocoder, shown above, is technically not part of this module; however, it is another vocoder that uses basically the same codebook approach as all the others. It is the one used by CDMA, so it is worth mentioning here. For VSELP, within a 20 ms sample time, the vocoder, through approximation based upon previous samples, approximates as closely as possible a code representation of the sampled signal. The pitch filter models the periodic pulse train coming from the vocal cords during voiced speech. A formant filter models the characteristics of the vocal tract; it has resonant frequencies near those of the original speech caused by vocal tract filtering. The vocoder sends instructions for looking up patterns to the receiver, causing the remote vocoder to recreate the speech.
The codebook is a real, two-volume book that contains samplings of speech patterns. These patterns have been cataloged and are used in the coding and decoding (VSELP, ACELP) of the PCM signal. Random samples were made by interviewing and recording thousands of people from different locations saying the same phrases. The idea was based on the concept that all speech is made up of sound bites such as: Th, St, Ku, Qu, Wa, To, etc. There are literally thousands upon thousands of different sounds that are used to make words. If these sounds could be captured and cataloged, then all we would have to do is reference the catalog and recreate the same sequence of sounds at the other end, and we should have the same words appear.
Each volume of the codebook contains hundreds of pages, and each page contains hundreds of line items. This works much like an instructor referencing a textbook that every student has: the instructor can tell the students to go to chapter 5, page 10, paragraph 4, sentence 3, and assuming they all have the same edition of the book, they will all be reading exactly the same thing. At the transmitting end, the vocoder follows this process: Strip off the first 156 bits of the PCM signal as it enters the vocoder. Identify the location, in the codebook, of the bit pattern that most closely resembles the stripped-off bits. Encode the address of each line in 8-bit words using VSELP or ACELP. Send via the chosen modulation method. Each 64 kb/s channel is now represented by 7.4 kb/s. At the receiving end, the vocoder follows this process: Receive the 7.4 kb/s encoded data. Read the addresses. Go to the correct volume and locate the identified line item. Reassemble a 64 kb/s PCM bit pattern. Convert to an analog signal for the human ear.
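The codebook-address idea above can be sketched as a toy vector quantizer. The codebook contents here are random placeholders, not real cataloged speech patterns:

```python
import numpy as np

rng = np.random.default_rng(1)
# A toy "codebook": 256 stored patterns of 40 samples each, so a single
# 8-bit address stands in for a whole 40-sample segment.
codebook = rng.standard_normal((256, 40))

def encode(segment):
    """Return the address (index) of the codebook entry closest to the segment."""
    errors = ((codebook - segment) ** 2).sum(axis=1)
    return int(np.argmin(errors))

def decode(index):
    """The receiver looks the pattern back up by its address."""
    return codebook[index]

# A slightly noisy version of a stored pattern still maps back to its address.
idx = encode(decode(42) + 0.01 * rng.standard_normal(40))
print(idx)
```

Sending one byte per 40-sample segment instead of the samples themselves is the compression win the text describes; real CELP-family coders also transmit gains and filter parameters alongside the index.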
At this point, there is still one question remaining: if we can represent one 64 kb/s conversation with a 7.4 kb/s signal, then why can we only get three conversations into the 64 kb/s channel? To explain this, we must first go back to the analog transmitted signal. In analog, it is not necessary for the human ear to hear every single piece of the signal, because our minds fill in the blanks. In cellular we make use of this phenomenon to send control information back and forth between the cell phone and the cell site, using a method called blank and burst: we literally blank out the conversation, extremely briefly, and burst in the control messages. This period is so short that, in most cases, it is never detected by the human ear. Digital, however, is much more sensitive and cannot handle even these momentary lapses. Digital also requires a great deal more control time than analog. Therefore, the control, or overhead, messages are incorporated into the signal itself, consuming approximately 8.8 kb/s. When added to the 7.4 kb/s, this gives us 16.2 kb/s per conversation. That means that in a 64 kb/s channel, we can fit three conversations, but not quite four.
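The channel arithmetic above works out as follows (rates taken from the text):

```python
VOCODER_KBPS = 7.4    # ACELP speech payload
OVERHEAD_KBPS = 8.8   # approximate control/overhead per conversation
CHANNEL_KBPS = 64.0   # one PCM channel

per_conversation = VOCODER_KBPS + OVERHEAD_KBPS        # 16.2 kb/s each
conversations = int(CHANNEL_KBPS // per_conversation)  # whole conversations that fit

print(round(per_conversation, 1), conversations)
```

Three conversations fit (48.6 kb/s used), but a fourth would need 64.8 kb/s, just over the channel capacity.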
In the slide above, this path is followed: The telephone-to-PSTN link is analog. The PSTN converts the analog voice to Pulse Code Modulation (PCM) using an A/D converter. The PSTN sends the 64 kb/s PCM signal to the MSC on a T-1 (it could also use microwave); each T-1 channel carries one conversation, and there are 24 channels per T-1. At the MSC, the 64 kb/s signal is encoded using the VSELP or ACELP vocoder and sent to the cell site; now there are three conversations in each T-1 channel. At the cell site, the encoded message is transmitted to the appropriate cell phone. At the cell phone, the message is decoded back to PCM, then converted back to analog. From the cell phone back to the land phone, the process is simply reversed.