DSP Algorithms for the Dual-tone Digit Receiver
Renshou Dai
July 24, 1998
Abstract
This document describes the algorithms and code structures of the various digit
receivers that have been built for the PDS, ATME and cellular responders. The generic
digit receiver built for PDS is designed for fast and reliable detection of any type of dual-tone
digit (within 8 ms) and quick analysis of the signal for accurate measurement of
frequencies and levels (within 16 ms) using a corrected Fourier transform method. The
MF digit receiver built for ATME is a special case of the generic digit receiver, in
which the special characteristic of MF frequencies in relation to the sampling rate has
been taken advantage of, and the computational cost is significantly reduced but with
additional caution paid to the echo canceller’s effect. The MF/DTMF digit receiver
built for cellular responders is another special case, in which the computational cost
is slightly reduced, but with far more reliability-strengthening measures being taken
to counter the prevalent signal distortion caused by vocoders. This document
serves as a reference for future maintenance and improvement on the digit receiver
module. It is also a reference for those who would like to know the DSP algorithm
details.
1 Introduction
Successful detection of dual-tone digits is crucial to many test functions implemented on
Sage products. For example, in ATME [1, 2] and ROTL tests, the responder and director
communicate with each other through MF/DTMF digits. In PDS (protocol-development-
system) [3], detecting, analyzing and interpreting the dual-tone digits are the essential task
of the system. In fact, the digit receiver is one of the most fundamental and most frequently
used low-level DSP modules. It differs from many other low-level measurement
modules (such as single-tone or noise measurement) in one important respect. When one of
those measurement modules does not work properly,
it merely gives inaccurate results, which may or may not be noticed depending on how
'sophisticated' the customer is; it will not stop the test. A failure of the digit receiver,
however, is often fatal: once it fails, the whole test fails (stops). For
this reason, I personally think reliability should be the top priority when designing a digit
receiver. But previous Sage engineers obviously did not adhere to this principle [4]. The
old digit receiver (in file DGR.C), for example, has very low reliability as exemplified by the
case of ATME test on 930s and ROTL tests on cellular responders. In the ATME case, the
echo cancellers in the network distorted the digit signal. In the cellular responder case, the
vocoders distorted the digit signal. In both cases, the old digit receiver failed to detect the
digit from the distorted signal, which is not surprising after considering its algorithm. For
the PDS system, the old digit receiver took over 55ms to recognize a digit, which is too slow
for some applications such as the R2 compelled digit sequence sending and receiving.
A conventional digit receiver consists of an envelope (power) detector and a bank of
band-pass filters [5]. The envelope detector detects the onset and departure of the digit signal.
The band-pass filters determine which two frequencies the digit signal is actually composed
of. This is also the idea used in the old digit receiver code [4], except that the band-pass
filters are replaced with a 256-point FFT (which is technically equivalent to a set of 128 FIR
band-pass filters). More precise frequency information is determined by interpolation based
on a design idea shown in [6]. With SPOX math, FFT is a simple function call.
Although simple, general and intuitive, this 'conventional' algorithm [4] lacks reliability,
efficiency and speed. Firstly, with this algorithm, the digit's onset and end are determined solely by
the signal power (envelope) level. This will be problematic if the channel has high attenuation,
or if the channel is very noisy (high-power ambient noise), or if there are some interfering
tones (as in the ATME case), or if the digit envelope is unsteady (as through vocoders). All these
scenarios can either mis-trigger the digit receiver, or cause the digit receiver to miss a digit
or doubly detect a single digit. Secondly, the FFT itself is also problematic when there is
high distortion on the signal (as in the vocoder case) or when there are interfering tones (in
ATME case). Vocoder’s distortion on the signal can cause the FFT algorithm to ’over-detect’
the spectral bins, that is, a single tone signal is detected as a multi-tone signal due to the
vocoder’s spectral distortion and inappropriate spectral sorting used along with the FFT
algorithm. Thirdly, the frequency interpolation idea itself is also questionable. McKee’s [6]
idea only applies to the single tone signal. For dual-tone digit signal, the interpolation
formula is inaccurate due to the dual-tone’s mutual interference. Although it’s unlikely that
this inaccuracy will cause the failure of the digit receiver, it does affect the signal analysis
accuracy. The last problem with this algorithm is its long inherent delay. With a 256-point
FFT, this algorithm has to wait until all 256 data points (32 ms at 8 kHz sampling rate)
become available.
Prompted by all these problems, I redesigned the digit receiver algorithms, and rewrote
four sets of digit receivers for four different applications. The new digit receivers emphasize
reliability and computational efficiency. The algorithm is a natural optimal solution to
the dual-tone detection problem, which essentially becomes a ’corrected’ Fourier transform
method. This algorithm optimally extracts crucial information from a short segment of data
by automatically deconvolving out the windowing effect caused by the finite data length. In
fact, this corrected Fourier transform method not only applies to dual-tone digit detection,
it also applies to multi-tone signaling detection and PSK (phase-shift-keying) demodulation
(optimal coherent phase detector).
In the following sections, first I lay out the essential mathematical principles of the
algorithm. From here, we come up with a very generic dual-tone digit receiver and signal
analyzer, which can replace the old digit receiver designed by previous Sage engineer(s).
This new digit receiver works much faster and more reliably. This has been implemented in
the current PDS system (930i). For the special case of MF and DTMF digits, the algorithm
can be further simplified, as in the ATME case and the cellular responder case, which are
explained separately following the generic digit receiver.
2 The Mathematical Principles
2.1 The Optimal Digit Detection Criterion
What is the optimal criterion to decide if the incoming signal contains a valid dual-tone digit?
The conventional band-pass filtering approach bases the decision on the absolute power level
of each single tone component. This approach will fail if the digit signal experiences high
attenuation or if the power of some non-digit ’noise’ signal is so high that it’s mistaken for
a digit. As in all optimal filter design, the optimal criterion should be based solely on the
signal-to-noise ratio, not on any absolute power level. To be more specific, we examine a
sampled dual-tone digit signal with additive noise:
$$
\begin{aligned}
x(n) &= A_1\cos(n\omega_1+\phi_1) + A_2\cos(n\omega_2+\phi_2) + w(n) \\
     &= a_1\cos(n\omega_1) + b_1\sin(n\omega_1) + a_2\cos(n\omega_2) + b_2\sin(n\omega_2) + w(n)
\end{aligned}
\tag{1}
$$
where w(n) is the noise term and ω1,2 are the two digital (normalized) angular frequencies.
The optimal criterion to qualify x(n) as a valid dual-tone digit signal should be based on
how much of the signal energy of x(n) is concentrated on the two frequency components. In
other words, the following energy ratio is the optimal criterion for an arbitrary data length
of N:
$$
R = \frac{\sum_{n=0}^{N-1}\left[A_1\cos(n\omega_1+\phi_1) + A_2\cos(n\omega_2+\phi_2)\right]^2}{\sum_{n=0}^{N-1}x^2(n)}
\tag{2}
$$
Assuming the noise w(n) and the dual-tone components are uncorrelated within data length
of N, then the ratio R has a simple relation with the signal-to-noise ratio:
$$
R = \frac{SNR}{1 + SNR}
\tag{3}
$$
where SNR is the signal-to-noise ratio in linear scale. For a noise-free dual-tone digit signal,
the ratio is R = 1. To tolerate some noise, we can set the ratio threshold at, say, 0.9 or
0.8. Using the ratio defined above as a criterion is simple and optimal, and it eliminates
the ’awkward’ spectral spikiness test as used in the previous digit receiver code ’DGR.C’ [4].
As shown in equation 3, the ratio is solely dependent on the signal-to-noise ratio, not on
any absolute power level. It applies to the signal at any power level as long as the signal
dominates over the noise (otherwise, we have to say it is noise).
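To make the meaning of a ratio threshold concrete, equation 3 can be inverted to give the minimum signal-to-noise ratio that a given threshold implies. The short sketch below is only an illustration; the function names are hypothetical and are not taken from the digit receiver code.

```python
import math

def ratio_to_snr_db(r):
    """Invert R = SNR / (1 + SNR) (equation 3) and return the SNR in dB."""
    if not 0.0 < r < 1.0:
        raise ValueError("R must lie strictly between 0 and 1")
    snr = r / (1.0 - r)              # linear-scale SNR
    return 10.0 * math.log10(snr)

def snr_db_to_ratio(snr_db):
    """Energy ratio R of equation 3 for an SNR given in dB."""
    snr = 10.0 ** (snr_db / 10.0)
    return snr / (1.0 + snr)

if __name__ == "__main__":
    for r in (0.8, 0.9, 0.95):
        print(f"R threshold {r:.2f} -> minimum SNR {ratio_to_snr_db(r):.2f} dB")
```

A threshold of 0.9, for instance, corresponds to a linear SNR of 9, that is, about 9.5 dB.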
To discriminate against the single-tone signal with frequency ω1 or ω2, we need another
twist ratio to further confirm if this is a valid dual-tone digit. This twist ratio is:
$$
TR = \frac{\max(A_1^2,\, A_2^2)}{\min(A_1^2,\, A_2^2)}
   = \frac{\max(a_1^2 + b_1^2,\; a_2^2 + b_2^2)}{\min(a_1^2 + b_1^2,\; a_2^2 + b_2^2)}
\tag{4}
$$
This twist ratio helps discriminate against the single-tone non-digit signal (set TR less than
5, for instance).
In conclusion, the digit detection problem becomes the following hypothesis testing:
$$
\text{Digit Detected} =
\begin{cases}
\text{TRUE}, & \text{if } R > R_{thrsh} \text{ and } TR < TR_{thrsh} \\
\text{FALSE}, & \text{otherwise}
\end{cases}
\tag{5}
$$
where Rthrsh is the threshold value for ratio R and TRthrsh is the threshold value for
the twist ratio TR. Typical value of Rthrsh is between 0.7 and 0.95, and typical value for
TRthrsh is between 5 and 25.
The next question is of course, how to find R and TR. In the following, we compare two
approaches, and choose the one that can optimally separate the noise term w(n) from the
dual-tone components in equation 1.
2.2 The Fourier Transform Approach
For a buffer of data x(n), n = 0, 1, . . . , N − 1, its discrete Fourier transform is defined as:
$$
X(\omega) = \sum_{n=0}^{N-1} x(n)\,e^{-jn\omega}
\tag{6}
$$
When calculated at a specific frequency point ω0, the essence of equation 6 is to band-pass
the signal x(n) through an FIR band-pass filter defined by exp(−jω0n). When N is a power of two
and the transform is calculated at the frequency points of ωk = 2πk/N, k = 0, 1, . . . , N − 1,
equation 6 can be implemented through FFT after taking advantage of the periodicity and
symmetry of the function exp(−jnωk).
For any single-tone or multi-tone detection task, FFT is not an efficient approach since
the frequency points dictated by FFT are usually different from the frequency points we
desire. To detect an MF digit, for example, a 256-point (as in [4]) FFT calculates the
Fourier transform at 128 frequencies fk = 8000k/256 = 31.25k Hz, k = 0, 1, . . . , 127. Of the
six desired frequency points (700, 900, . . . , 1700 Hz), only 1500 Hz (k = 48) coincides with one of
these frequency points. The other frequencies have to be found by interpolation. For the task of digit detection, this whole
process is a waste of computation time. First, we do not need 256 data points to detect
a digit. Second, we only need to calculate the Fourier transform at six desired frequency
points, instead of on all the 128 points.
In short, a more efficient digit receiver is to analyze the incoming signal at targeted
frequency points with regular Fourier transform, instead of indiscriminately using FFT. To
detect a digit, we only need the power level at the targeted frequencies. The initial phase
information of the incoming data is irrelevant. That is to say, in equation 6, we only need
|X(ω)|, not the complex X(ω) itself. Having this in mind, the computation load of equation 6
can be reduced by half through the following change (assume N is even):
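$$
\begin{aligned}
|X(\omega)| &= \left| e^{-j\frac{N-1}{2}\omega} \sum_{n=0}^{N-1} x(n)\, e^{-j\left(n-\frac{N-1}{2}\right)\omega} \right| \\
&= \left| \sum_{n=0}^{N/2-1} \Big[ \big(x(N/2-n-1) + x(N/2+n)\big)\cos\big((n+\tfrac{1}{2})\omega\big)
 + j\,\big(x(N/2-n-1) - x(N/2+n)\big)\sin\big((n+\tfrac{1}{2})\omega\big) \Big] \right|
\end{aligned}
\tag{7}
$$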
[Figure 1 (plot not reproduced): two panels titled "Regular Fourier Transform" and "Corrected Fourier Transform"; horizontal axis: frequency in Hz, 1300 to 2000.]
Figure 1: Top: Spectrum of an R2 forward digit signal through regular Fourier
transform. Bottom: Same spectrum after using corrected Fourier transform.
The 'odd-looking' indices inside the cos() and sin() functions make these two functions
even and odd symmetric, respectively, about the central point of the data, thereby reducing
the number of multiplication operations by half as shown in equation 7. In fact, this 'minor'
change also greatly reduces the computational load for the corrected Fourier transform, as
will be explained later.
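The folding in equation 7 can be checked numerically against the direct definition in equation 6. The sketch below is an illustration only: the function names, the test signal and its phases are assumptions made for the example, not part of the original code. It evaluates both forms at the six R2 forward frequencies for a 32-point buffer.

```python
import cmath
import math

def dft_magnitude(x, omega):
    """|X(omega)| from the direct definition (equation 6): N complex products."""
    return abs(sum(xn * cmath.exp(-1j * n * omega) for n, xn in enumerate(x)))

def folded_magnitude(x, omega):
    """|X(omega)| from the folded form (equation 7); len(x) must be even."""
    half = len(x) // 2
    re = im = 0.0
    for n in range(half):
        k = n + 0.5
        lo, hi = x[half - 1 - n], x[half + n]   # samples mirrored about the centre
        re += (lo + hi) * math.cos(k * omega)   # one multiplication per sample pair
        im += (lo - hi) * math.sin(k * omega)   # one multiplication per sample pair
    return math.hypot(re, im)

fs = 8000.0   # sampling rate in Hz
N = 32        # buffer length (4 ms)
# Hypothetical R2 forward digit: 1500 Hz and 1740 Hz with arbitrary phases.
x = [math.cos(2 * math.pi * 1500 * n / fs + 0.4)
     + 0.7 * math.cos(2 * math.pi * 1740 * n / fs - 1.0) for n in range(N)]

if __name__ == "__main__":
    for f in (1380, 1500, 1620, 1740, 1860, 1980):
        w = 2 * math.pi * f / fs
        print(f, round(dft_magnitude(x, w), 6), round(folded_magnitude(x, w), 6))
```

Both columns agree to rounding error, while the folded form needs N real multiplications per frequency instead of 2N.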
For an arbitrary data length N, the regular Fourier transform approach cannot provide
accurate estimate of the two ratios R and TR we are looking for, because this approach is
unable to completely separate the dual-tone components from the noise term in equation 1
due to the spectral leakage problem. An example is shown in Figure 1, where a 32-point
Fourier transform is performed over an R2 forward digit signal at six targeted R2 frequencies.
The two correct frequencies are 1500Hz and 1740Hz. The top part of the figure shows the
results obtained through equation 7. It is obvious that the regular Fourier transform without
correction cannot resolve these two ’closely-spaced’ bins with 32 data points. The lower
figure shows the perfect result through the process of corrected Fourier transform, which is
explained in the next section.
2.3 Introduce Corrected Fourier Transform
To correctly estimate the two ratios R and TR with arbitrary data length, we need to correct
for the spectral leakage caused by the finite data-length (windowing) effect. The corrected
Fourier transform described below is designed toward this goal. In fact, this method not only
applies to dual-tone digit detection, it also applies to any other type of single or multi-tone
detection and PSK demodulation problems using an arbitrarily short segment of data. To
make the analysis more manageable, we use the matrix and vector notations from here on.
First, in connection with equation 7 we define an incoming data vector:
$$
x = [x(N/2-1), \ldots, x(0), x(N/2), \ldots, x(N-1)]^T
\tag{8}
$$
The strange index ordering has to do with equation 7. It has another advantage in making
the matrix $H^T H$ symmetric and block-diagonal, which will soon be clear. We also define a parameter
vector:
$$
a = [a_1, a_2, b_1, b_2]^T
\tag{9}
$$
And, in connection with equations 7 and 9, we define an N × 4 matrix:
$$
H = \begin{bmatrix} C_{N/2\times 2} & S_{N/2\times 2} \\ C_{N/2\times 2} & -S_{N/2\times 2} \end{bmatrix}
\tag{10}
$$
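where the C matrix is:
$$
C = \begin{bmatrix}
\cos(\tfrac{1}{2}\omega_1) & \cos(\tfrac{1}{2}\omega_2) \\
\cos(\tfrac{3}{2}\omega_1) & \cos(\tfrac{3}{2}\omega_2) \\
\vdots & \vdots \\
\cos(\tfrac{N-1}{2}\omega_1) & \cos(\tfrac{N-1}{2}\omega_2)
\end{bmatrix}
\tag{11}
$$
and likewise, the S matrix is:
$$
S = \begin{bmatrix}
\sin(\tfrac{1}{2}\omega_1) & \sin(\tfrac{1}{2}\omega_2) \\
\sin(\tfrac{3}{2}\omega_1) & \sin(\tfrac{3}{2}\omega_2) \\
\vdots & \vdots \\
\sin(\tfrac{N-1}{2}\omega_1) & \sin(\tfrac{N-1}{2}\omega_2)
\end{bmatrix}
\tag{12}
$$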
Then for every incoming data vector, we have an error vector (which is equal to the noise
term w(n) in equation 1):
e = Ha − x (13)
The parameters a are obtained by minimizing the following total error:
$$
\min_{a_1, a_2, b_1, b_2} J = e^T e = a^T H^T H a - 2 x^T H a + x^T x
\tag{14}
$$
After minimization, the optimal solution of the vector a is:
$$
a = (H^T H)^{-1} H^T x
\tag{15}
$$
Substituting equation 15 back to equation 14, the total minimized error becomes:
$$
J = x^T x - x^T H (H^T H)^{-1} H^T x
\tag{16}
$$
In equation 16, notice that the first term $x^T x$ is the total energy of the incoming data, and
the second term $x^T H (H^T H)^{-1} H^T x$ is the energy concentrated on the two target frequencies
$\omega_{1,2}$. So the ratio R defined in equation 2 can be calculated as:
$$
R = \frac{x^T H (H^T H)^{-1} H^T x}{x^T x}
\tag{17}
$$
The twist ratio can be easily obtained after calculating $A_{1,2}^2 = a_{1,2}^2 + b_{1,2}^2$.
In fact, equation 15 deserves more comment. The solution $a = (H^T H)^{-1} H^T x$ contains
two parts. The $H^T x$ part is a regular Fourier transform that is identical to equation 7.
The other part, $(H^T H)^{-1}$, is a correction term that deconvolves out the spectral leakage
caused by finite data length (the windowing effect). This correction term re-concentrates
the spectral energy back to two bins, whereas the regular Fourier transform spreads the single
bin energy out onto neighboring bins (see Figure 1). Unlike those obtained by 'empirical'
curve fitting [6], this correction term is a natural solution to the optimization problem.
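A direct, unoptimized rendering of equations 8 to 17 may help to see the correction at work. The sketch below is an assumption-laden illustration rather than the DGR2.C implementation: the helper names, the general-purpose 4 × 4 solver and the test signal are all hypothetical. It builds H with the index ordering of equation 8, solves the normal equations, and returns R together with the two squared amplitudes.

```python
import math

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def corrected_ft(x, w1, w2):
    """Return (R, A1^2, A2^2) for target frequencies w1, w2 (equations 15, 17)."""
    N = len(x)
    half = N // 2
    # Data vector in the order of equation 8: first half reversed, then second half.
    xv = [x[half - 1 - n] for n in range(half)] + [x[half + n] for n in range(half)]
    H = []
    for sign in (1.0, -1.0):                    # H = [C S; C -S], equation 10
        for n in range(half):
            k = n + 0.5
            H.append([math.cos(k * w1), math.cos(k * w2),
                      sign * math.sin(k * w1), sign * math.sin(k * w2)])
    y = [sum(H[r][c] * xv[r] for r in range(N)) for c in range(4)]        # H^T x
    G = [[sum(H[r][i] * H[r][j] for r in range(N)) for j in range(4)]
         for i in range(4)]                                               # H^T H
    a = solve(G, y)                                                       # equation 15
    energy = sum(v * v for v in xv)
    R = sum(yi * ai for yi, ai in zip(y, a)) / energy                     # equation 17
    return R, a[0] ** 2 + a[2] ** 2, a[1] ** 2 + a[3] ** 2

fs = 8000.0
N = 32
w1, w2 = 2 * math.pi * 1500 / fs, 2 * math.pi * 1740 / fs
x = [math.cos(w1 * n + 0.4) + 0.7 * math.cos(w2 * n - 1.0) for n in range(N)]

if __name__ == "__main__":
    print(corrected_ft(x, w1, w2))
```

For this noise-free signal the fit is exact, so R comes out at 1 and the squared amplitudes at 1 and 0.49 to within rounding error, even though 32 samples are too few for the regular transform to separate the two tones.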
2.4 Estimate the Frequencies and Levels
On deriving the above equations, we have assumed the incoming digit signal has correct
frequency values conforming to the standard frequency table. That is, for MF digit, for
example, the frequencies are strictly two of the six frequencies of 700, 900, . . . , 1700Hz. But
some signal generator may have some frequency errors that we need to measure. This is
actually part of the PDS’s digit analyzing function. As long as the frequency errors are
within 50Hz, for instance, the digit can still be correctly detected using the above scheme
(the value R will be smaller than the case when the frequencies are correct, but still larger
than the case when the signal is not a valid dual-tone digit signal). Skipping over the detailed
derivations, the actual frequencies (standard frequency plus errors) of a detected digit are
obtained through the following iterative steps:
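1. With the current frequency values $\omega_{1,2}$, form the N × 4 matrix H as shown in equations 10, 11 and 12.
2. Obtain the parameter vector $a = [a_1, a_2, b_1, b_2]^T$ by $a = (H^T H)^{-1} H^T x$ to estimate the
level of each tone.
3. Calculate the error vector $e = Ha - x$ and $J = e^T e$.
4. Update the frequencies according to the following:
$$
\begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix}_{k+1}
= \begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix}_{k}
- \begin{bmatrix}
\dfrac{\partial^2 J}{\partial \omega_1^2} & \dfrac{\partial^2 J}{\partial \omega_1 \partial \omega_2} \\
\dfrac{\partial^2 J}{\partial \omega_2 \partial \omega_1} & \dfrac{\partial^2 J}{\partial \omega_2^2}
\end{bmatrix}^{-1}
\begin{bmatrix} \dfrac{\partial J}{\partial \omega_1} \\ \dfrac{\partial J}{\partial \omega_2} \end{bmatrix}
\tag{18}
$$
5. If $\omega_{1,2}^{k+1}$ and $\omega_{1,2}^{k}$ are close enough, stop the iteration. Otherwise, go back to the first
step.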
The above steps are Gauss-Newton style iterations obtained through non-linear optimization
procedures. Numerical tests have shown that the convergence is very fast: three
iterations are usually more than enough to reach an accuracy of better than 1 Hz, as shown in
Figure 2.
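The five steps above can be sketched in a few dozen lines. This is a hedged illustration, not the 'Fanalyze()' code: the function names, the generic 4 × 4 solver and the test signal are assumptions, and a centred time index t = n − (N − 1)/2 is used in place of the reordered data vector of equation 8, which yields the same J. The derivatives treat the level vector a as fixed within each update, as in the listed derivative formulas.

```python
import math

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def fit_levels(x, t, w1, w2):
    """Steps 1-2: least-squares [a1, a2, b1, b2] for fixed frequencies."""
    rows = [[math.cos(tn * w1), math.cos(tn * w2),
             math.sin(tn * w1), math.sin(tn * w2)] for tn in t]
    G = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    y = [sum(r[i] * xn for r, xn in zip(rows, x)) for i in range(4)]
    return solve(G, y)

def refine(x, f1, f2, fs=8000.0, iterations=5):
    """Return (f1, f2, A1, A2) after a fixed number of updates (equation 18)."""
    N = len(x)
    t = [n - (N - 1) / 2.0 for n in range(N)]          # centred time index
    w1, w2 = 2 * math.pi * f1 / fs, 2 * math.pi * f2 / fs
    for _ in range(iterations):
        a1, a2, b1, b2 = fit_levels(x, t, w1, w2)
        g1 = g2 = h11 = h22 = h12 = 0.0
        for tn, xn in zip(t, x):
            c1, s1 = math.cos(tn * w1), math.sin(tn * w1)
            c2, s2 = math.cos(tn * w2), math.sin(tn * w2)
            e = a1 * c1 + b1 * s1 + a2 * c2 + b2 * s2 - xn   # step 3: e = Ha - x
            d1 = tn * (b1 * c1 - a1 * s1)                    # de/dw1
            d2 = tn * (b2 * c2 - a2 * s2)                    # de/dw2
            g1 += e * d1
            g2 += e * d2
            h11 += d1 * d1 - e * tn * tn * (a1 * c1 + b1 * s1)
            h22 += d2 * d2 - e * tn * tn * (a2 * c2 + b2 * s2)
            h12 += d1 * d2
        det = h11 * h22 - h12 * h12
        w1 -= (h22 * g1 - h12 * g2) / det                    # step 4
        w2 -= (h11 * g2 - h12 * g1) / det
    a1, a2, b1, b2 = fit_levels(x, t, w1, w2)
    to_hz = fs / (2 * math.pi)
    return w1 * to_hz, w2 * to_hz, math.hypot(a1, b1), math.hypot(a2, b2)

# Hypothetical test signal: 780 Hz and 1020 Hz, 16 ms at 8 kHz.
x = [math.cos(2 * math.pi * 780 * n / 8000 + 0.3)
     + 0.8 * math.cos(2 * math.pi * 1020 * n / 8000 + 1.1) for n in range(128)]

if __name__ == "__main__":
    print(refine(x, 770.0, 1030.0))
```

The sketch runs a fixed number of updates instead of the closeness test in step 5, purely to keep the example short.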
For completeness, calculations of the Jacobian matrix and vector are listed below:
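$$
\begin{aligned}
e &= Ha - x \\
J &= e^T e \\
\frac{\partial J}{\partial \omega_{1,2}} &= 2 e^T \frac{\partial e}{\partial \omega_{1,2}}
 = 2 e^T \frac{\partial H}{\partial \omega_{1,2}}\, a \\
\frac{\partial^2 J}{\partial \omega_{1,2}^2} &= 2 \frac{\partial e^T}{\partial \omega_{1,2}} \frac{\partial e}{\partial \omega_{1,2}}
 + 2 e^T \frac{\partial^2 e}{\partial \omega_{1,2}^2}
 = 2 a^T \left(\frac{\partial H}{\partial \omega_{1,2}}\right)^T \frac{\partial H}{\partial \omega_{1,2}}\, a
 + 2 e^T \frac{\partial^2 H}{\partial \omega_{1,2}^2}\, a \\
\frac{\partial^2 J}{\partial \omega_1 \partial \omega_2} &= 2 \frac{\partial e^T}{\partial \omega_2} \frac{\partial e}{\partial \omega_1}
 = 2 a^T \left(\frac{\partial H}{\partial \omega_2}\right)^T \frac{\partial H}{\partial \omega_1}\, a
\end{aligned}
\tag{19}
$$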
The partial derivative of the matrix H over ω1,2 is performed element by element. The
answers are quite obvious and trivial, and therefore, they are not listed here.
[Figure 2 (plot not reproduced): "Error surface contour and convergence trajectory"; horizontal axis: lower frequency, 770 to 790; vertical axis: higher frequency, 1010 to 1030; the iterates are labelled 1, 2 and 3.]
Figure 2: Contour of the total error surface and the convergence trajectory indicated
by '*'. The iteration starts at initial frequencies of f1 = 770Hz and
f2 = 1030Hz. With one iteration, the frequencies are updated to within 1Hz
range of the correct frequencies. The second iteration reached the correct
frequencies of f1 = 780Hz and f2 = 1020Hz.
3 Digit Type: Triangular and Rectangular
These two types of dual-tone digits cover all the digits used by Sage Products. The frequency
table and digit numbering of triangular type of digits are shown in Table 1.
                                 high frequency fj
                  f0 + ∆f   f0 + 2∆f   f0 + 3∆f   f0 + 4∆f   f0 + 5∆f
low    f0            1          2          4          7         11
freq   f0 + ∆f       -          3          5          8         12
fi     f0 + 2∆f      -          -          6          9         13
       f0 + 3∆f      -          -          -         10         14
       f0 + 4∆f      -          -          -          -         15
Table 1: Frequency table and digit numbering for triangular type of digits.
The triangular type of digits cover MF, R2F (R2 Forward) and R2R (R2 Reverse) digits.
As shown in the table, two frequencies, f0 and ∆f, suffice to provide full information on
all the frequencies. For example, for MF digits, f0 = 700Hz and ∆f = 200Hz. For R2F,
f0 = 1380Hz, and ∆f = 120Hz. For R2R, f0 = 1140Hz, and ∆f = −120Hz. Taking
the low frequency index as i = 0, 1, 2, 3, 4 and high frequency index as j = 1, 2, 3, 4, 5 such
that fi = f0 + i∆f, and fj = f0 + j∆f, then the digit number can be easily calculated as
digitnumber = j(j − 1)/2 + i + 1. This relation is used in the code to relate the frequencies
to the digit numbers. One can easily add in any user defined triangular type of digits by
providing the f0 and ∆f information to the digit receiver code.
Rectangular type of digit uses 8 frequencies (4 low and 4 high) to form 16 digits. The frequency
table and digit numbering are shown in Table 2. Obviously, DTMF digits fall into this
category. For DTMF, the 8 frequencies are f1,2,...,8 = [697, 770, 852, 941, 1209, 1336, 1477, 1633] Hz.
One can easily add any user-defined rectangular type of digits by providing a similar set of
8 frequencies to the DGR2.C code. Given two frequencies fi and fj, where i = 1, 2, 3, 4 and
j = 5, 6, 7, 8, the digit number can be calculated as digitnumber = 4(i − 1) + j − 4.
                 high frequency
              f5     f6     f7     f8
low    f1      1      2      3      4
freq   f2      5      6      7      8
       f3      9     10     11     12
       f4     13     14     15     16
Table 2: Frequency table and digit numbering for rectangular type of digits
The numbering schemes shown on the above two tables are mostly for internal computa-
tional convenience. If the user prefers a different numbering scheme, a C-array can be built
to map between the different numbering schemes.
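The two numbering formulas of this section are easy to check against Tables 1 and 2. The following sketch is only illustrative; the function names are hypothetical, and the MF and DTMF values are the ones quoted in the text above.

```python
def triangular_digit(i, j):
    """Digit number for low index i (0..4) and high index j (1..5), with i < j."""
    if not 0 <= i < j <= 5:
        raise ValueError("need 0 <= i < j <= 5")
    return j * (j - 1) // 2 + i + 1

def triangular_frequencies(digit, f0, df):
    """Inverse lookup: (low, high) frequencies in Hz for a digit number 1..15."""
    for j in range(1, 6):
        for i in range(j):
            if triangular_digit(i, j) == digit:
                return f0 + i * df, f0 + j * df
    raise ValueError("digit must be between 1 and 15")

def rectangular_digit(i, j):
    """Digit number for low index i (1..4) and high index j (5..8)."""
    if not (1 <= i <= 4 and 5 <= j <= 8):
        raise ValueError("need 1 <= i <= 4 and 5 <= j <= 8")
    return 4 * (i - 1) + j - 4

DTMF = [697, 770, 852, 941, 1209, 1336, 1477, 1633]   # f1..f8 in Hz

if __name__ == "__main__":
    # MF: f0 = 700 Hz, delta f = 200 Hz
    for d in range(1, 16):
        print("MF", d, triangular_frequencies(d, 700, 200))
    # DTMF: low frequency f2 (770 Hz) with high frequency f6 (1336 Hz)
    print("DTMF", rectangular_digit(2, 6), DTMF[2 - 1], DTMF[6 - 1])
```

A user-defined triangular digit set only changes the `f0` and `df` arguments, which mirrors the remark above about providing f0 and ∆f to the digit receiver code.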
4 Optimizing the Computations
The mathematical procedures outlined in the previous section, although succinct, require
significant computational optimization for time critical real-time implementation.
4.1 Computation Optimization for Digit Detection
On detecting a digit, the essential computations are in equations 15 and 17. At first glance,
it appears that these computations need to be repeated for every digit, that is, 15
times for the triangular type of digits and 16 times for the rectangular digits. But in fact, most
of the computations are redundant. To see this, let's denote $y = H^T x$ and $G = H^T H$. Then
the energy ratio R becomes:
$$
R = \frac{y^T G^{-1} y}{x^T x}
\tag{20}
$$
The 4 × 1 vector y is actually the result of the regular Fourier transform, and it only has
12 different values for triangular digits and 16 different values for rectangular digits. Take
triangular digits as an example, the first two components of y, y0 and y1 are two of the
following six possible values:
$$
y_{0,1}^{i} = \sum_{n=0}^{N/2-1} \big[x(N/2-1-n) + x(N/2+n)\big] \cos\big((n+\tfrac{1}{2})\omega_i\big)
\tag{21}
$$
with i = 0, 1, 2, 3, 4, 5. Likewise, the other two components y2 and y3 are two of the following
six possible components:
$$
y_{2,3}^{i} = \sum_{n=0}^{N/2-1} \big[x(N/2-1-n) - x(N/2+n)\big] \sin\big((n+\tfrac{1}{2})\omega_i\big)
\tag{22}
$$
with i = 0, 1, 2, 3, 4, 5.
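Corresponding to each digit with two frequencies $\omega_{i,j}$, the G matrix is a sparse, block-diagonal
matrix, due to the special indexing used in the matrix H:
$$
G^{-1} = (H^T H)^{-1} =
\begin{bmatrix} Gc^{-1}_{2\times 2} & 0_{2\times 2} \\ 0_{2\times 2} & Gs^{-1}_{2\times 2} \end{bmatrix}
\tag{23}
$$
where the Gc is (K = n + 1/2):
$$
Gc = 2 \begin{bmatrix}
\sum_{n=0}^{N/2-1} \cos^2(K\omega_i) & \sum_{n=0}^{N/2-1} \cos(K\omega_i)\cos(K\omega_j) \\
\sum_{n=0}^{N/2-1} \cos(K\omega_i)\cos(K\omega_j) & \sum_{n=0}^{N/2-1} \cos^2(K\omega_j)
\end{bmatrix}
\tag{24}
$$
and the matrix Gs is:
$$
Gs = 2 \begin{bmatrix}
\sum_{n=0}^{N/2-1} \sin^2(K\omega_i) & \sum_{n=0}^{N/2-1} \sin(K\omega_i)\sin(K\omega_j) \\
\sum_{n=0}^{N/2-1} \sin(K\omega_i)\sin(K\omega_j) & \sum_{n=0}^{N/2-1} \sin^2(K\omega_j)
\end{bmatrix}
\tag{25}
$$
Inverting a 2 × 2 matrix is trivial, therefore no explicit expressions for $Gc^{-1}$ and $Gs^{-1}$ are
given here.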
Although the ratio calculation in equation 20 needs to be performed for each digit, the
computational load is quite small considering the sparse and symmetric nature of the matrix
$G^{-1}$ in equation 23. In fact, the numerator of equation 20 involves only six terms:
$$
y^T G^{-1} y = Gc^{-1}_{1,1}\, y_0^2 + 2\, Gc^{-1}_{1,2}\, y_0 y_1 + Gc^{-1}_{2,2}\, y_1^2
 + Gs^{-1}_{1,1}\, y_2^2 + 2\, Gs^{-1}_{1,2}\, y_2 y_3 + Gs^{-1}_{2,2}\, y_3^2
\tag{26}
$$
The denominator of equation 20 only needs to be calculated once for every buffer of data.
The division in equation 20 is actually not carried out, since division
is quite expensive. Since the goal is to test if $R > R_{thrsh}$, the equivalent form
$y^T G^{-1} y > R_{thrsh} \times x^T x$ is used instead. This form does not involve any division.
The H matrix and $G^{-1}$ are precalculated and stored in memory. Because of the symmetry,
only half of H needs to be calculated and stored. For $G^{-1}$, only six unique non-zero
values need to be stored despite the fact that it is a 4 × 4 matrix.
From the 15 ratios calculated for all 15 digits, the largest one is selected and compared
with the ratio threshold Rthrsh (0.9, for example). If MAX(Ri) > Rthrsh, then we decide
a possible digit may have been detected, and further calculate the twist ratio TR through
the following procedure:
$$
\begin{aligned}
a &= G^{-1} y \\
A_1^2 &= a_0^2 + a_2^2 \\
A_2^2 &= a_1^2 + a_3^2 \\
TR &= \frac{\mathrm{MAX}(A_1^2, A_2^2)}{\mathrm{MIN}(A_1^2, A_2^2)}
\end{aligned}
\tag{27}
$$
If TR is less than the twist ratio threshold TRthrsh (10, for instance), then a valid digit has
been detected. The digit number has already been recorded when sorting the ratios.
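Putting equations 20 to 27 together, a detector for triangular digits can be sketched as follows. This is an illustration under stated assumptions, not the 'Spsort()' code: the function name, default thresholds and test signals are hypothetical, and the cosine and sine tables and the 2 × 2 inverses are recomputed on every call, whereas the text above precalculates and stores them.

```python
import math

MF = [700.0 + 200.0 * k for k in range(6)]     # MF frequencies in Hz

def detect_triangular(x, freqs=MF, fs=8000.0, r_thresh=0.9, tr_thresh=10.0):
    """Return the digit number (1..15) or None for one even-length buffer."""
    half = len(x) // 2
    ks = [n + 0.5 for n in range(half)]
    w = [2 * math.pi * f / fs for f in freqs]
    plus = [x[half - 1 - n] + x[half + n] for n in range(half)]
    minus = [x[half - 1 - n] - x[half + n] for n in range(half)]
    cos_t = [[math.cos(k * wi) for k in ks] for wi in w]
    sin_t = [[math.sin(k * wi) for k in ks] for wi in w]
    yc = [sum(p * c for p, c in zip(plus, ct)) for ct in cos_t]    # equation 21
    ys = [sum(m * s for m, s in zip(minus, st)) for st in sin_t]   # equation 22
    energy = sum(v * v for v in x)                                 # x^T x, once
    best = None
    for j in range(1, len(w)):
        for i in range(j):
            num, sol = 0.0, []
            for tab, y in ((cos_t, yc), (sin_t, ys)):              # Gc, then Gs
                g11 = 2 * sum(v * v for v in tab[i])
                g22 = 2 * sum(v * v for v in tab[j])
                g12 = 2 * sum(u * v for u, v in zip(tab[i], tab[j]))
                det = g11 * g22 - g12 * g12
                p = (g22 * y[i] - g12 * y[j]) / det                # block of G^-1 y
                q = (g11 * y[j] - g12 * y[i]) / det
                num += y[i] * p + y[j] * q                         # y^T G^-1 y
                sol.append((p, q))
            if best is None or num > best[0]:
                best = (num, i, j, sol)
    num, i, j, ((a1, a2), (b1, b2)) = best
    if num <= r_thresh * energy:               # division-free form of R > Rthrsh
        return None
    A1, A2 = a1 * a1 + b1 * b1, a2 * a2 + b2 * b2                  # equation 27
    if max(A1, A2) >= tr_thresh * min(A1, A2):  # division-free twist test
        return None
    return j * (j - 1) // 2 + i + 1

def tone(f, n, phase=0.0, fs=8000.0):
    """One sample of a unit-amplitude test tone (helper for the demo)."""
    return math.cos(2 * math.pi * f * n / fs + phase)

digit = [tone(700, n, 0.2) + tone(1300, n, 1.0) for n in range(64)]   # 8 ms
single = [tone(700, n, 0.2) for n in range(64)]

if __name__ == "__main__":
    print(detect_triangular(digit), detect_triangular(single))
```

The single 700 Hz tone passes the energy-ratio test for any pair that contains 700 Hz, and it is the twist test that rejects it, which is the role the twist ratio was given in section 2.1.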
4.2 Optimizing the Computations for Frequency and Level Estimations
As outlined in previous sections, the algorithm for detecting a digit is quite straightforward
and computationally efficient. The signal analysis part (the part for frequency and level
estimations), however, is more complicated as shown by equations 18 and 19.
The computational load appears to come from all the matrix and vector operations in
equation 19. But considering the fact that the partial derivatives of H with respect to ω1,2 are only half
full (that is, half of the elements are zeros), and that all the relevant matrices are symmetric,
the computational load is not as significant as it appears to be. The following procedure
is one way to calculate each element of the Jacobian matrix and vector (K = n + 1/2):
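$$
\begin{aligned}
y_{1,2} &= a_{1,2}\cos(K\omega_{1,2}) \\
y_{3,4} &= a_{3,4}\sin(K\omega_{1,2}) \\
z_{1,2} &= a_{1,2}\sin(K\omega_{1,2}) \\
z_{3,4} &= a_{3,4}\cos(K\omega_{1,2}) \\
e(n) &= \begin{cases}
x(N/2-1-n) - y_1 - y_2 - y_3 - y_4, & n < N/2 \\
x(n) - y_1 - y_2 + y_3 + y_4, & n \ge N/2
\end{cases} \\
\frac{\partial e(n)}{\partial \omega_1} &= \begin{cases}
(z_1 - z_3)K, & n < N/2 \\
(z_1 + z_3)K, & n \ge N/2
\end{cases} \\
\frac{\partial e(n)}{\partial \omega_2} &= \begin{cases}
(z_2 - z_4)K, & n < N/2 \\
(z_2 + z_4)K, & n \ge N/2
\end{cases} \\
\frac{\partial^2 e(n)}{\partial \omega_1^2} &= \begin{cases}
(y_1 + y_3)K^2, & n < N/2 \\
(y_1 - y_3)K^2, & n \ge N/2
\end{cases} \\
\frac{\partial^2 e(n)}{\partial \omega_2^2} &= \begin{cases}
(y_2 + y_4)K^2, & n < N/2 \\
(y_2 - y_4)K^2, & n \ge N/2
\end{cases} \\
\frac{\partial J}{\partial \omega_{1,2}} &= \sum_{n=0}^{N-1} e(n)\, \frac{\partial e(n)}{\partial \omega_{1,2}} \\
\frac{\partial^2 J}{\partial \omega_{1,2}^2} &= \sum_{n=0}^{N-1} \left[ \left(\frac{\partial e(n)}{\partial \omega_{1,2}}\right)^2 + e(n)\, \frac{\partial^2 e(n)}{\partial \omega_{1,2}^2} \right] \\
\frac{\partial^2 J}{\partial \omega_1 \partial \omega_2} &= \sum_{n=0}^{N-1} \frac{\partial e(n)}{\partial \omega_1} \frac{\partial e(n)}{\partial \omega_2}
\end{aligned}
\tag{28}
$$
In these expressions n = 0, 1, . . . , N − 1 runs over the rows of the data vector in equation 8, with K = n + 1/2 in the first half (n < N/2) and K = n − N/2 + 1/2 in the second half (n ≥ N/2), and $a_{3,4}$ stands for $b_{1,2}$.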
Comparing the terms in equation 28 with those in equation 19, we find that all the factors of 2
have been dropped. This is because in equation 18, a constant factor that multiplies every
element of the Jacobian matrix and vector is automatically canceled out.
The real computational load comes from generating the matrix H (sine and cosine tables
as used by the yi and zi terms in equation 28) for each updated frequency pair. There is
no way to precalculate them because the updated frequencies are unknown beforehand. For
each updated frequency pair $\omega_{1,2}^{k+1}$, the task is essentially to generate the following cosine and
sine tables quickly at run-time:
$$
\begin{aligned}
C_{1,2}^{k+1}(n) &= \cos\big((n+\tfrac{1}{2})\,\omega_{1,2}^{k+1}\big) \\
S_{1,2}^{k+1}(n) &= \sin\big((n+\tfrac{1}{2})\,\omega_{1,2}^{k+1}\big) \\
n &= 0, 1, 2, \ldots, N/2-1
\end{aligned}
\tag{29}
$$
This may become a problem because the cosine and sine functions in TI's run-time support
libraries are computationally very expensive. So we need to generate the sine and
cosine tables without calling the sin() and cos() functions, taking advantage instead of the fact that
the DSP processor has fast one-cycle floating-point multiplication and addition operations. To
achieve this goal, we notice that an arbitrary sampled sine wave s(n) = A cos(nω + φ)
always satisfies the following second order difference equation:
s(n) = 2 cos(ω)s(n − 1) − s(n − 2) (30)
Keep in mind that the solution to a second order difference equation is completely
determined once the two initial values s(0) and s(1) are determined. That is to say, once
we know s(0) = A cos(φ), s(1) = A cos(ω + φ) and cs2 = 2 cos(ω), the rest of s(n) can be
generated from equation 30, which only involves one multiplication and one addition for each
data point and can be accomplished with a one-cycle instruction. Applying this principle
to equation 29 requires calling the cosine and sine functions 10 times to initialize the
values $C_{1,2}^{k+1}(n) = \cos((n + 1/2)\,\omega_{1,2}^{k+1})$ and $S_{1,2}^{k+1}(n) = \sin((n + 1/2)\,\omega_{1,2}^{k+1})$ for n = 0, 1, and
$cs1 = 2\cos(\omega_1^{k+1})$, $cs2 = 2\cos(\omega_2^{k+1})$, which is still quite expensive in terms of computation. Now we
seek to totally eliminate the sine and cosine function calls. To do this, we notice that:
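$$
\begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix}_{k+1}
= \begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix}_{k}
- \begin{bmatrix}
\dfrac{\partial^2 J}{\partial \omega_1^2} & \dfrac{\partial^2 J}{\partial \omega_1 \partial \omega_2} \\
\dfrac{\partial^2 J}{\partial \omega_2 \partial \omega_1} & \dfrac{\partial^2 J}{\partial \omega_2^2}
\end{bmatrix}^{-1}
\begin{bmatrix} \dfrac{\partial J}{\partial \omega_1} \\ \dfrac{\partial J}{\partial \omega_2} \end{bmatrix}
= \begin{bmatrix} \omega_1^{k} - \delta\omega_1 \\ \omega_2^{k} - \delta\omega_2 \end{bmatrix}
\tag{31}
$$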
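where δω1,2 are the incremental frequency correction terms, which are typically quite small.
Therefore the following polynomial approximations are accurate enough (with an error less
than $10^{-10}$):
$$
\begin{aligned}
\cos(\delta\omega_{1,2}/2) &= 1 - \frac{(\delta\omega_{1,2})^2}{2^2\, 2!} + \frac{(\delta\omega_{1,2})^4}{2^4\, 4!} \\
\sin(\delta\omega_{1,2}/2) &= \frac{\delta\omega_{1,2}}{2} - \frac{(\delta\omega_{1,2})^3}{2^3\, 3!} + \frac{(\delta\omega_{1,2})^5}{2^5\, 5!}
\end{aligned}
\tag{32}
$$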
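Since the $C^{k}(n)$ and $S^{k}(n)$ values are already available, the initial values of the $C^{k+1}(n)$
and $S^{k+1}(n)$ can then be calculated in the following way:
$$
\begin{aligned}
C^{k+1}(0) &= C^{k}(0)\cos(\delta\omega_{1,2}/2) + S^{k}(0)\sin(\delta\omega_{1,2}/2) \\
S^{k+1}(0) &= S^{k}(0)\cos(\delta\omega_{1,2}/2) - C^{k}(0)\sin(\delta\omega_{1,2}/2) \\
cs2_{1,2} &= \cos(\omega_{1,2}^{k+1}) = 2\,[C^{k+1}(0)]^2 - 1 \\
ss2_{1,2} &= \sin(\omega_{1,2}^{k+1}) = 2\, S^{k+1}(0)\, C^{k+1}(0) \\
C^{k+1}(1) &= cs2_{1,2}\, C^{k+1}(0) - ss2_{1,2}\, S^{k+1}(0) \\
S^{k+1}(1) &= ss2_{1,2}\, C^{k+1}(0) + cs2_{1,2}\, S^{k+1}(0)
\end{aligned}
\tag{33}
$$
Here $C^{k+1}(0) = \cos(\omega_{1,2}^{k+1}/2)$ and $S^{k+1}(0) = \sin(\omega_{1,2}^{k+1}/2)$, so the last four lines are simply the double-angle and angle-addition identities.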
After obtaining the initial values, the rest of the sine and cosine tables are generated through
equation 30. These computations have been implemented in the digit receiver code.
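The table generation described in this section can be sketched without any sine or cosine library calls inside the update. The code below is an illustration only, with hypothetical function names. The double-angle step is written in terms of the updated half-angle values cos(ω'/2) and sin(ω'/2), which is what the identities cos(ω') = 2cos²(ω'/2) − 1 and sin(ω') = 2 sin(ω'/2) cos(ω'/2) require, and the recurrence is equation 30 with the factor 2 cos(ω').

```python
import math

def table_by_recurrence(s0, s1, two_cos_w, length):
    """Fill a table using s(n) = 2cos(w) s(n-1) - s(n-2) (equation 30)."""
    out = [s0, s1]
    for _ in range(length - 2):
        out.append(two_cos_w * out[-1] - out[-2])   # one multiply, one add
    return out[:length]

def updated_tables(c0, s0, dw, length):
    """Tables cos((n+1/2)w') and sin((n+1/2)w') for w' = w - dw.

    c0 = cos(w/2) and s0 = sin(w/2) are the first entries of the previous
    tables; dw is the (small) frequency correction in radians per sample.
    """
    h = dw / 2.0
    cos_h = 1.0 - h * h / 2.0 + h ** 4 / 24.0       # polynomial for cos(dw/2)
    sin_h = h - h ** 3 / 6.0 + h ** 5 / 120.0       # polynomial for sin(dw/2)
    c_new0 = c0 * cos_h + s0 * sin_h                # cos(w'/2)
    s_new0 = s0 * cos_h - c0 * sin_h                # sin(w'/2)
    cos_w = 2.0 * c_new0 * c_new0 - 1.0             # cos(w')
    sin_w = 2.0 * s_new0 * c_new0                   # sin(w')
    c_new1 = cos_w * c_new0 - sin_w * s_new0        # cos(3w'/2)
    s_new1 = sin_w * c_new0 + cos_w * s_new0        # sin(3w'/2)
    return (table_by_recurrence(c_new0, c_new1, 2.0 * cos_w, length),
            table_by_recurrence(s_new0, s_new1, 2.0 * cos_w, length))

if __name__ == "__main__":
    fs = 8000.0
    w_old = 2 * math.pi * 770 / fs          # previous frequency estimate
    dw = 2 * math.pi * (-10) / fs           # correction: w' = w_old - dw (780 Hz)
    cos_tab, sin_tab = updated_tables(math.cos(w_old / 2), math.sin(w_old / 2), dw, 64)
    w_new = w_old - dw
    err = max(abs(c - math.cos((n + 0.5) * w_new)) for n, c in enumerate(cos_tab))
    print("largest cosine table error:", err)
```

The library calls in the demonstration are used only to seed the previous half-angle values and to check the result; the update itself uses multiplications and additions only.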
Figure 3: Program flow diagram for the 'DGR2_exec()' function.
In fact, what the 'DGR2_exec()' function really does is to implement a state machine.
Figure 4: The state machine transition diagram implemented inside the function
'DGR2_exec()'. The beginning of a digit is detected when the state goes from
DOFF to QON. The end of a digit is detected when the state goes from DON
to DOFF.
The 'DGR2_exec()' function calls several private functions that are worth some comment here.
The 'Envelop()' function simply calculates the total energy of the incoming signal and compares
it with the preset minimum energy threshold level. If the data energy is higher than the
threshold, it returns TRUE. The 'Spsort()' function detects the digit. It first calculates the
regular Fourier transform at six or eight frequency points to obtain the y vector. It then
calculates the energy ratio for each digit and selects the largest one. If that ratio is higher
than the ratio threshold, it goes on to calculate the twist ratio. If the twist ratio is
lower than the twist ratio threshold (see equation 5), the function returns TRUE, indicating that a valid digit has been
detected. The 'Fanalyze()' function analyzes the incoming digit signal to obtain the actual
frequencies and levels. It strictly follows the iterative procedures outlined in the previous
sections. Notice that, although the new code 'DGR2.C' still uses the same global constants
as used by the old 'DGR.C', the prefix 'DGR_' of every constant has been changed to
'DGR2_' to avoid unnecessary compiler errors and to maintain consistency within
the new code.
6 The MF Digit Receiver for ATME
In the case of the ATME test, the goal of the digit receiver is only to detect MF digits, without
analyzing the signal. This greatly simplifies the task. Furthermore, the MF frequencies
(700, 900, . . . , 1700 Hz) have a special relation with the commonly used sampling rates of
8000 Hz and 16000 Hz. Take the 8 kHz sampling rate as an example. In this case, all the
frequency components of MF digits form an orthogonal set over a data length of 80 samples.
This means the following are true:
$$
\begin{aligned}
\sum_{n=0}^{79} \cos(n\omega_i)\cos(n\omega_j) &= 40\,\delta_{ij} \\
\sum_{n=0}^{79} \sin(n\omega_i)\sin(n\omega_j) &= 40\,\delta_{ij} \\
\sum_{n=0}^{79} \cos(n\omega_i)\sin(n\omega_j) &= 0
\end{aligned}
\tag{34}
$$
where $\omega_{i,j} = 2\pi\{700, 900, 1100, 1300, 1500, 1700\}/8000$. This orthogonality makes the correction
matrix $G = H^T H$ a purely diagonal matrix (that is, $G_{i,j} = (N/2)\,\delta_{ij}$). Furthermore, notice
that all these frequency components have odd symmetry between the first 40 data points and
the second 40 data points, that is, each component satisfies s(n + 40) = −s(n). Without further ado, a simple, efficient and yet highly reliable
MF digit receiver algorithm can be condensed into the following few computational
steps:
1. For every incoming buffer of data x(n) with a length of 10 ms (N = 80 samples for 8 kHz
sampling rate and N = 160 samples for 16 kHz), calculate another data array of half the
original length and use it to calculate the envelope:
$$
\begin{aligned}
y(n) &= x(n) - x(n + N/2) \\
en &= \sum_{n=0}^{N/2-1} y^2(n)
\end{aligned}
\tag{35}
$$
2. If en is greater than the envelope threshold, which is preset based on the minimum
absolute power level of the digit signal, then calculate the 'biased' Fourier transform
on the six MF frequencies according to the following:
$$
\begin{aligned}
se &= \frac{N}{2} \sum_{n=0}^{N-1} x^2(n) \\
YR_i &= \sum_{n=0}^{N/2-1} y(n)\cos(n\omega_i) \\
YI_i &= \sum_{n=0}^{N/2-1} y(n)\sin(n\omega_i)
\end{aligned}
\tag{36}
$$
$$
SP_i = YR_i^2 + YI_i^2
\tag{37}
$$
3. From the six $SP_i$, pick the two largest ones and designate them as $SP_0$ and $SP_1$.
Then calculate the testing energy ratio $R$ and the twist ratio $TR$ according to:
\[
R = \frac{SP_0 + SP_1}{se}, \qquad TR = \frac{\max(SP_0, SP_1)}{\min(SP_0, SP_1)} \tag{38}
\]
4. If $R$ is greater than the energy-ratio threshold (0.8, for example), and $TR$ is less than the
twist-ratio threshold (5, for instance), then a valid digit has been recognized.
The ’comb filtering’ and envelope detecting operations in equation 35 seem quite trivial, but
these simple operations not only speed up the subsequent Fourier transform calculations by
halving the number of multiplications, they also suppress interfering measuring
tones such as 400, 1020 and 2800 Hz. These ’even’ tones are present in ATME testing, and
due to complications from the echo cancellers, these tones sometimes precede or are
mixed into the digit signals. Without the special operation in equation 35, the digits may
easily be missed, as shown in Figure 5.
[Figure 5 plot: ”Comparing the envelope detectors”; x-axis: data samples (100 to 300);
traces: Sam Sin, Renshou Dai; annotations: 1 kHz tone, MF 2 with 700 and 1300 Hz]
Figure 5: Comparison of the special envelope detector in equation 35 used by the author
with the regular envelope detector used by Sam Sin. The 1 kHz tone preceding
the MF digit signal failed the regular envelope detector, but not the special
envelope detector with comb filtering.
As shown in equation 36, the Fourier transform was performed over the comb-filtered data
vector y(n), not on the original data vector x(n). This ’biased’ Fourier transform method
not only speeds up the computation, it also automatically suppresses the even tones that are
mixed into the digit signal by the echo cancellers. Figure 6 shows the comparison between
the regular FFT results and the biased Fourier transform results.
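The suppression of the even tones can be checked with a short calculation. This is a sketch that assumes a steady sinusoid of frequency $f$ spanning the whole buffer, sampled at rate $f_s$. Writing the tone as $e^{jn\omega}$ with $\omega = 2\pi f/f_s$, the comb filter of equation 35 gives

\[
y(n) = e^{jn\omega}\left(1 - e^{j\omega N/2}\right), \qquad
\left|1 - e^{j\omega N/2}\right| = 2\left|\sin\frac{\omega N}{4}\right|
= 2\left|\sin\frac{\pi f}{200\,\mathrm{Hz}}\right|
\]

where the last step uses $N/f_s = 10$ ms. Odd multiples of 100 Hz, which include all six MF frequencies, get the full gain of 2. Even multiples of 100 Hz, such as 400 Hz and 2800 Hz, are nulled exactly. The 1020 Hz tone is not an exact even multiple, so it is attenuated rather than removed: its gain is $2\sin(0.1\pi) \approx 0.62$, about 10 dB below an MF tone of the same amplitude.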
This MF digit receiver currently resides in the ATME.C file due to its simplicity and close
ties to the ATME code. Should future interest arise, the code can easily be made
into an independent module for MF digit detection. The major function of the receiver is
’MFDGR_exec()’, which simply implements the state machine transitions shown in Figure 7.
[Figure 6 plots: top panel ”Spectrum obtained from conventional 256-point FFT”; bottom panel
”Spectrum from 80-point biased Fourier transform”; x-axis: frequency in Hz (0 to 2000);
y-axis: normalized power (0 to 0.5)]
Figure 6: Spectrum of an MF digit signal (700, 1300 Hz) mixed with a
1020 Hz measuring tone, obtained using the FFT and the biased Fourier transform in
equation 36. The FFT-based approach detected three tones and therefore dropped
(missed) this digit, whereas the biased Fourier transform approach sees only
two tones and successfully detected the digit.
7 The MF/DTMF Digit Receiver for Tests through Vocoders
The data compression nature of the vocoders causes significant spectral distortion and twist
on dual-tone signals [7]. To successfully detect the MF/DTMF digits transmitted through
vocoders, two special digit receivers (one for MF and one for DTMF) have been built for
the cellular responders (named MFDGR.C and DTMFDGR.C). These two digit receivers
resemble the MF digit receiver for ATME, but with significant improvements to strengthen
the reliability. Strictly speaking, although the correction matrix $G = H^T H$ is purely
diagonal for MF digits for a 10 ms long data vector, the correction matrix for DTMF digits
is not ’purely’ diagonal. But since the two frequencies used by each DTMF digit are quite
far apart from each other (the closest spacing is 268 Hz, whereas the closest spacing for R2 is
120 Hz), the off-diagonal elements are much smaller than the diagonal terms. Therefore the
correction matrix is not necessary (or, more correctly speaking, the off-diagonal elements can
be neglected). This will not cause any ’fatal errors’ in the calculation of the energy ratio $R$, since
the task is only to detect the digit (as long as $R > R_{thrsh}$, it does not matter whether $R = 0.9$ or
$R = 0.85$). In fact, the signal distortion caused by the vocoders is much more significant
than the spectral leakage problems. Therefore there is no need to maintain absolute
accuracy for an already distorted signal.
Figure 7: State machine transition diagram implemented inside the
’MFDGR_exec()’ function used by the ATME test. The beginning of a digit
is detected when the state changes from DOFF to DON. The end of a digit
is detected when the state changes from DON to DOFF.
To properly accommodate the signal distortion of vocoders, the energy-ratio threshold
$R_{thrsh}$ needs to be set very low (0.5 for instance, to tolerate more noise), whereas the twist-ratio
threshold $TR_{thrsh}$ needs to be set very high (25, for example, to tolerate more spectral twist).
Lowering the ’admission criterion’ will of course sometimes cause the digit receiver to
be mis-triggered. To combat this problem, two more states are added to the state transition
diagram implemented inside the ’exec()’ function, as shown in Figure 8. The two added
states are used to confirm the beginning and ending of a digit. This means that a valid
digit is not registered until the same digit has been detected on two consecutive buffers of
data. Statistically, the probability that noise data is mis-detected as the same digit on
two consecutive buffers of data is much smaller than the probability that one buffer of noise
data is mis-detected as a valid digit.
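The confirmation scheme described above might be sketched as the four-state machine below. Only DOFF and DON are named in the text. The two confirmation states (`DON_PENDING`, `DOFF_PENDING`), the type names, the function `dgr_step`, and the exact transitions are all hypothetical, since the state diagram itself is not reproduced here.

```c
/* DOFF and DON come from the text; the two *_PENDING states are
 * hypothetical names for the added confirmation states. */
typedef enum { DOFF, DON_PENDING, DON, DOFF_PENDING } dgr_state_t;

typedef struct {
    dgr_state_t state;
    int digit;                     /* candidate or current digit code */
} dgr_t;

/*
 * Advance the state machine by one 10 ms buffer.
 * detected: digit code found in this buffer, or -1 if none was found.
 * Returns the digit code at the moment its beginning is confirmed
 * (same digit on two consecutive buffers), otherwise -1.
 */
int dgr_step(dgr_t *r, int detected)
{
    switch (r->state) {
    case DOFF:
        if (detected >= 0) {       /* first sighting: wait for confirmation */
            r->digit = detected;
            r->state = DON_PENDING;
        }
        break;
    case DON_PENDING:
        if (detected == r->digit) {
            r->state = DON;        /* same digit twice in a row */
            return r->digit;
        }
        if (detected >= 0)
            r->digit = detected;   /* different digit: restart confirmation */
        else
            r->state = DOFF;       /* single-buffer false trigger */
        break;
    case DON:
        if (detected != r->digit)
            r->state = DOFF_PENDING;
        break;
    case DOFF_PENDING:
        if (detected == r->digit)
            r->state = DON;        /* one-buffer dropout, digit continues */
        else
            r->state = DOFF;       /* ending confirmed */
        break;
    }
    return -1;
}
```

With this layout a single mis-triggered buffer moves the receiver to `DON_PENDING` and back to `DOFF` without registering a digit, and a single dropped buffer in the middle of a digit does not end it.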
Besides the strengthening at the state-machine level, the reliability is also
strengthened at the Fourier transform and energy ratio calculation and testing level. To see
how this is done, we again use the MF digits as an example. The Fourier transforms are
performed not only on the six original frequency points of 700, 900, ..., 1700 Hz, but also
on seven extra frequency points at 600, 800, 1000, 1200, 1400, 1600 and 1800 Hz.
During the spectral sorting, a spectral bin is not validated unless it is greater than the energy
level at its two neighboring extra bins. In a sense, this guarantees the ’spikiness’ of the selected
bin. And indeed, it helps keep a digit detectable despite the spectral spreading or leakage
caused by the vocoders. To give a more specific example, an MF digit 1 has frequencies
of 700 and 900 Hz, whereas an MF digit 2 has frequencies of 700 and 1100 Hz. When the MF
digit 2 signal passes through a vocoder, the signal energy in the 900 Hz bin may be greater than
the energy in the 1100 Hz bin, because the vocoder twists the 700 Hz and 1100 Hz
tones and spreads the 700 Hz tone energy onto the 900 Hz bin. So without special measures
being taken, the MF digit 2 can easily be mistaken for MF digit 1. This scenario actually
happens a lot with VSELP and ACELP vocoders. When a spectral bin is validated by checking that
its level is greater than that of its two neighboring bins, the MF digit 2 will not be mis-detected
as digit 1: the bin level at 900 Hz is smaller than the bin level at 800 Hz, therefore it
is not a valid spectral bin, and it will not mask the valid bin at 1100 Hz although it is higher
than the 1100 Hz bin.
Figure 8: The state transition diagram implemented inside the ’exec()’ functions
of both the MF and DTMF digit receivers built for cellular responders.
It has two more states than the diagram in Figure 7.
The internal private functions have the same names and functionalities as those in the
generic digit receiver used by PDS. The ’exec()’ function consumes one buffer of data and
changes states according to the state transition diagram shown in Figure 8. The ’Envelop()’
function calculates the signal energy envelope and ’Spsort()’ does the actual Fourier transform
and spectral sorting. As a complete reference, the following steps summarize the essential
parts of the ’Envelop()’ and ’Spsort()’ functions, which form the core of the algorithm:
1. For an incoming data buffer 10 ms long, calculate its total energy level:
\[
se = \frac{N}{2} \sum_{n=0}^{N-1} x^2(n) \tag{39}
\]
If $se$ is greater than the preset threshold, ’Envelop()’ returns TRUE.
2. Calculate the Fourier transform on the original spectral bins plus the extra intermediate
bins according to the following:
\[
\begin{aligned}
YR_i &= \sum_{n=0}^{N-1} x(n)\cos(n\omega_i) \\
YI_i &= \sum_{n=0}^{N-1} x(n)\sin(n\omega_i) \\
SP_i &= YR_i^2 + YI_i^2
\end{aligned}
\tag{40}
\]
where $\omega_i$ is defined as:
\[
\omega_i =
\begin{cases}
2\pi[600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800]/\mathrm{Samplingrate}, & \text{for MF} \\
2\pi[647, 697, 734, 770, 811, 852, 897, 941, 1075, 1209, 1273, 1336, 1407, 1477, 1555, 1633, 1683]/\mathrm{Samplingrate}, & \text{for DTMF}
\end{cases}
\tag{41}
\]
Notice that for both MF and DTMF digits, the original frequency bins have odd-
numbered indices (1,3,5,etc.), whereas the inserted extra bins are on even-numbered
indices (0,2,4,etc.).
3. From the odd-number-indexed spectral bins, pick out the qualified bins. A bin $SP_{2i+1}$
is qualified if it is greater than both of its neighboring extra bins:
\[
SP_{2i+2} < SP_{2i+1} > SP_{2i} \tag{42}
\]
4. From the qualified bins, pick the two largest ones and designate them as $SP_{2i+1}$
and $SP_{2j+1}$. The energy ratio $R$ and twist ratio $TR$ are calculated as:
\[
R = \frac{SP_{2i+1} + SP_{2j+1}}{se}, \qquad
TR = \frac{\max(SP_{2i+1}, SP_{2j+1})}{\min(SP_{2i+1}, SP_{2j+1})} \tag{43}
\]
If $R > R_{thrsh}$ and $TR < TR_{thrsh}$, the ’Spsort()’ function returns TRUE.
References
[1] Renshou Dai, “ATME Specifications and Test Plan”, Sage Internal Document, Feb. 1998.
[2] Renshou Dai, “The Technical Requirements Document for ATME Test System”, Sage
Internal Document, April 1998.
[3] Peter Lindes, “The SML, Protocol Development System on the Sage 930i, Design
Specification”, Sage Internal Document, June 1997.
[4] Sam Sin, “The Digit Receiver Module”, Sage Internal DSP Documents, 1994.
[5] Craig Marven, “General-Purpose Tone Decoding and DTMF Detection”, Digital Signal
Processing Applications with the TMS320 Family, vol. 2, pp. 423-484.
[6] Derek McKee, “Interpolation Increases FFT Resolution”, Design Ideas, EDN No. 8, pp.
282, 1990.
[7] Renshou Dai, “Some Research and Simulation Results on Vocoders”, Sage Internal
Document, May 1998.