The document discusses multi-dimensional approaches to measuring voice quality over packet networks. It describes measuring three key dimensions: delay, echo, and clarity. Delay is one of the most challenging aspects, as real-time voice requires low delay. The document outlines various sources of delay in voice over packet (VoP) networks and techniques for measuring round-trip and one-way delay using specialized instruments and digital signal processing algorithms.
Overview of VoIP (Voice over IP) and FoIP (Fax over IP) technologies like Session Initiation Protocol and H.323.
Even though voice over IP (VoIP) was hailed as a technological innovation, the idea of transporting real-time traffic over TCP/IP networks was not new in the 1990s, when VoIP started being deployed. Section 2.5 of the venerable RFC 793 (TCP) already shows both data-oriented application traffic and voice being transported over IP-based networks.
Nevertheless, VoIP places high demands on signal and protocol processing capabilities, so it became feasible at reasonable cost only in the 1990s.
VoIP can be roughly split into two main functions: signaling and the data path. Signaling protocols such as SIP (Session Initiation Protocol), H.323 and MGCP/H.248 are used to establish a session and the data path for transporting real-time voice packets. SIP has largely supplanted H.323 in recent years owing to its simpler structure and packet sequences. MGCP and H.248 are mostly used in carrier backbone networks.
Protocols like RTP (Real-time Transport Protocol) transport the voice packets and carry the information (sequence numbers and timestamps) that receivers need to equalize packet flow variations and play back the original voice signal smoothly.
Voice codecs are one of the core functions of the data path. Voice compression reduces the bandwidth required to transport voice over an IP-based network. Compression may be less of a concern in local area networks with gigabit speeds, but on slower links such as mobile networks (UMTS, LTE) it still makes a lot of sense.
The algorithms used in different codecs exploit various characteristics of human speech perception. Redundant information is removed from the signal, which slightly reduces quality but greatly reduces the required bandwidth.
In VoIP networks, the echo problem is typically compounded by the increased delay incurred by packetization of voice signals. To counteract it, VoIP gear (hard phones, soft phones, gateways) includes echo cancelers that remove echo signals from the transmit signal.
Transporting facsimile over an IP-based network requires even more technology. Facsimile protocols are very susceptible to delay and delay variation and thus need additional compensation algorithms. Protocols like T.38 terminate facsimile protocols like T.30 (analog facsimile) and transport the fax images as digitized pictures over IP-based networks.
Interest in speech coding and standardization:
– Worldwide growth in communication networks
– Emergence of new multimedia applications
– Advances in Very Large-Scale Integration (VLSI) devices
• Standardization
– International Telecommunication Union (ITU)
– European Telecommunications Standards Institute (ETSI)
– International Organization for Standardization (ISO)
– Telecommunications Industry Association (TIA), North America
– R&D Center for Radio Systems (RCR), Japan
2. Multi-dimensional Approach to VoX Voice Quality Measurement. Sage Instruments 2
Purposes of this seminar:
• Understand what impairments affect voice quality
• Learn techniques to measure those impairments
• Apply your measurements to:
1. improve Quality-of-Service (QoS)
2. arbitrate Service-Level-Agreement (SLA)
3. optimize network configuration
4. troubleshoot network components
The converged voice network
• The network backbone is migrating to packet-switching, bringing with it new problems.
• Despite convergence, voice services are getting more chaotic, further complicating the QoS issue.
• The ultimate QoS metric for a converged network should be measured from the end audio terminals.
Figure 1: One side of the simplified converged voice network (diagram; labeled elements include telephones, PBX, analog loop, ISDN/T1, PSTN switch, MSC, base station, cell phone, IAD, DSLAM, ATM switch, xDSL/cable/MMDS/LMDS access, Ethernet, IP phone, VoIP server, media gateway, soft switch, SS7 and the packet network cloud)
Why VoP? Pros of VoP:
• Packet data networks use bandwidth more efficiently than the circuit-switched PSTN.
• Data traffic has surpassed voice on the landline network. The same will happen on the wireless network.
• In the “old” days, data was carried over the PSTN. But the PSTN is not suitable for multimedia data streams.
• Now the trend is reversing: voice is being carried over packet-switched data networks.
• VoP also has the potential to offer sophisticated service features.
VoP: Voice-over-Packet or Voice-of-Problem?
• Packet-switching was designed for delay-insensitive, bursty data traffic.
• Interactive real-time voice conversation requires:
1. low delay, which requires RTP/UDP and precludes TCP;
2. low jitter, which necessitates de-jitter buffering at the cost of added delay;
3. no long-delayed echoes, which puts more stress on echo cancellers;
4. constant bandwidth, which invites voice compression and silence suppression at the cost of degraded clarity.
• Connectionless, best-effort IP in particular has no QoS control to guarantee voice stream delivery.
• VoP is by no means an easy or trivial task. It requires intensive testing at all stages.
VoIP’s View of the OSI 7-layer Model
• For the convenience of later discussions, let’s review the Open Systems Interconnection (OSI) 7-layer model and its relevance to VoIP applications.
7: Application: voice processing and playout (voice compression, silence suppression, echo cancellation, AC signaling detection/generation, etc.)
6: Presentation / 5: Session: voice frame re-assembly, connection/termination, RTP/RTCP wrapping of voice frames
4: Transport: UDP (versus TCP) transport control
3: Network: IP layer packetization, addressing, routing
2: Data Link: ATM/Ethernet/Frame Relay/etc.
1: Physical: DS1, DS3, OC1, OC3, OC12, OC48, OC192 over fiber/coax/UTP/wireless
Figure 2: OSI 7-layer model and its relation to VoIP application
Forward voice processing impairments
• At the Tx side, the voice signal is impaired by:
1. analog distortion that affects voice level;
2. digitization that introduces quantization noise;
3. compression that degrades clarity and naturalness;
4. silence suppression that may result in voice clipping;
5. compression and packetization that introduce long delay.
Figure 3: Forward voice processing path (diagram: voice/audio signal; PCM line card with A/D and G.711 logarithmic encoding [quantization noise]; VAD and silence suppression [voice clipping]; vocoder compression [clarity degradation and delay]; IP-layer packetization into IP header plus payload [delay]; ATM switch fragmentation into header/data cells; line driver at DS1/DS3/OC1/OC3/OC12/OC48/OC192 or RF/xDSL/cable modem Tx)
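The quantization-noise step can be made concrete with a small sketch. It uses the continuous mu-law companding curve (mu = 255) as a stand-in for G.711 logarithmic encoding; the real codec uses a segmented, bit-exact approximation of this curve, so this illustrates the idea and is not a G.711 implementation. All function names are illustrative.

```python
import math

MU = 255.0  # mu-law constant; the continuous curve only approximates G.711


def mulaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] onto the logarithmic mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize(y: float, bits: int = 8) -> float:
    """Uniform mid-rise quantizer over [-1, 1] with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, int((y + 1.0) / step))
    return -1.0 + (index + 0.5) * step


def companded_roundtrip(x: float, bits: int = 8) -> float:
    """Compress, quantize to `bits`, then expand back to the linear domain."""
    return mulaw_expand(quantize(mulaw_compress(x), bits))


if __name__ == "__main__":
    # Compare plain uniform quantization with companded quantization.
    for x in (0.01, 0.1, 0.5):
        uniform_err = abs(quantize(x) - x)
        companded_err = abs(companded_roundtrip(x) - x)
        print(f"x={x}: uniform {uniform_err:.5f}, companded {companded_err:.5f}")
```

Logarithmic encoding spends its 8 bits unevenly: quiet samples get finer steps than loud ones, so the quantization noise stays roughly proportional to the signal level.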
Reverse voice processing impairments
• At the Rx side, the voice signal is impaired by:
1. packet loss that results in voice frame erasure;
2. jitter buffering that introduces long delay;
3. buffer resizing that causes jerky, gapped playback (voice jitter);
4. CNG (comfort noise generation) that generates “uncomfortable” noise.
Figure 4: Reverse voice processing path (diagram: front-end receivers at DS1/DS3/OC1..OC192 rate or RF/cable/xDSL modem receivers; ATM switch turning header/data cells back into IP header plus payload; packet jitter buffer [long delay and voice jitter]; de-packetization with voice frame reassembly and lost/delayed packet handling; vocoder decoder with errored/lost frame concealment; comfort noise generator [noise level and noise spectrum]; PCM line driver and audio playout unit; output speech signal)
Voice quality: a simplified 3D model
• Voice quality is a combined effect of the following:
1. Delay;
2. Echo;
3. Clarity.
• The 3 dimensions are orthogonal; therefore, each of them needs to be characterized individually.
Figure 5: A 3D view of voice quality (axes: round-trip delay, echo(es) and clarity, with a voice quality plane drawn in that space)
VQ dimension I: delay
• Interactive conversation requires short delay. If short delay were not required, VoP would be an easy task;
• In fact, short delay is the most challenging requirement for VoP applications. The fundamental task of VoP is to balance the conflicting needs for shorter delay and less packet loss;
• Long delay affects both listener and talker:
1. Long delay causes hesitation and over-talk. A caller starts noticing delay when it exceeds 250 ms. ITU-T G.114 specifies the maximum desired round-trip delay as 300 ms (G.114 states its limit as 150 ms one way). A delay over 500 ms will make phone conversation impractical.
2. Long delay also exacerbates echo problems, as will be shown later.
Figure 6: Hypothetical round-trip delay impact on MOS (plot titled “MOS vs delay”; horizontal axis: round-trip delay in ms, 0 to 800; vertical axis: MOS)
Causes of long delay in VoP
• Network delay. This includes the physical transmission (propagation) delay and the store-and-forward routing/switching delay, which depends on traffic congestion.
• Jitter-buffering delay. A jitter buffer is needed to counter inherent packet-network jitter. Jitter buffering entails voice delay.
• Packet processing delay due to packetization, packet/cell segmentation and re-assembly, etc.
• Voice processing delay due to voice compression (vocoders), silence suppression (VAD) and even echo cancellation, etc.
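As a rough illustration of how these contributions add up, the sketch below sums a per-direction delay budget and compares the round trip with the 250 ms noticeability threshold quoted on the delay slide. The class, field names and the numbers in the usage example are hypothetical and are not measurements from any network.

```python
from dataclasses import dataclass, fields


@dataclass
class OneWayDelay:
    """Delay contributions for one direction of a call, in milliseconds."""
    propagation_ms: float        # physical transmission delay
    switching_ms: float          # store-and-forward routing/switching delay
    jitter_buffer_ms: float      # playout buffering at the receiver
    packet_processing_ms: float  # packetization, segmentation, re-assembly
    voice_processing_ms: float   # vocoder, VAD, echo cancellation

    def total_ms(self) -> float:
        return sum(getattr(self, f.name) for f in fields(self))

    def largest_contributor(self) -> str:
        return max(fields(self), key=lambda f: getattr(self, f.name)).name


NOTICEABLE_RTT_MS = 250.0  # threshold quoted on the delay slide


def round_trip_ms(forward: OneWayDelay, reverse: OneWayDelay) -> float:
    """Round-trip delay is the sum of both one-way budgets."""
    return forward.total_ms() + reverse.total_ms()


if __name__ == "__main__":
    # Hypothetical budget, identical in both directions.
    leg = OneWayDelay(20.0, 5.0, 40.0, 20.0, 15.0)
    rtt = round_trip_ms(leg, leg)
    print(f"round trip {rtt} ms, noticeable: {rtt > NOTICEABLE_RTT_MS}")
    print("largest contributor:", leg.largest_contributor())
```

A budget like this only predicts delay; the slides that follow argue that the actual figure has to be measured end-to-end with an audio signal.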
Measure round-trip delay
• Only round-trip delay needs to be measured as far as VQ is concerned;
• Generally, two instruments are needed, especially for real network tests across different geographic locations;
• Once the call connection is established, one instrument starts sending a special test signal on its Tx path and keeps measuring the delayed signal on its Rx path. The other instrument simply provides a signal loopback with a known fixed delay, which can then be subtracted from the measured value.
Figure 7: Configuration of measuring round-trip delay (diagram: a DSP-capable round-trip delay measurement instrument whose Tx and Rx paths cross the network cloud to a signal loopback device with fixed known delay; the delay to be measured is the path through the cloud and back)
Measure one-way delay
• Occasionally, one-way delay also needs to be measured. One-way delay can be measured in two ways:
1. Simple one-box solution. If the test ports (call originator and call terminator) are co-located (as in a lab environment), then a single test instrument can be used. This instrument must be capable of both originating and terminating a call. The Tx then sends the test signal, and the Rx measures the delay. The result is the one-way delay from Tx to Rx.
2. Expensive two-box solution. If one-way delay needs to be measured through a network with test ports located at a large distance from each other, then two instruments are needed, and they must be synchronized to a common clock (via GPS, for example).
Figure 8: Configuration of measuring one-way delay in lab (diagram: a DSP-capable one-way delay measurement instrument whose Tx feeds the network cloud and whose Rx receives from it; the delay to be measured is from Tx to Rx)
DSP-based delay measurement
• Delay measurement is far more than a simple network ping. A ping only accounts for network delay. Other significant delays caused by voice processing, jitter buffering and packetization are not accounted for. A real delay measurement must be performed end-to-end with an audio signal, employing a DSP algorithm.
• The DSP algorithm must be cross-correlation-based for maximum reliability across the “harsh” VoP environment. The test signal and algorithm must possess the following attributes:
1. The signal must be voice-like in PSD (power spectral density) so that it can pass through the voice transmission media and lossy vocoders without too much distortion.
2. The signal must have a special auto-correlation property (ideally a single sharp peak) so as to achieve long-range measurement with fine resolution.
3. The cross-correlation must be performed over large signal frames (which requires more real-time computation) to overcome potential packet loss effects.
Figure 9: An exemplary DSP algorithm for delay measurement (three stacked plots over 0 to 250 ms: Tx signal, Rx signal, and their cross-correlation on a 0 to 1 scale)
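The cross-correlation idea can be sketched in a few lines. This is a simplified illustration, not the instrument's algorithm: the probe is a white pseudo-random ±1 sequence rather than a voice-like signal, the path model (attenuation, Gaussian noise, zeroed bursts standing in for lost packets) is invented for the example, and every name is hypothetical.

```python
import random


def make_probe(length: int, seed: int = 1) -> list:
    """Pseudo-random +/-1 sequence with a sharply peaked auto-correlation."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def simulate_path(tx, delay, gain=0.5, noise_std=0.1, seed=2):
    """Toy channel: delay, attenuation, additive noise and zeroed bursts."""
    rng = random.Random(seed)
    rx = [0.0] * delay + [gain * s for s in tx]
    rx = [s + rng.gauss(0.0, noise_std) for s in rx]
    # Zero out 40-sample bursts every 200 samples to mimic lost packets.
    for start in range(delay + 100, len(rx), 200):
        for i in range(start, min(start + 40, len(rx))):
            rx[i] = 0.0
    return rx


def estimate_delay(tx, rx, max_lag: int) -> int:
    """Return the lag (in samples) that maximizes the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(t * r for t, r in zip(tx, rx[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_ms(lag: int, sample_rate_hz: int = 8000) -> float:
    """Convert a lag in samples to milliseconds (8 kHz telephony assumed)."""
    return lag * 1000.0 / sample_rate_hz


if __name__ == "__main__":
    tx = make_probe(512)
    rx = simulate_path(tx, delay=37)
    lag = estimate_delay(tx, rx, max_lag=100)
    print(f"estimated delay: {lag} samples = {lag_to_ms(lag)} ms")
```

Correlating over the whole 512-sample frame is what lets the peak survive the zeroed bursts, which is the point of attribute 3 above.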
VQ dimension II: echo
• Echo is a problem unique to voice communication.
• Echo has two dimensions: delay and level.
• VoP exacerbates the echo problem due to:
1. longer delay. Long-delayed echo is more annoying.
2. performance limitations of embedded echo cancellers.
3. natural echo attenuation disappearing on digital loops.
• Audible echoes are absolutely unacceptable in a phone conversation. Measures must be taken to control echo to meet the ITU-T G.131 requirement.
Figure 10: Echo tolerance curve adapted from ITU-T G.131 (curves labeled “Limiting Case” and “Acceptable”). Vertical axis is the relative echo level in dB (-60 to -10), and horizontal axis is the echo delay (round-trip) in ms (50 to 600). Any combination of echo level and delay must fall below the limiting case in order to meet the G.131 requirement.
Causes of echoes
• Echo will exist as long as there is one last 2-w analog phone in the world.
• The analog 4-w to 2-w hybrid at the end switch causes echo. No matter how well the hybrid is balanced, each actual 2-wire loop will have a different impedance depending on its length, load coils and the phones plugged in.
• Malfunctioning echo cancellers.
• Acoustic feedback on certain speaker phones, small cellular phones, digital phones, etc.
• Echo mostly affects the talker. Once talker echo is under control, the listener echo will automatically disappear.
Figure 11: Source of echo as a result of 4-w to 2-w hybrid (diagram: analog 2-w loop, hybrid in the Class 5 switch as the echo source, 4-w analog network, echo canceller, digital T1/E1 network)
Measure echoes with Echo Sounder
• Echoes must be measured from the telephony audio interface, employing a complex DSP algorithm.
• Sage’s Echo Sounder measures the echo delay and echo level of multiple echoes using a Code Domain Reflectometry based DSP algorithm. The algorithm is robust enough to tolerate 50% packet loss and to work under strong interference and harsh compression.
• Echo Sounder is also an ideal tool to measure 2-way and 1-way delays.
Figure 12: The TDR-equivalent results obtained through Echo Sounder that uses CDR (Code-Domain-Reflectometry). Plot titled “Echoes detected by Echo Sounder”: echo level in dB (-100 to 0) versus echo delay in ms (0 to 300), showing the reference signal, a first echo and a second echo.
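The general principle of locating several echoes by correlating against a known code can be sketched as follows. This is a generic correlation example under stated assumptions, not Sage's CDR algorithm, whose details are not given in these slides: the code is a white pseudo-random ±1 sequence, the return path contains only two synthetic echoes, and the function names, delays, gains and the -36 dB detection floor are all hypothetical.

```python
import math
import random


def make_code(length: int, seed: int = 7) -> list:
    """Pseudo-random +/-1 code used as the reference signal."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def add_echo(rx: list, code: list, delay: int, gain: float) -> None:
    """Add a delayed, attenuated copy of the code to the return path."""
    for i, c in enumerate(code):
        rx[delay + i] += gain * c


def find_echoes(code, rx, max_lag: int, floor_db: float = -36.0) -> list:
    """Return (delay_in_samples, level_in_dB) for each lag above floor_db.

    The level is relative to the transmitted code: correlating with the
    code and dividing by its energy estimates the echo amplitude.
    """
    energy = sum(c * c for c in code)
    echoes = []
    for lag in range(max_lag + 1):
        amplitude = sum(c * r for c, r in zip(code, rx[lag:])) / energy
        if amplitude == 0.0:
            continue
        level_db = 20.0 * math.log10(abs(amplitude))
        if level_db >= floor_db:
            echoes.append((lag, level_db))
    return echoes


if __name__ == "__main__":
    code = make_code(4096)
    rx = [0.0] * (len(code) + 200)
    add_echo(rx, code, delay=60, gain=0.1)    # about -20 dB
    add_echo(rx, code, delay=150, gain=0.03)  # about -30 dB
    for lag, level in find_echoes(code, rx, max_lag=200):
        print(f"echo at {lag} samples, {level:.1f} dB")
```

The detection floor has to sit above the correlation noise of the code itself, which shrinks as the code gets longer; that trade-off is why long frames help with weak echoes.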
ITU-T G.168 echo canceller tests
• G.168 specifies a total of 19 tests for echo canceller design verification. The most notable tests are:
1. Steady-state cancellation depth and capacity (tail length).
2. Cancellation convergence time.
3. Double-talk detection.
• G.168 testing requires two test instruments. One performs the G.168 tests. The other is an Echo Generator that generates multiple echoes with selectable levels and delays and provides additional double-talk emulation capability.
Figure 13: G.168 echo canceller performance test configuration (diagram: the echo canceller under test sits between the G.168 test suite or Echo Sounder and the Echo Generator with double-talk emulation)
ITU-T O.22 ATME tests
• ITU-T O.22 ATME also recommends a suite of automatic tests to verify the performance of echo cancellers installed in a live network:
1. Cancellation depth and tail length.
2. EC disabling and enabling.
3. Echo level discrimination (quasi-double-talk test).
• ATME also performs some transmission tests:
1. Attenuation at 3 frequencies.
2. Quiet noise, notch noise, and S/TD.
3. DS0 BERT: 56 kbps Bit-Error-Rate Test.
• In a sense, ATME replaces 105 tests.
Figure 14: ATME test configuration (diagram: ATME Director, near-end echo canceller under test, network, far-end echo canceller under test, ATME Responder). ATME stands for Automatic Transmission Measuring Equipment.
VQ dimension III: clarity
• Voice clarity itself is a multi-dimensional (m-D) metric determined by both static and dynamic impairments.
• Static impairments are due to voice processing:
1. Lossy compression by low-bit-rate vocoders;
2. Voice clipping by VADs or vocoders;
3. Attenuation distortion that results in improper voice level;
4. Improper noise level due to CNG, BER and cross-talk.
• Dynamic impairments are unique to VoP:
1. Packet/cell/frame loss. Packet loss is largely a result of dynamic traffic congestion.
2. Voice jitter. This is a sudden delay variation as a result of dynamic resizing of jitter buffers.
• So, measuring clarity alone is an m-D problem.
Measure static impairments
• Psychoacoustic models. These models measure the perceptual degradation, not the exact dimension of each impairment:
1. ITU-T P.861 PSQM: excellent for lossy compression.
2. BT PAMS: strong in jitter removal and equalization.
3. Draft P.862 PESQ: hybrid of PSQM and PAMS.
4. P.861 MNBs: less popular plain mathematical model.
5. Other models based on pattern recognition, neural networks, cepstral distance, etc.
6. None of these models can account for echo and delay effects. Their validity on dynamic impairments (packet loss and jitter) requires further study.
• Non-psychoacoustic approach:
1. attenuation distortion with 23-tone or 3-tone.
2. lossy compression and noise with SNR.
3. voice clipping and noise level with PVIT.
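For the SNR item in the non-psychoacoustic list, a minimal sketch of a waveform signal-to-noise ratio is shown below. It assumes the reference and degraded signals are already time-aligned and of equal length, and it treats every sample difference (including a plain level change) as noise; the function name is illustrative and this is not a description of any particular instrument's method.

```python
import math


def snr_db(reference, degraded) -> float:
    """Signal-to-noise ratio in dB between aligned reference and degraded signals."""
    if len(reference) != len(degraded):
        raise ValueError("signals must be time-aligned and of equal length")
    signal_power = sum(r * r for r in reference)
    if signal_power == 0.0:
        raise ValueError("reference signal is silent")
    noise_power = sum((r - d) ** 2 for r, d in zip(reference, degraded))
    if noise_power == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    # A 200 Hz tone at 8 kHz sampling, and a copy with a 10% amplitude error.
    tone = [math.sin(2.0 * math.pi * i / 40.0) for i in range(400)]
    degraded = [1.1 * s for s in tone]
    print(f"SNR: {snr_db(tone, degraded):.2f} dB")
```

Because the measure compares waveforms sample by sample, it says nothing about how audible the difference is, which is the gap the psychoacoustic models above try to fill.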
22. Multi-dimensional Approach to VoX Voice Quality Measurement. Sage Instruments 22
“PVITing” dynamic impairments
• No panacea model can summarize all impairments into one
magic number such as MOS. The actual dimension of each
impairment needs to be measured.
• Packet loss measured from the end audio-terminal is most
relevant. Packet loss measured at bit, byte and packet-layers
is hard to correlate to end-user perception.
• Voice jitter is not related to network jitter, and must be
measured at the audio terminal.
• Sage’s PVIT (Packet-Voice-Impairment-Test) measures packet
loss and voice jitter from the end audio interface (analog 2-w, 4-w
and T1/E1). It also measures clipping and comfort noise.
[Two waveform panels, amplitude vs. voice samples: "Original voice signal" and "Delayed and impaired signal". The impaired panel is annotated with packet loss, jitter, missing noise, and voice clipping.]
Figure 15: Sample of packet loss, jitter, improper silence noise, and voice clipping
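To make the idea of measuring packet loss at the audio interface concrete, here is a minimal sketch that looks for dropouts in a received continuous test tone. It is not Sage's PVIT algorithm, whose details the slides do not give. It assumes the simplest case, in which a lost packet plays out as silence; gateways that apply loss concealment or comfort noise would need a different detector. The names `find_dropouts`, `threshold` and `min_run` are hypothetical.

```python
import math

def find_dropouts(samples, threshold=0.01, min_run=40):
    """Return (start_index, length) for each run of at least min_run
    consecutive samples whose magnitude is below threshold."""
    runs = []
    start = None
    for i, x in enumerate(samples):
        if abs(x) < threshold:
            if start is None:
                start = i  # a quiet run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Handle a quiet run that lasts until the end of the recording.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs

# Hypothetical received signal: a 1 kHz tone at 8 kHz sampling, phase-shifted
# so that no sample falls on a zero crossing, with one 20 ms packet
# (160 samples) replaced by silence.
received = [math.sin(math.pi * n / 4 + math.pi / 8) for n in range(2400)]
received[1000:1160] = [0.0] * 160
print(find_dropouts(received))  # [(1000, 160)]
```

Dividing each run length by the sampling rate gives the gap duration, which can then be compared with the configured packet size to estimate how many packets were lost.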
SMOS for real live network testing
• For end-to-end test through a real network that
has both static (such as compression) and dynamic
impairments (such as voice jitters), SMOS provides
a complete solution.
• SMOS consists of accurate psychoacoustic core and
robust de-jittering and equalization schemes.
• SMOS performs automated real-time measurements
with robust in-band telemetry and synchronization
that makes it particularly useful for real live network
testing.
• SMOS provides a complete picture of your network
performance by measuring:
1. MOS, Mean-Opinion-Score in scale of 1 to 5.
2. Round-trip delay.
3. Codec type (G.711 PCM, G.726 ADPCM, vocoder, etc.).
4. Voice jitters.
5. Voice level change.
6. Effective bandwidth (attenuation distortion).
7. Silence noise level.
One more dimension: signaling and protocol
• Signaling refers to the signal exchanges among CPEs and switches in
order to set up and tear down a call. Protocols dictate the meaning of
those signals.
• Signaling can range from DC (off-hook current), to AC (DTMF/MF/R2
digits and call progress tones), to in-band bits (CAS-based T1 robbed
bits) and bytes (CCS-based E1), to packets (ISDN and SS7) and to current
message-based stateless protocols (SIP, MGCP, H.323, etc.). In-depth
discussion of signaling and protocol is beyond the scope of this seminar.
• DTMF digits are a special exception. They are used not only
for call addressing before call connection but also for IVR systems
during the call. VoP-based CPEs, IADs and gateways need to detect
the DTMF digits during the call, specially encode them, and “faithfully”
regenerate the digits at the other end. This DTMF handling capability
needs to be tested with DTMF transmission distortion
analysis. Important parameters are DTMF dual-tone frequency accuracy,
level bias, and ON-OFF duration.
• Other QoS aspects are dial tone delay (how fast one can hear the dial
tone) and call connection time (how long it takes to route the call).
[Two panels: "DTMF digit sequence" (amplitude vs. time in ms, showing the ON period of DTMF digit 1 between OFF periods) and "Frequency-domain analysis of digit 1" (magnitude vs. frequency in Hz).]
Figure 16: DTMF digit transmission distortion measurement
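The frequency-domain analysis of a regenerated digit can be sketched with the Goertzel algorithm, a common way to measure power at the eight standard DTMF frequencies. This is an illustration under stated assumptions (8 kHz sampling, a clean ON burst already isolated from the OFF periods), not the analyzer used in the figure. The helper names and the `level_bias_db` definition (high-group level minus low-group level) are this sketch's own choices.

```python
import math

LOW_GROUP = (697, 770, 852, 941)        # DTMF row frequencies in Hz
HIGH_GROUP = (1209, 1336, 1477, 1633)   # DTMF column frequencies in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")

def goertzel_power(samples, freq_hz, sample_rate=8000):
    """Squared spectral magnitude of samples at freq_hz (Goertzel recurrence)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def analyze_digit(samples, sample_rate=8000):
    """Return (digit, level_bias_db) for one DTMF ON burst, or None if silent.

    level_bias_db is the high-group level minus the low-group level in dB.
    """
    low = [goertzel_power(samples, f, sample_rate) for f in LOW_GROUP]
    high = [goertzel_power(samples, f, sample_rate) for f in HIGH_GROUP]
    row = low.index(max(low))
    col = high.index(max(high))
    if low[row] <= 0.0 or high[col] <= 0.0:
        return None
    level_bias_db = 10.0 * math.log10(high[col] / low[row])
    return KEYPAD[row][col], level_bias_db

# Hypothetical 50 ms burst of digit "1" (697 Hz + 1209 Hz, equal amplitudes).
burst = [math.sin(2 * math.pi * 697 * n / 8000) +
         math.sin(2 * math.pi * 1209 * n / 8000) for n in range(400)]
digit, bias = analyze_digit(burst)
print(digit)
```

Frequency accuracy and ON-OFF duration would need further steps, for example scanning frequencies around each nominal tone and tracking the burst envelope over time.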
Application example I: QoS monitoring
• Unlike PSTN, the QoS of a VoP application depends
on network traffic, which varies from time
to time during a day.
• By continuously “PVITing” the traffic-associated
dynamic impairments (packet loss and voice jitter)
on a given call, one can obtain crucial statistics
about network usage and QoS. The results can be
used to improve data traffic control/routing and
prioritization schemes so as to improve QoS without
sacrificing overall network throughput (or usage).
[Plot: "Hypothetical dynamic impairments vs local time". X-axis: local time in hours; y-axis: occurrences of packet loss or voice jitter per minute.]
Figure 17: Continuous monitoring of traffic-sensitive dynamic impairments
(packet loss and jitter) with PVIT (Packet-Voice-Impairment-Test) for 24
hours
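A plot like Figure 17 can be built from a log of impairment events by bucketing them by hour of day. The sketch below assumes each event (a detected packet loss or voice jitter occurrence) is recorded as seconds since local midnight. The function name and the sample data are hypothetical.

```python
from collections import Counter

def impairments_per_hour(event_times_s):
    """Count impairment events in each local hour (index 0..23).

    event_times_s: event timestamps in seconds since local midnight.
    """
    counts = Counter(int(t // 3600) % 24 for t in event_times_s)
    return [counts.get(hour, 0) for hour in range(24)]

# Hypothetical event log: two events just after midnight, one at 01:00,
# and two around 10:00.
events = [10, 3599, 3600, 36000, 36010.5]
hourly = impairments_per_hour(events)
busiest_hour = hourly.index(max(hourly))
print(hourly[0], hourly[1], hourly[10], busiest_hour)
```

Dividing each hourly count by 60 gives the average occurrences per minute used on the figure's vertical axis.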
Application example II: SLA arbitration
• SLA (Service-Level-Agreement): in essence, an SLA
is a contract between service providers and customers
that quantitatively specifies a reasonable
set of QoS parameters in accordance with the fees
paid.
• QoS is a multi-dimensional metric. Each of its dimensions
(delay, echo, clarity, dynamic impairments
and signaling) must be measured and tabulated to
check conformance to the prescribed contract.
QoS dimension             SLA-dictated target
------------------------  -------------------
Round-trip delay          < 250 ms
Talker echo level         < −45 dB
Clarity (MOS)             > 4.0
Packet loss and jitter    < 5%
Dial tone delay           < 500 ms
Call connection time      < 3 s
Table 1: Hypothetical list of SLA (Service-Level-Agreement) elements
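Checking measurements against the hypothetical targets in Table 1 amounts to one comparison per dimension. The sketch below encodes the table as a dictionary and reports pass or fail for each entry. The key names and the sample measurements are invented for illustration; only the limits come from the table.

```python
# Targets from Table 1: (comparison, limit). Key names are hypothetical.
SLA_TARGETS = {
    "round_trip_delay_ms": ("<", 250),
    "talker_echo_level_db": ("<", -45),
    "mos": (">", 4.0),
    "packet_loss_and_jitter_pct": ("<", 5),
    "dial_tone_delay_ms": ("<", 500),
    "call_connection_time_s": ("<", 3),
}

def check_sla(measured, targets=SLA_TARGETS):
    """Return {dimension: True/False} for each target; True means conformant."""
    report = {}
    for name, (comparison, limit) in targets.items():
        value = measured[name]
        report[name] = value < limit if comparison == "<" else value > limit
    return report

# Hypothetical measurement set: everything conforms except the delay.
measured = {
    "round_trip_delay_ms": 310,
    "talker_echo_level_db": -52,
    "mos": 4.1,
    "packet_loss_and_jitter_pct": 1.2,
    "dial_tone_delay_ms": 180,
    "call_connection_time_s": 2.4,
}
report = check_sla(measured)
violations = [name for name, ok in report.items() if not ok]
print(violations)  # ['round_trip_delay_ms']
```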
App. example III: configuration optimization
• A voice gateway or IAD typically has the following
configurable parameters:
1. Vocoding schemes such as G.711 PCM, G.726 ADPCM,
G.729 CS-ACELP and G.723.1 MP-MLQ.
2. Disable/enable VAD for silence suppression.
3. Packet/frame size.
4. Jitter buffer size. Static or dynamic re-sizing.
5. Disable/enable echo canceller and non-linear processor.
6. Configure echo canceller capacity (tail length) and density
(number of voice channels to be cancelled).
• The optimal configuration is achieved when the following
goals are met:
1. Reasonably short delay (< 250 ms, for example).
2. Absence of audible echoes.
3. Decent voice intelligibility (MOS > 4.0, for example).
4. No excessive packet loss and voice jitter (< 3%, for example).
• Of course, test equipment is needed to verify that the
configuration is optimal.
App. example IV: troubleshooting guidelines
• Follow these steps when using Sage’s testing technologies.
1. Perform SMOS test and examine MOS number and codec
type.
2. If MOS number falls short of theoretical expectation,
check the analog-type of distortion indicated by effective
bandwidth, silence noise and signal loss or gain.
3. If no analog-type of distortion, perform PVIT test to
measure the amount of dynamic impairments such as
packet loss and jitters.
4. If no dynamic impairments are found, the codec implementation
should be verified. For PCM and ADPCM,
use 23-tone’s SNR. For G.729 and G.723.1 vocoders, use
SMOS or PSQM’s MOS number along with the codec
detection.
5. Even if the MOS is high, check the total amount of
jitter reported by SMOS; the less, the better.
6. Check the round-trip delay and make sure the delay is
less than 300 ms. If longer, the configurations need to be
optimized.
7. Once delay is longer than 50 ms, echo becomes a concern.
One should perform Echo Sounder test to make sure there
are no audible echoes, or the echo level and delay are
within ITU specifications.
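The steps above can be read as a small decision flow. The sketch below is one possible encoding, not a Sage tool: the function and argument names are hypothetical, `expected_mos` stands in for the "theoretical expectation" of the codec in use, and the 50 ms echo threshold is assumed to refer to round-trip delay.

```python
def next_steps(mos, expected_mos, analog_distortion, dynamic_impairments,
               round_trip_delay_ms):
    """Suggest follow-up actions from one set of test results.

    analog_distortion: True if bandwidth, silence noise or level look wrong.
    dynamic_impairments: True if packet loss or voice jitter was measured.
    """
    steps = []
    if mos < expected_mos:
        if analog_distortion:
            steps.append("correct analog-type distortion")
        elif dynamic_impairments:
            steps.append("reduce packet loss and jitter")
        else:
            steps.append("verify codec implementation")
    if round_trip_delay_ms >= 300:
        steps.append("optimize configuration to shorten delay")
    if round_trip_delay_ms > 50:
        steps.append("run echo test")
    return steps

# Hypothetical result set: low MOS caused by dynamic impairments, with a
# 120 ms round-trip delay that makes echo worth checking.
print(next_steps(3.2, 4.0, False, True, 120))
```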
Conclusions
• QoS is an m-D problem that requires an m-D approach.
• 3 key dimensions of voice quality are delay, echo and clarity.
Each dimension needs to be characterized individually.
• Short delay is a challenging requirement for VoP application.
• Long delay in VoP exacerbates echo problem, which deserves
special care in testing.
• Clarity itself is an m-D metric determined by both static
impairments and dynamic impairments.
• Static impairments such as voice compression, clipping and
improper noise level can be measured through psychoacoustic
models.
• Dynamic impairments such as packet loss and voice jitter
need to be precisely determined from audio interface to help
optimize network usage and improve voice quality.
• DTMF digit transmission distortion needs to be analyzed to
guarantee the IVR functionality. Short dial tone delay and
short call connection time present the impression of better
service availability.
• Sage’s main test features (SMOS/PSQM, PVIT, Echo Sounder,
23-tone, etc.) cover all dimensions of voice quality. Results
from these tests also provide enough information for calculating
the ITU G.107 E-model’s R rating.
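Once an R rating has been calculated, ITU-T G.107 gives a standard mapping from R to an estimated MOS. The sketch below implements only that final conversion; deriving R itself from the measured delay, echo and equipment impairments is outside its scope. The function name `r_to_mos` is this sketch's own.

```python
def r_to_mos(r):
    """Map an E-model R rating to an estimated MOS (ITU-T G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

# The G.107 default R of 93.2 corresponds to a MOS of about 4.41.
print(round(r_to_mos(93.2), 2))
```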