2. Which Applications?
TEKELEC
• Conferencing:
• Audio/video communication and application sharing
• First multicast session: IETF, 1992
• Many-to-many scenarios
• Media Broadcast
• Internet TV and radio
• One to many scenario
• Gaming
• Many to many
FOR WHAT'S NEXT
3. What is needed?
• Efficient transport:
• Enable real-time transmission.
• Avoid sending the same content more than once.
• Best transport depends on available bandwidth and technology.
• Audio processing:
• How to ensure audio/video quality?
• How to mix the streams?
• Conference setup:
• Who is allowed to start a conference?
• How fast can a conference be initiated?
• Security and privacy:
• How to keep unwanted participants from joining?
• How to secure the exchanged content?
• Floor control:
• How to maintain a speaking order?
4. How to Realize? Centralized
• All register at a central point
• All send to central point
• Central point forwards to others
• Simple to implement
• Single point of failure
• High bandwidth consumption at center point
• Must receive N flows
• High processing overhead at center point
• Must decode N flows, mix them, and encode N flows
• Without mixing, the central point would send N×(N-1) flows
• Appropriate for small to medium sized conferences
• Simple to manage and administer:
• Allows access control and secure communication
• Allows usage monitoring
• Supports floor control
• Most widely used scenario
• No need to change end systems
• Tightly coupled: some instance knows all information about all participants at all times
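The flow counts quoted above for the central point can be checked with a few lines of Python. This is a minimal sketch; the function name `centralized_flows` is an illustrative assumption, not part of the slides.

```python
def centralized_flows(n, mixing=True):
    """Return (incoming, outgoing) flow counts at the central point.

    n is the number of participants. Illustrative helper, not from the slides.
    """
    if n < 2:
        raise ValueError("a conference needs at least two participants")
    incoming = n  # every participant sends one flow to the central point
    # With mixing: one mixed flow per participant.
    # Without mixing: each participant receives the flows of the n-1 others.
    outgoing = n if mixing else n * (n - 1)
    return incoming, outgoing


for n in (3, 10, 50):
    print(n, centralized_flows(n), centralized_flows(n, mixing=False))
```

The quadratic growth of the unmixed case is one reason the centralized model is described as appropriate for small to medium sized conferences.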
5. How to Realize? Full Mesh
• All establish a connection to each other
• All can send directly to the others
• Each host will need to maintain N connections
• Outgoing bandwidth:
• Send N copies of each packet
• A simple voice session at 64 kb/s translates to 64×N kb/s
• Incoming bandwidth:
• If silence suppression is used then only active speakers send data
• In case of video lots of bandwidth might be consumed
• Unless only active speakers send video
• Floor control only possible with cooperating users
• Security: simple! Do not send data to members you do not trust
• End systems need to mix the traffic: more complex end systems
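The bandwidth figures for the full mesh can be worked through as follows. This is a rough sketch that counts codec payload only (packet header overhead is ignored) and follows the slide's convention that N is the number of peers a host sends to; the function names are illustrative assumptions.

```python
def full_mesh_uplink_kbps(n_peers, codec_kbps=64.0):
    """Outgoing rate of one host: one copy of each packet per peer."""
    return n_peers * codec_kbps


def full_mesh_downlink_kbps(n_peers, codec_kbps=64.0, active_speakers=None):
    """Incoming rate of one host.

    With silence suppression only the active speakers send, so the
    incoming rate depends on active_speakers rather than on n_peers.
    """
    senders = n_peers if active_speakers is None else min(active_speakers, n_peers)
    return senders * codec_kbps


print(full_mesh_uplink_kbps(5))                         # 320.0
print(full_mesh_downlink_kbps(5, active_speakers=1))    # 64.0
```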
6. IP Multicast
• Why?
• Most group communication applications are based on top of unicast sessions.
• With unicast, each packet has exactly one recipient.
• How?
• Enhance the network with support for group communication
• Optimal distribution is delegated to the network routers instead of end systems
• Receivers inform the network of their wish to receive the data of a communication session
• Senders send a single copy which is distributed to all receivers
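On an end system, "informing the network" usually means joining the group through the socket API, which makes the host signal its membership to the local router. The following is a minimal sketch using Python's standard `socket` module; the group address 239.1.2.3 and port 5004 in the usage comment are hypothetical values chosen for illustration.

```python
import socket


def build_mreq(group, interface="0.0.0.0"):
    """Build the 8-byte membership request: group address + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(group, port, interface="0.0.0.0"):
    """Open a UDP socket and ask the network for the traffic of one group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group: from now on the host receives packets sent to it.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, interface))
    return sock


# Hypothetical usage (needs a multicast-capable network interface):
#   sock = join_group("239.1.2.3", 5004)
#   data, sender = sock.recvfrom(2048)
```

The sender side needs no join at all: it simply sends a single UDP datagram to the group address.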
7. How to Realize? End point based
• All establish a connection to the chosen mixer.
• Outgoing bandwidth at the mixer end point:
• Send N copies of each packet
• A simple voice session at 64 kb/s translates to 64×N kb/s
• Incoming bandwidth:
• If silence suppression is used then only active speakers send data
• In case of video lots of bandwidth might be consumed
• Unless only active speakers send video
• One of the end systems needs to mix the traffic: more complex end system.
• Most commonly used solution for three-way conferencing.
8. How to Realize? Peer-to-Peer
• Mixing is done at the end systems
• Increases processing overhead at the end systems
• Increases overall delay
• Streams are possibly mixed multiple times
• If central points leave a conference, the conference is dissolved
• Security: must trust all members
• Any member could send all data to non-trusted users
• Access control: Must trust all members
• Any member can invite new members
• Floor control: requires cooperating users
9. IP Multicast addresses
• Reserved IP addresses
■ Special IP addresses (class D): 224.0.0.0 through 239.255.255.255
■ Class D: prefix 1110 + 28 bits, about 268 million groups (plus scoping for additional reuse)
■ 224.0.0.x: local network only
■ 224.0.0.1: all hosts
■ Static addresses for popular services (e.g., SAP, the Session Announcement Protocol)
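The address ranges above can be verified with Python's standard `ipaddress` module. A small sketch; the helper name `classify` and its return strings are illustrative assumptions.

```python
import ipaddress

# 224.0.0.x: the block reserved for the local network only
LOCAL_NETWORK_BLOCK = ipaddress.ip_network("224.0.0.0/24")


def classify(addr):
    """Tell whether an address is multicast, and whether it is local-only."""
    ip = ipaddress.ip_address(addr)
    if not ip.is_multicast:
        return "not multicast"
    if ip in LOCAL_NETWORK_BLOCK:
        return "multicast, local network only"
    return "multicast"


print(classify("224.0.0.1"))        # the "all hosts" group
print(classify("239.255.255.255"))
print(2 ** 28)  # 268435456 group addresses: 28 free bits after the 1110 prefix
```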
10. Transport considerations
• Transport layer:
• Most group communication systems are built on top of unicast sessions.
• Very popular in the past: multicast.
• Application layer:
• RTP over UDP.
• Why not TCP?
Better NAT traversal capabilities (used by Skype as a last resort).
But not really suitable for real-time feedback (why?).
• Control protocol:
• Interactive conferencing: SIP, H.323, Skype, etc.
• Webcast: RTSP, RealAudio and other flavours.
• Session description:
• SDP (Session Description Protocol).
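To make "RTP over UDP" concrete: every RTP packet starts with a 12-byte fixed header carrying a sequence number and a timestamp, which is what lets the receiver detect loss and schedule playout without any retransmission. The sketch below packs and unpacks that header with the standard `struct` module; the function names are illustrative assumptions, and header extensions and CSRC lists are left out.

```python
import struct


def pack_rtp_header(seq, timestamp, ssrc, payload_type=0, marker=False):
    """Pack the 12-byte fixed RTP header (version 2, no padding/extension/CSRC).

    payload_type 0 is the static type for G.711 mu-law (PCMU).
    """
    byte0 = 2 << 6  # version = 2, P = 0, X = 0, CC = 0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Parse the fixed header back into a dictionary."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


header = pack_rtp_header(seq=1, timestamp=160, ssrc=0x1234)
print(len(header), unpack_rtp_header(header))
```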
11. Multicast vs. Unicast
• File transfer from C to A, B, D and E
• Unicast: multiple copies
• Multicast: single copy
12. IP Multicast
• True N-way communication
• Any participant can send at any time and everyone receives the message
• Unreliable delivery
• Based on UDP. Why? It avoids hard problems (e.g., ACK explosion)
• Efficient delivery
• Packets traverse each network link only once (i.e., tree delivery)
• Location independent addressing
• One IP address per multicast group
• Receiver-oriented service model
• Receivers can join/leave at any time
• Senders do not know who is listening
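The "each link only once" property can be illustrated by counting link transmissions for one packet sent from C to A, B, D and E. The topology below (two routers R1 and R2) is a made-up assumption for illustration only; it is not taken from the slides.

```python
# Hypothetical topology: the links on the path from source C to each receiver.
PATHS = {
    "A": ["C-R1", "R1-R2", "R2-A"],
    "B": ["C-R1", "R1-R2", "R2-B"],
    "D": ["C-R1", "R1-D"],
    "E": ["C-R1", "R1-E"],
}


def unicast_transmissions(paths):
    """One copy per receiver: every link of every path carries its own copy."""
    return sum(len(path) for path in paths.values())


def multicast_transmissions(paths):
    """Tree delivery: each link of the distribution tree carries the packet once."""
    return len({link for path in paths.values() for link in path})


print(unicast_transmissions(PATHS))    # 10
print(multicast_transmissions(PATHS))  # 6
```

In this example the link C-R1 carries four copies with unicast and a single copy with multicast.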
13. Alternatives to Multicast
• Use application level multicast
• Multicast routing done using end hosts
Hosts build multicast routing tables and act as multicast routers (but at the application level)
• Users request content using unicast
• Content distributed over unicast to the final users
14. Conference mixer architecture
• Main components for centralized conference mixer:
• Coder/decoder (+ quality-ensuring components).
• Synchronization
• Mixer
• Processing pipeline:
[Figure: processing pipeline with the blocks incoming RTP stream, decoder, adaptive playout, PLC, inter-stream synchronisation, conference mixer, encoder, outgoing RTP stream]
15. Application level multicast vs. unicast
[Figure: distribution from a content source to the users, traditional (unicast) versus application level multicast]
16. Audio Mixing
[Figure: audio mixing. The streams of participants A, B and C (codecs G.711, G.729 and GSM) are decoded (D: decoder), summed on a periodic timer into X = A + B + C, and each participant gets the mix without its own signal, re-encoded (E: encoder): X - A = B + C, X - C = B + A]
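The mixing rule from the figure (sum everything once, then subtract each participant's own signal) can be sketched as follows. The sketch assumes already decoded 16-bit PCM frames of equal length and at least one participant; the names `mix_minus` and `clip16` are illustrative assumptions.

```python
def clip16(value):
    """Keep a mixed sample inside the 16-bit PCM range."""
    return max(-32768, min(32767, value))


def mix_minus(frames):
    """frames maps a participant name to its decoded PCM frame (list of ints).

    Returns, per participant, the mix X minus that participant's own signal.
    """
    length = len(next(iter(frames.values())))
    # X = A + B + C, computed once per frame
    total = [sum(frame[i] for frame in frames.values()) for i in range(length)]
    # X - own signal, so that nobody hears an echo of their own voice
    return {
        name: [clip16(total[i] - frame[i]) for i in range(length)]
        for name, frame in frames.items()
    }


decoded = {"A": [100, 200], "B": [10, 20], "C": [1, 2]}
print(mix_minus(decoded))
```

Computing X once and subtracting costs one sum plus N subtractions per sample, instead of N separate sums.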
17. Codecs quality measurements
• Codecs: Mean Opinion Score (MOS) measurements:
Codec | Description | Sampling rate (kHz) | Bit rate (kbps) | MOS
G.711 | Pulse Code Modulation (PCM) | 8 | 64 | 3.65
G.722 | 7 kHz sub channels | 16 | 64 | 3.6
G.722.1 | Coding for low frame loss systems | 16 | 24/32 | n/a
G.723.1 | Dual rate speech coder | 8 | 5.3/6.3 | 3.8
G.726 | Adaptive Differential Pulse Code Modulation (ADPCM) | 8 | 16/24/32/40 | 4.3
G.729 | Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP) | 8 | 8 | 4.02
GSM | Global System for Mobile communications | 8 | 13 | 2.5
iLBC | internet Low Bitrate Codec | 8 | 13.3 | 3.7
LPC-10 | Linear-Predictive Codec | 8 | 2.4 | 2.3
Speex | Multi-variable rate, with narrow and wide band operation | 8/16/32 | 2.15-24.6 (narrow band), 4-44.2 (wide band) | n/a
This table lists well-known codecs used to compress/decompress human speech signals. Sampling rate shows
the range of sound frequencies sampled; the bit rate determines the quality of the sound reproduced by the codec,
where a higher bit rate allows for better sound quality to be transmitted, bearing in mind that a high bit rate will utilise
more bandwidth. The Mean Opinion Score (MOS) is a numerical measure (a rating out of 5) of the quality of speech
at the receiving end of a phone line.
Source: Symbio Networks
18. Audio Quality
• Mostly based on "best effort" networks:
• No guarantees at all.
• Packets get lost and/or delayed depending on the congestion status of the network.
• Depending on the codec, different quality can be reached:
• Mostly reducible to a "needed bandwidth vs. quality" trade-off.
• Desired properties: loss resistance, low complexity (easy to implement in embedded hardware).
• Audio data has to be played out at the same rate at which it was sampled:
• Different buffering techniques have to be considered, depending on the application.
• Pure streaming (radio/TV) is not interactive and thus not affected by the delay. Quality is
everything.
• Interactive conferencing needs short delays to guarantee the real-time property. Users of such
applications experience delay as "very annoying".
19. Codecs: loss resistance
[Figure: score versus packet loss (%), from 0 to 10 %, comparing iLBC and G.723.1. Source: ilbcfreeware.org]
The tests were performed by Dynstat, Inc., an independent test laboratory.
Score system range: 1 = bad, 2 = poor, 3 = fair, 4 = good, 5 = excellent
Courtesy of GLOBAL IP SOUND
20. Codecs: complexity
CODEC | kbps | MIPS | Frame (ms) | MOS
G.711 | 64 | 0.34 | 0.125 | 4.1
G.726 | 32 | 14 | 0.125 | 3.85
G.728 | 16 | 33 | 0.625 | 3.61
G.729a | 8 | 10.5 | 10 | 3.7
G.723.1 | 5.3 | 16 | 30 | 3.65
MIPS: processing power given for Texas Instruments 54x DSPs
22. Audio quality: jitter
• Delay variation (Jitter)
• Why?
Varying buffering time at the routers along the packets' path.
Inherent to the transmission medium (WiFi).
• Depending on the buffering algorithm, quality impairments are mostly caused by an
excessive ear-to-mouth delay or by late loss.
• Ear-to-mouth delay:
Delays under 100 ms are not noticeable, whereas values over 400 ms make a natural
conversation very difficult.
• Late loss:
If the buffering delay is smaller than the actual delay, some packets arrive after their
playout schedule. This effect is called "late loss".
• Delivering a good voice quality means, apart from packet loss concealment,
minimizing delay and late loss.
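The trade-off between buffering delay and late loss can be made tangible with a small calculation. This is a sketch under the assumption that each packet's one-way network delay is known in milliseconds; the function name and the sample delays are made up for illustration.

```python
def late_loss_rate(network_delays_ms, playout_delay_ms):
    """Fraction of packets that arrive after their playout schedule.

    A packet is late when its network delay exceeds the delay
    budget granted by the playout buffer.
    """
    if not network_delays_ms:
        return 0.0
    late = sum(1 for delay in network_delays_ms if delay > playout_delay_ms)
    return late / len(network_delays_ms)


delays = [40, 55, 60, 120, 45]           # hypothetical measurements
print(late_loss_rate(delays, 60))        # 0.2: one packet out of five is late
print(late_loss_rate(delays, 150))       # 0.0: no late loss, but more delay
```

A larger buffer removes late loss at the price of a higher ear-to-mouth delay, which is the tension the playout algorithms on the following slides try to resolve.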
23. Adaptive playout
• Static buffer
• Playout is delayed by a fixed value.
• Buffer size is computed once and kept for the rest of the call.
• Some clients implement a panic mode, increasing the buffer size dramatically (×2)
if the late loss rate is too high.
• Advantages:
Very low complexity.
• Drawbacks:
High delay.
Performs poorly if the jitter is too high.
Does not solve the clock skew problem.
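A static buffer with the panic mode mentioned above could look roughly like this. It is only a sketch: the class name, the 10 % threshold and the observation window of 50 packets are assumptions, since the slide gives nothing beyond the ×2 increase.

```python
class StaticPlayoutBuffer:
    """Fixed playout delay, doubled when the late loss rate gets too high."""

    def __init__(self, delay_ms, panic_threshold=0.1, window=50):
        self.delay_ms = delay_ms                # fixed for the call unless panic hits
        self.panic_threshold = panic_threshold  # assumed threshold, not from the slide
        self.window = window                    # packets per observation window (assumed)
        self._seen = 0
        self._late = 0

    def on_packet(self, network_delay_ms):
        """Return True if the packet can be played, False if it is late."""
        on_time = network_delay_ms <= self.delay_ms
        self._seen += 1
        if not on_time:
            self._late += 1
        if self._seen == self.window:
            # Panic mode: too many late packets in this window, double the buffer.
            if self._late / self._seen > self.panic_threshold:
                self.delay_ms *= 2
            self._seen = 0
            self._late = 0
        return on_time
```

The buffer never shrinks again in this sketch, which matches the listed drawback of a high delay.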
24. Adaptive playout (2)
• Dynamic buffer: talk spurt based.
• Within a phone call, a speaker is rarely active all the time. So it is possible to distinguish between
voiced and unvoiced segments.
• Adjusting the buffering delay within unvoiced segments has no negative impact on the voice
quality.
• Using a delay prediction algorithm on the previous packets, we then try to calculate the
appropriate buffering delay for the next voiced segment.
• Advantages:
Low complexity.
Solves the clock skew problem.
• Drawbacks:
Needs Voice Activity Detection (VAD), either at the sender or at the receiver.
High delay.
Performs poorly if the jitter is varying fast (within a voice segment).
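One common way to realise such a delay prediction is an exponentially weighted average of the observed delay plus a safety margin proportional to its variation, applied only at the start of a talk spurt. The sketch below follows that idea; it is not necessarily the algorithm the slides have in mind, and the class name, the smoothing factor `alpha` and the safety factor of 4 are assumptions.

```python
class TalkSpurtPlayout:
    """Per-talk-spurt playout delay from smoothed delay statistics."""

    def __init__(self, alpha=0.998, safety=4.0):
        self.alpha = alpha          # smoothing factor (assumed value)
        self.safety = safety        # margin in units of delay variation (assumed)
        self.mean = None            # smoothed network delay
        self.variation = 0.0        # smoothed absolute deviation
        self.playout_delay = None   # delay applied to the current talk spurt

    def on_packet(self, network_delay_ms, first_of_talk_spurt):
        """Update the estimates; return the playout delay to use for this packet."""
        if self.mean is None:
            self.mean = float(network_delay_ms)
        else:
            a = self.alpha
            self.mean = a * self.mean + (1 - a) * network_delay_ms
            self.variation = (a * self.variation
                              + (1 - a) * abs(self.mean - network_delay_ms))
        # The delay only changes between talk spurts, i.e. during silence.
        if first_of_talk_spurt or self.playout_delay is None:
            self.playout_delay = self.mean + self.safety * self.variation
        return self.playout_delay
```

Because the delay is frozen inside a talk spurt, fast jitter changes within a voiced segment still cause late loss, as listed under the drawbacks.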
25. WSOLA: how does it work?
[Figure: WSOLA. (a) Input segments numbered by pitch period, the search window, the similar waveform found by correlation, and the overlap-add step producing the output. (b) Waveform over time (samples) with the part used for correlation marked]
26. Adaptive playout (3)
• Dynamic buffer: packet based.
• Based on Waveform Similarity Overlap-Add (WSOLA) time-scale modification
Enables packet scaling without pitch distortion.
Very good voice quality: scaling factors from 0.5 to 2.0 are mostly inaudible
if done locally.
But: High processing complexity.
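The core of the technique is a similarity search followed by an overlap-add. The sketch below is a heavily simplified, single-step illustration of that idea, not a full WSOLA implementation: it lengthens one packet by a single pitch period. All names and default values (template width, search range in samples) are assumptions, and the input must hold at least `start + max_lag + width` samples.

```python
import math


def similarity(a, b):
    """Normalised cross-correlation of two equally long sample lists."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def best_lag(x, start, width, min_lag, max_lag):
    """Find the shift at which the waveform looks most like the template."""
    template = x[start:start + width]
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: similarity(template, x[start + lag:start + lag + width]))


def expand(x, start=0, width=40, min_lag=20, max_lag=60):
    """Lengthen x by one pitch period; returns (new samples, period found)."""
    lag = best_lag(x, start, width, min_lag, max_lag)
    head = x[:start + lag]
    # Overlap-add: fade out the later segment while fading in the similar
    # earlier one, so that the waveform is repeated without a click.
    fade = [(1 - k / width) * x[start + lag + k] + (k / width) * x[start + k]
            for k in range(width)]
    tail = x[start + width:]
    return head + fade + tail, lag


# 160 samples (20 ms at 8 kHz) of a tone with a 40-sample period
packet = [math.sin(2 * math.pi * n / 40) for n in range(160)]
stretched, period = expand(packet)
print(len(packet), len(stretched), period)
```

The repeated correlation search over every candidate shift is where the high processing complexity comes from.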
27. Audio quality: packet loss
• Packet loss:
• The impact on voice quality depends on many factors:
Average rate: loss rates under 2 to 5% (depending on the codec) are almost inaudible. Over
15% (highly dependent on the burstiness), most calls are perceived as
unintelligible.
Burstiness: depending on the loss distribution, the impairment can vary from small
artifacts due to packet loss concealment to a really annoying quality loss.
• Modern codecs like iLBC, which are exclusively focused on VoIP, are much more
resistant and should thus be preferred to PSTN-based low-bitrate codecs.
• For media servers, and conferencing bridges in particular, we should
concentrate on receiver-based methods, as any other method would not be
compatible with the customers' phones.
• Solutions: support appropriate codecs, ensure a minimum link quality and implement
a reasonable PLC algorithm.
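As an illustration of a receiver-based method, the simplest form of packet loss concealment repeats the last good frame with decreasing volume. This is a minimal sketch of that one technique, not a recommendation of a specific PLC algorithm; the function name, the attenuation of 0.5 per lost frame and the frame length of 160 samples are assumptions.

```python
def conceal(frames, attenuation=0.5, frame_len=160):
    """Replace lost frames (None) by an attenuated copy of the last good frame.

    frames is a list of decoded PCM frames (lists of ints) in playout order.
    """
    output = []
    last_good = None
    gain = 1.0
    for frame in frames:
        if frame is not None:
            last_good = frame
            gain = 1.0
            output.append(list(frame))
        elif last_good is None:
            # Nothing received yet: play silence.
            output.append([0] * frame_len)
        else:
            # Each consecutive loss fades the repetition further out.
            gain *= attenuation
            output.append([int(sample * gain) for sample in last_good])
    return output


print(conceal([[100, -100], None, None, [50, 50]]))
# [[100, -100], [50, -50], [25, -25], [50, 50]]
```

The fade-out matters for bursty loss: repeating a frame at full volume over a long gap produces a clearly audible buzzing artifact.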