CCNC 2010 Tutorial: Towards Glitch Free VoIP and Video Conferencing           1/12/2010




                         TOWARDS GLITCH-FREE
                         VOIP AND VIDEO CONFERENCING
                         JIN LI
                         MICROSOFT RESEARCH




                        Outline
                    2


                             Introduction
                             Anatomy of VoIP and Video Conferencing Systems
                             Audio/Video Components
                             Network Components
                             Summary




Jin Li, Microsoft Research                                                           1




                        3      Introduction




                        The Boom in IP-Based Communication
                    4


                             Advanced voice over IP (VoIP)
                             Web-, audio-, video-conferencing
                             Tele-presence
                             Instant messaging
                             Calendar and other PIM functions
                             Email, fax and voice mail








                            Worldwide VoIP subscribers
                    5




                        • Worldwide VoIP service revenue was $24.1B in 2007, up 52% over 2006.
                        • Worldwide VoIP service revenue is expected to more than double over the next 4 years, to
                        $61.3B in 2011, an annual growth rate of 26%.

                                                                             Source: Infonetics Research, Inc., 2008
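As a quick sanity check on the revenue figures, the compound annual growth rate implied by $24.1B (2007) and $61.3B (2011) can be computed directly. This is only a sketch; the function name `cagr` is illustrative and not from the tutorial.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Revenue in billions of dollars, 2007 -> 2011 (4 years), as quoted on the slide.
rate = cagr(24.1, 61.3, 4)
print(f"Implied annual growth: {rate:.1%}")
```

The result lands close to the quoted 26% per year.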




                            US Broadband Telephony Forecast, 2007-2013
                    6




                                    The VoIP subscriber base is predicted to double from 2007 to 2013.
                                                                             Source: Jupiter Research, US Broadband Telephony Forecast, 2008 to 2013








                        VoIP Trend
                    7


                             IP networks are the next-generation networks for all forms of
                             communication.
                             Broadband penetration is a key driver of VoIP expansion
                               Worldwide DSL subscriptions stood at 205.9M at the end of
                               2007, up 23% from 2006, and are predicted to increase to 363.6M
                               in 2011.
                               Cable subscriptions were up 15% annually to 68M at the end of
                               2007, and are predicted to climb to 97.3M in 2011.
                               Passive Optical Network (PON) subscribers were at 10.9M in
                               2007
                               Ethernet FTTH subscribers were at 1.7M in 2007
                               2004/2005 were breakthrough years for VoIP adoption




                        High End Systems – Tele-Presence
                    8




                                   Cisco Telepresence $299K               Tandberg Experia $225K




                              HP Halo $425K + $18K/mo         Polycom RPX210M $269K + $18.5K/mo








                         Worldwide Tele-presence Forecast (2006-2012)
                    9

                                                                                          # of end points




                                                                                                      Revenue forecast




                                                                                                                  Source: 2008 IDC Research




                         Desktop Video Conferencing
                    10

                             Multiple solutions, often acting as an add-on to VoIP



                             Benefit
                                See faces of people you may not have met before
                                See facial expressions & gestures
                                Easier to follow a conversation
                                More interactive than phone
                                Get the general mood and ambience
                                See and show documents/objects
                             Drawback
                                Difficult to set up and plan
                                Network reliability
                                   Without (or with poor) video, people talk; without (or with poor) audio, people walk.
                                Interpersonal factors








                         11
                                 Anatomy of VoIP and Video
                                 Conferencing Systems




                          Infrastructure vs. P2P
                    12


                              Infrastructure based                P2P based
                                Microsoft Unified Communication     Skype
                                Cisco
                                Gtalk








                         13
                              Infrastructure Based VoIP:
                              Microsoft Unified Communication




                          Unified Communication: Architecture
                    14








                         Unified Communication: P2P Call
                    15




                         Key Steps
                    16


                             Alice calls Bob




                             Find Bob’s registered SIP endpoints








                         Unified Communication: To VoiceMail
                    17




                         Key Steps
                    18

                             Alice calls Bob




                             Find Bob’s registered SIP endpoints



                                If Bob doesn’t answer within a certain period, the call is re-routed




                             The voicemail system plays a greeting, records Alice’s message, sends it
                             to Bob’s email, and uses a speech server to transcribe it








                         Unified Communication: PSTN UC
                    19




                         Key Steps
                    20


                             PSTN user Alice calls Bob

                             IP-PSTN gateway terminates the call




                             MS/Gateway routes the call to the mediation server, which
                             performs transcoding, ICE, etc.
                             Through the director, the proper UC client is found








                         21      P2P VoIP: Skype




                              P2P VoIP: Skype
                    22

                          Information
                              Debut: 08/2003, by N. Zennström and J. Friis, who
                              founded KaZaA
                              A P2P overlay network for VoIP and other apps
                              Free Skype-to-Skype VoIP and fee-based
                              SkypeOut/SkypeIn








                         Skype Usage (Apr. 2008)
                    23


                         11 million concurrent Skype users on line in peak time
                         (180,000+ simultaneous calls)
                         309 million registered users worldwide, the largest
                         registered user base within eBay portfolio (33 million
                         added users for Q1FY08)
                         $126M revenue in Q1FY08 (61% YOY growth, 5.6
                         billion SkypeOut minutes in FY2007)
                         100 billion cumulative Skype-to-Skype minutes




                         Skype Share of International VoIP Traffic
                    24








                            Skype Gadget
                    25




                                                                                         IPDRUM mobile Skype
                                                                                               Cable

                                          Motorola CN620        IPEVO Free-1
                                          WiFi Cellphone       USB Skype Phone

                         Netgear Skype
                          Wi-Fi Phone

                                                                                          USB Mouse with Phone
                                 50 hardware partners, 150+ Skype-certified devices.




                            Skype vs. VoIP
                    26


                               Public VoIP standard
                                 H.323, SIP
                               Skype is a proprietary VoIP solution
                                 Rely on P2P network for user directory
                                    Scalable without costly infrastructure
                                 Route calls through supernodes in Skype
                                    Universal firewall/NAT traversal
                                 Encrypted traffic (but you have to trust eBay/Skype)








                         Skype Ingredient (1)
                    27




                                     User retrieves ID from a Skype server




                         Skype Network
                    28

                                                Skype
                                                Server
                         authentication



                                                         Supernode Overlay:




                                                                              any computer w/ sufficient CPU, memory
                                                                              & network bw & not behind firewall
                                                                              For distributed directory service
                                                                              Relay traffic for computer behind
                                                                              NAT/firewall








                         NAT Traversal (Skype)
                    29


                             NAT/Firewall detection
                               Try UDP connection
                               Try TCP connection (arbitrary port, 80 (HTTP), 443 (HTTPS))
                             Traversal
                               Direct connection if a) neither client is behind a NAT, or b) one
                               client has no NAT and the other is behind a cone NAT
                               Relay by supernode otherwise
                               Because Skype doesn’t pay for relay bandwidth, it can afford a
                                  high-bitrate wideband voice codec (>24 kbps)
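The traversal rule on this slide can be written as a small decision function. This is a minimal sketch of the rule as stated here, not Skype's actual (proprietary) logic; the `NatType` enum and `choose_path` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class NatType(Enum):
    NONE = "none"    # publicly reachable, no NAT
    CONE = "cone"    # behind a cone NAT
    OTHER = "other"  # anything more restrictive (assumed catch-all category)


def choose_path(a: NatType, b: NatType) -> str:
    """Return 'direct' or 'relay' following the two direct-connection cases."""
    if a is NatType.NONE and b is NatType.NONE:
        return "direct"  # case a): neither client is behind a NAT
    if {a, b} == {NatType.NONE, NatType.CONE}:
        return "direct"  # case b): one open client, one behind a cone NAT
    return "relay"       # everything else goes through a supernode


print(choose_path(NatType.NONE, NatType.CONE))  # direct
print(choose_path(NatType.CONE, NatType.CONE))  # relay
```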




                         Skype : Call Routing Through Supernode
                    30

                         [Figure: Skype server (authentication) and the supernode overlay]
                              Route call through supernodes
                              High bitrate wideband voice codec (>24 kbps)








                         Skype Encryption
                    31




                                  Peer 1
                                                                     Peer 2




                             256-bit AES over 128 bit data block
                             1536/2048 RSA for key negotiation (2048/2048
                             for paid service)




                         Skype: Complete Black Box (Security by Obfuscation)
                    32


                             Almost everything is obfuscated
                               Many protections, anti-debugging tricks, ciphered code
                               Avoid static disassembly: XOR the binary with a hard-coded key,
                               erase the beginning of the code, custom packer
                               Code integrity check: checksums detect software breakpoints
                               Anti-debugging techniques: anti-SoftICE, integrity checks
                               Code obfuscation
                               Network obfuscation








                         33     Audio/Video Component




                          Audio/Video Component
                    34


                              Audio Codec
                              Video Codec
                              Acoustic Echo Cancellation








                      35        Audio Codec




                        G.711 (PCM)
                             Still widely used today: PSTN interface
                             With uniform quantization
                                12 bits * 8 k samples/sec = 96 kbps
                             Non-uniform quantization
                                8 bits * 8 k samples/sec = 64 kbps (DS0 rate)
                                North America: µ-law (µ = 255)
                                Other countries: A-law (A = 87.6)
                                MOS of about 4.3
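The bit-rate arithmetic and the idea of non-uniform quantization can be illustrated with the continuous µ-law curve. This is a sketch under stated assumptions: it uses the textbook companding formula with µ = 255, not the bit-exact segmented tables that G.711 itself specifies, and the function names are illustrative.

```python
import math

MU = 255.0  # mu-law parameter quoted on the slide


def mu_law_compress(x: float, mu: float = MU) -> float:
    """Map x in [-1, 1] to [-1, 1]; small amplitudes get finer resolution."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: float = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)


def bitrate_kbps(bits_per_sample: int, sample_rate_hz: int = 8000) -> float:
    """Raw PCM bit rate in kbps."""
    return bits_per_sample * sample_rate_hz / 1000.0


print(bitrate_kbps(12))       # uniform 12-bit PCM at 8 kHz
print(bitrate_kbps(8))        # companded 8-bit PCM at 8 kHz (DS0)
print(mu_law_compress(0.01))  # a quiet sample is pushed well above 0.01
```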








                                               G.722.1: Siren

                             Audio bandwidth:       14 kHz
                             Sample rate:           32 kHz
                             Bit rate:              24, 32, and 48 kbit/s
                             Algorithm:              Transform coding (Siren14™)
                             Frame size:            20 ms
                             Algorithmic delay:      40 ms
                             Complexity:            <11 WMOPS (enc/dec)
                             Available on royalty-free licensing terms (from Polycom)




                                         Siren Encoder








                         Siren Decoder
                    39




                                               Siren Codec

                             Audio sampled at 32kHz
                             Operates on frames of 20 ms corresponding to 640
                             samples
                             Based on transform coding, using a Modulated
                             Lapped Transform (MLT)
                             A Look-ahead of 20 ms due to 50% overlap between
                             frames
                             Total algorithmic delay of 40 ms
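The frame-size and delay figures above follow from simple arithmetic, sketched below. The bit rates are the 24, 32, and 48 kbit/s modes listed for the codec earlier in the tutorial; the helper names are illustrative.

```python
SAMPLE_RATE_HZ = 32_000
FRAME_MS = 20
LOOKAHEAD_MS = 20  # one extra frame, because MLT frames overlap by 50%
BITRATES_BPS = (24_000, 32_000, 48_000)


def samples_per_frame(sample_rate_hz: int, frame_ms: int) -> int:
    return sample_rate_hz * frame_ms // 1000


def bits_per_frame(bitrate_bps: int, frame_ms: int) -> int:
    return bitrate_bps * frame_ms // 1000


frame_samples = samples_per_frame(SAMPLE_RATE_HZ, FRAME_MS)
algorithmic_delay_ms = FRAME_MS + LOOKAHEAD_MS
frame_bits = {b: bits_per_frame(b, FRAME_MS) for b in BITRATES_BPS}

print(frame_samples)         # samples in one 20 ms frame
print(algorithmic_delay_ms)  # frame + look-ahead
print(frame_bits)            # bit budget per frame at each rate
```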








                             MLT - Modulated Lapped Transforms
                    41




                                       Spatial Response              Frequency Domain




                             Categorization & SQVH
                    42




                                                                    Quantization Used by SQVH
                           Expected # of Bits For Each Category




                                                                  Vector Property Used in SQVH








                         AMR-WB Basics
                             “Wideband coding of speech at around 16kbit/s using
                             adaptive multi-rate wideband (AMR-WB)”
                             Adopted as ITU-T G.722.2, and also as 3GPP spec TS
                             26.190.
                             “Foreseen applications are: VoIP and internet
                             applications, Mobile Com., PSTN app, ISDN wideband
                             telephony, ISDN videophone and videoconf.”
                             Sampling rate: 16 kHz.
                             Bitrate: 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85,
                             23.05, and 23.85 kbit/s.
                             20 ms frame.
                             ACELP (algebraic code-excited linear prediction).




                        Pre-processing

                        Sampling rate conversion: 16 kHz to 12.8 kHz (a
                        20 ms frame now has 256 samples)
                        High-pass filter (cut-off at 50 Hz)
                        Pre-emphasis filter: $1 - 0.68\,z^{-1}$
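The pre-emphasis step is a first-order FIR filter, y[n] = x[n] - 0.68 x[n-1]. A minimal floating-point sketch follows; the real codec works in fixed point and carries filter state across frames, which this illustration (with the hypothetical name `pre_emphasis`) ignores.

```python
def pre_emphasis(samples, mu=0.68):
    """Apply y[n] = x[n] - mu * x[n-1], assuming x[-1] = 0."""
    out = []
    prev = 0.0
    for x in samples:
        out.append(x - mu * prev)
        prev = x
    return out


# Samples per 20 ms frame after resampling to 12.8 kHz.
samples_per_frame = 12_800 * 20 // 1000
print(samples_per_frame)

# A constant (DC-like) input is attenuated after the first sample,
# which is the high-frequency boost relative to low frequencies.
print(pre_emphasis([1.0, 1.0, 1.0]))
```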








                         LP analysis and Quant.
                             One 30 ms asymmetric window
                               5 ms look-ahead
                             Obtain LPC coefficients:
                               Compute the autocorrelation;
                               Multiply by a lag window (adds 60 Hz bandwidth expansion);
                               $R(0) \leftarrow 1.0001\,R(0)$ (adds a noise floor at -40 dB);
                               Levinson-Durbin recursion to compute the LP coefficients.
                             Convert LP coefficients to ISPs
                             Quantize in the ISP q-domain.
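The two numerical steps in the list, the R(0) correction and the Levinson-Durbin recursion, can be sketched in a few lines. This is a generic floating-point version for illustration, not the codec's fixed-point routine; the function names and the toy autocorrelation are assumptions.

```python
def add_noise_floor(r, factor=1.0001):
    """Scale R(0) by `factor` (white-noise correction); other lags unchanged."""
    return [r[0] * factor] + list(r[1:])


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    `r` holds autocorrelation lags r[0..order]. Returns (a, err), where
    err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


# Toy autocorrelation of a first-order process, r[k] = 0.5 ** k.
r = [0.5 ** k for k in range(3)]
coeffs, err = levinson_durbin(r, order=2)
print(coeffs, err)
```

For this toy input the recursion recovers a single nonzero predictor tap, as expected for a first-order process.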




                         LP analysis and Quant. (2)
                             Quantization bottom line:
                               46 bits/frame on most modes;
                               36 bits/frame on 6.60 Kbps mode;
                               M.A. prediction with 1/3 gain;
                             Quantizer: S-MSVQ (split multistage VQ)
                             Both quantized and unquantized coefs will be used in
                             algorithm.








                         Subframes
                             Each 20 ms (256 samples) frame is divided into 4
                             sub-frames (64 samples each).
                             Interpolated LPC coefficients are obtained for each
                             sub-frame
                               Interpolation done in ISP q-domain




                         Perceptual weighting
                             Weighting filter is:
                                      $W(z) = A(z/\gamma_1)\,H_{\text{de-emph}}(z)$

                             This helps solve the tilt problem, which is worse in
                             WB speech.








                         Excitation
                             Searched for each 5ms sub-frame.
                             Two components:
                               Adaptive codebook (past excitation)
                               Algebraic codebook
                             “target” signal obtained by filtering the LPC residual
                             (for the sub-frame) through the synthesis LPC filter
                             and weighting filter.




                         Adaptive codebook
                         Start with “open loop” pitch estimation
                               based on cross correlation;
                              Low-value bias;
                              Bias toward the previous value (actually a 5-frame median), if voiced.
                         Re-compute with a “closed loop” search, around the initial value ±7, with up to ¼-sample precision.
                              “Analysis by synthesis” based;
                              Restrict to values allowed by encoding step.








                        Algebraic codebook

                        Remove contribution of (unquantized) prediction
                        from adaptive codebook from the “target signal”
                        to obtain new target.
                        Divide sub-frame into 4 alternating tracks.




                         Algebraic codebook (2)
                             Select the best pulses, for a total (pulses per track in parentheses) of 24 (6),
                             18 (5-4), 16 (4), 12 (3), 10 (3-2), 8 (2), 4 (1), or 2 (0.5),
                             depending on bitrate.
                             Pulses + two filters:
                               Periodicity enhancement: $1/(1 - 0.85\,z^{-T})$;
                               Tilt: $1/(1 - \beta_1 z^{-1})$
                             Tricks to save bits in encoding pulse position;
                             Tricks to save computation on pulse search.








                         Wrap up
                             High pass, de-emphasis;
                             Upsample back to 16KHz;
                             Add high frequency components.




                         High Freq. Components
                             Random noise used as excitation
                             LP filter is extended to 8 kHz.
                             Energy of excitation based on energy of base-band
                             residual, and voicing info, except in highest bitrate
                             mode.
                             Extension of the LPC filter is equivalent to mapping the
                             5.1-5.6 kHz band onto 6.4-7.0 kHz;
                             Band-pass filtered to 6-7 kHz, and added to output
                             signal.








                         55   Video Codec




                          H.264/AVC Encoder
                    56








                         H.264/AVC Decoder
                    57




                         Reference Picture Management
                    58

                             Reference pictures are stored in the decoded picture buffer (DPB)
                             Short/long-term reference pictures: a decoded frame may be
                             marked as
                               unused for reference
                               short-term picture
                               long-term picture
                             “Sliding window” memory management
                               Keep #(long_term_pic + short_term_pic) within the buffer size
                               Remove the oldest short-term picture if space runs out
                             Adaptive memory control
                               issued by encoder
                               change the type of the ref frame
                             IDR (Instantaneous Decoder Refresh)
                               clear ref buffer
                               I frame
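The marking rules above can be mimicked with a toy buffer. This is a simplified sketch, not the normative H.264 marking process: the class and method names are hypothetical, pictures are represented by plain integers, and adaptive memory control is reduced to a single "mark as long-term" operation.

```python
from collections import deque


class DecodedPictureBuffer:
    """Toy DPB with sliding-window eviction of short-term references."""

    def __init__(self, max_ref_frames):
        self.max_ref_frames = max_ref_frames
        self.short_term = deque()  # oldest picture first
        self.long_term = []

    def add_short_term(self, pic):
        # Sliding window: while the buffer is full, drop the oldest
        # short-term picture. Long-term pictures are never evicted here.
        while (len(self.short_term) + len(self.long_term) >= self.max_ref_frames
               and self.short_term):
            self.short_term.popleft()
        self.short_term.append(pic)

    def mark_long_term(self, pic):
        # Encoder-issued command: change a picture's reference type.
        self.short_term.remove(pic)
        self.long_term.append(pic)

    def idr(self, pic):
        # IDR picture: clear every reference, then store the new picture.
        self.short_term.clear()
        self.long_term.clear()
        self.short_term.append(pic)


dpb = DecodedPictureBuffer(max_ref_frames=3)
for pic in (0, 1, 2):
    dpb.add_short_term(pic)
dpb.mark_long_term(0)
dpb.add_short_term(3)  # evicts picture 1, the oldest short-term picture
print(list(dpb.short_term), dpb.long_term)
```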








                         Slice Group
                    59


                             Formerly called “FMO” (Flexible Macroblock
                             Ordering)
                             A slice group is a subset of the macroblocks and may contain one or
                             more slices
                               Error resilience




                         Inter Prediction
                    60


                             Variable block size
                             ¼ pixel motion compensation
                             Interpolation








                         Motion Vector (MV) Prediction
                    61

                             Efficiently encode correlated MVs
                               Other than 16×8 and 8×16: MVp = median(MVA, MVB, MVC)
                               16×8: MVp of the upper partition = MVB; MVp of the lower = MVA
                               8×16: MVp of the left partition = MVA; MVp of the right = MVC
                               For skipped macroblocks, predict as in 16×16 Inter mode
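In H.264 the predictor for partitions other than 16×8 and 8×16 is the component-wise median of the three neighbouring vectors, and the directional cases pick a single neighbour. A minimal sketch, assuming all three neighbours are available (the standard's substitution rules for missing neighbours are omitted, and the function names and partition labels are illustrative):

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def predict_mv(mv_a, mv_b, mv_c, partition="other", index=0):
    """Predict a motion vector (x, y) from neighbours A (left), B (above), C.

    partition: "16x8", "8x16", or "other"; index 0 is the upper/left
    partition and index 1 the lower/right one.
    """
    if partition == "16x8":
        return mv_b if index == 0 else mv_a
    if partition == "8x16":
        return mv_a if index == 0 else mv_c
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))


mvp = predict_mv((1, 5), (3, -2), (10, 0))
print(mvp)
```

Only the difference between the actual vector and this predictor needs to be coded, which is small when neighbouring motion is correlated.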




                         Intra Prediction
                    62


                             For Luma samples
                               4*4 block: 9 prediction modes
                               16*16 block: 4 modes
                               I_PCM: transmit the samples directly, w/o prediction &
                               transform








                         Prediction Modes
                    63




                                                       4x4 Luma




                                                       Intra 16x16
                                      8x8 Chroma is similar to 16x16 luma intra




                         Signaling of Intra Prediction Modes
                    64

                             Mode choices need to be signaled to the decoder, but compactly
                             The prediction mode for luma coded in Intra-16×16 mode or
                             chroma coded in Intra mode is signaled in the macroblock header
                             Intra modes for neighboring 4×4 blocks are often correlated
                               (B is the block above the current block C; A is the block to its left)
                             If A and B are available, predicted mode for C = min (A,B)
                             else if (neither A nor B is available) C = 2 (DC)
                             else C = available (A,B)
                             Use the prev_intra4x4_pred_mode flag & rem_intra4x4_pred_mode
                             to indicate the mode selected.
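The signaling scheme can be sketched as an encode/decode pair. The predictor below follows the rule exactly as worded on this slide; the H.264 specification treats unavailable neighbours in more detail, so take that part as an assumption. The function names are illustrative, and only the two syntax-element names come from the slide.

```python
DC_MODE = 2


def predicted_mode(mode_a, mode_b):
    """Most probable mode for the current block; None means 'unavailable'."""
    if mode_a is not None and mode_b is not None:
        return min(mode_a, mode_b)
    if mode_a is None and mode_b is None:
        return DC_MODE
    return mode_a if mode_a is not None else mode_b


def encode_mode(mode, pred):
    """Return (prev_intra4x4_pred_mode flag, rem_intra4x4_pred_mode or None)."""
    if mode == pred:
        return 1, None  # a single bit says "use the predicted mode"
    # The 8 remaining modes are renumbered 0..7, skipping the predictor.
    return 0, (mode if mode < pred else mode - 1)


def decode_mode(prev_flag, rem, pred):
    if prev_flag:
        return pred
    return rem if rem < pred else rem + 1


pred = predicted_mode(3, 5)
print(pred, encode_mode(3, pred), encode_mode(7, pred))
```

Because the predictor is excluded, the remaining mode always fits in 3 bits.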








                         Deblocking filter
                    65


                             Filter 4 vertical/horizontal boundaries of luma
                             Filter 2 vertical/horizontal boundaries of chroma
                             Affects up to 3 samples on either side of a boundary.
                             The filter is stronger at places where there is likely to be
                             significant blocking distortion
                               e.g., the boundary of an intra-coded macroblock or a boundary
                               between blocks that contain coded coefficients.




                         Transform and Quantisation
                    66


                             3 transforms
                               DCT-based transform for all 4×4 residual blocks
                                 Scaling factors: a = 1/2, b = (2/5)^(1/2), d = 1/2
                               Hadamard transform for the 4×4 array of luma DC coefficients
                               (in Intra 16×16 mode)
                               Hadamard transform for the 2×2 array of chroma DC coefficients
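As an illustration of the first transform, the sketch below applies the 4×4 integer core transform W = Cf · X · Cf^T that H.264 uses as its DCT-based transform. The scaling by factors built from a and b is not applied here, since it is folded into the quantizer. The helper names are assumptions made for this example.

```python
# Forward core transform matrix of the H.264 4x4 integer transform.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def core_transform(block):
    """Return W = CF * block * CF^T for a 4x4 residual block (integers only)."""
    return matmul4(matmul4(CF, block), transpose(CF))
```

A flat block concentrates all of its energy in the DC position W[0][0], which is why the DC coefficients are worth transforming again with the Hadamard transforms listed above.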








                         Combine Quantization into Scaling of Transform
                    67


                                                4x4 DC Intra Luma

                         |Z_D(i, j)| = (|Y_D(i, j)| · MF(0,0) + 2f) >> (qbits + 1)
                         sign(Z_D(i, j)) = sign(Y_D(i, j))



                         |Z_D(i, j)| = (|Y_D(i, j)| · MF(0,0) + 2f) >> (qbits + 1)
                         sign(Z_D(i, j)) = sign(Y_D(i, j))
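The DC quantization formula uses only a multiply, an add and a right shift. A minimal sketch follows; the function name is an assumption, and the caller is expected to supply MF(0,0), f and qbits for the current quantization parameter.

```python
def quantize_dc(y_d: int, mf00: int, f: int, qbits: int) -> int:
    """Quantize one transformed DC coefficient Y_D into Z_D.

    The magnitude is (|Y_D| * MF(0,0) + 2f) >> (qbits + 1) and the sign of
    the input is carried over to the output.
    """
    magnitude = (abs(y_d) * mf00 + 2 * f) >> (qbits + 1)
    return magnitude if y_d >= 0 else -magnitude
```

For example, with the illustrative values mf00 = 13107, qbits = 15 and f = 2^15 // 3, an input of 100 quantizes to 20 and an input of -100 to -20.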




                         CAVLC: Context-Based Adaptive Variable Length Coding
                    68
                             Characteristics:
                               Run-level coding compacts strings of zeros
                               Trailing ones: the last nonzero coefficients in the scan are often +1 or -1
                               The number of nonzero coefficients in neighboring blocks is
                               correlated
                               The VLC lookup table for the level parameter is chosen according to
                               recently coded level magnitudes








                          CAVLC Encoding
                    69

                              1. Encode the number of coefficients and trailing ones (coeff_token)
                                TotalCoeffs: 0 to 16
                                TrailingOnes: 0 to 3
                                   if there are more than 3 trailing ones, only the last three are treated as ‘special cases’
                                Four lookup tables
                                   Three variable-length, one fixed-length
                                   The choice depends on neighboring blocks
                              2. Encode the sign of each TrailingOne, in reverse order
                              3. Encode the levels of the remaining nonzero coefficients
                                level_prefix, level_suffix
                                If there are fewer than 3 TrailingOnes, the first of these levels cannot be ±1, so it is adjusted before coding
                              4. Encode the total number of zeros before the last coefficient
                              5. Encode each run of zeros
                                The zero run at the start of the array need not be encoded
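The five steps operate on a handful of symbols derived from the zigzag-ordered block. The sketch below only derives those symbols; it does not perform the table lookups or the level adjustment. The function name and the dictionary layout are assumptions made for this example.

```python
def cavlc_symbols(coeffs):
    """Derive CAVLC symbols from one zigzag-ordered list of coefficients."""
    nonzero = [(i, c) for i, c in enumerate(coeffs) if c != 0]
    total_coeffs = len(nonzero)
    if total_coeffs == 0:
        return {"total_coeffs": 0, "trailing_ones": 0, "t1_signs": [],
                "levels": [], "total_zeros": 0, "run_before": []}

    # Walk backwards from the highest-frequency nonzero coefficient.
    reversed_nz = nonzero[::-1]
    trailing_ones = 0
    for _, c in reversed_nz:
        if abs(c) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    # Step 2: signs of the trailing ones, in reverse order.
    t1_signs = ["+" if c > 0 else "-" for _, c in reversed_nz[:trailing_ones]]
    # Step 3: remaining levels, also in reverse order.
    levels = [c for _, c in reversed_nz[trailing_ones:]]
    # Step 4: zeros that precede the last nonzero coefficient.
    total_zeros = nonzero[-1][0] + 1 - total_coeffs
    # Step 5: zeros immediately before each nonzero coefficient, reverse order.
    positions = [i for i, _ in nonzero]
    run_before = []
    for n in range(total_coeffs - 1, -1, -1):
        previous = positions[n - 1] if n > 0 else -1
        run_before.append(positions[n] - previous - 1)

    return {"total_coeffs": total_coeffs, "trailing_ones": trailing_ones,
            "t1_signs": t1_signs, "levels": levels,
            "total_zeros": total_zeros, "run_before": run_before}
```

The last entry of `run_before` is the zero run at the start of the array. A decoder can infer it from `total_zeros`, so it is not transmitted.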




                         70       Acoustic Echo Cancellation








                           Acoustic Echo Cancellation
                    71



                         From Audio
                         Decoder




                         To Audio
                         Encoder

                                      Acoustic Echo Cancellation




                           Acoustic Echo Cancellation Module
                    72








                         Adaptive Transversal Filter
                    73


                             FIR filter: inherently stable
                               Filter length affects convergence speed, quality of the
                               final solution, and complexity.
                               The filter introduces modeling error, since a finite filter approximates an IIR echo response.
                             Short Filters
                               128 to 256 coefficients (taps)
                               Faster convergence, but final solution has more residual error
                               Less complex: O(N).
                             Long Filters
                               512 to 1024 coefficients (taps)
                               Slower convergence, but final solution has less error.
                               More complex, as the algorithm can be O(N^2)
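A minimal sketch of an adaptive transversal (FIR) filter used as an echo canceller. The slides do not name an adaptation algorithm; normalized LMS (NLMS) is assumed here because it is a common O(N)-per-sample choice. The tap count, the step size `mu` and the helper `simulate_echo` are illustrative assumptions.

```python
import random


def simulate_echo(far_end, echo_path):
    """Toy microphone signal: the far-end signal filtered by a short echo path."""
    return [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
            for n in range(len(far_end))]


def nlms_echo_canceller(far_end, mic, num_taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR model of the echo path; return (taps, residual signal)."""
    w = [0.0] * num_taps          # filter coefficients (taps)
    x_buf = [0.0] * num_taps      # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, x_buf))
        e = d - echo_estimate     # what is sent on to the audio encoder
        norm = sum(xi * xi for xi in x_buf) + eps
        step = mu * e / norm      # step normalized by far-end signal power
        w = [wi + step * xi for wi, xi in zip(w, x_buf)]
        residual.append(e)
    return w, residual


def demo(seed=0, length=2000):
    """Run the canceller on white noise and a 3-tap echo path."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(length)]
    mic = simulate_echo(far_end, [0.5, -0.3, 0.1])
    return nlms_echo_canceller(far_end, mic)
```

Each update touches every tap once, so the per-sample cost grows linearly with the filter length, and a longer filter also takes more samples to converge.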




                         Challenges
                    74

                             Dynamic range of the human ear = 120dB.
                               Even quiet echoes can be heard.
                             Longer delays from satellite (300-500ms), VoIP
                               Ear is more sensitive to longer delays.
                               More difficult to find the beginning of the echo.
                               Long filters (~1000 taps) are needed (complexity &
                               convergence)
                             Near-end noise corrupts the echo, decreasing the
                             canceller's ability to converge.
                             Acoustic echo paths can change rapidly
                               More difficult for the AEC to remain converged.
                             Nonlinear echo components
                               Speakers driven beyond linear region.








                         75   Network Components




                          IP-based VoIP / Video Conference
                    76








                         77   Internet Primer




                          Internet : Grand View
                    78








                            Impact on ISPs
                    79



                             Economics of ISP relationships
                               sibling relationship
                                 several ISPs belong to the same organization
                               peering relationship
                                 mutually beneficial free agreement (to a certain extent)
                               transit relationship
                                 one ISP pays another
                             [Figure: transit, peering, and sibling links between ISPs, with peering and sibling entity boundaries]




                            Inside ISP
                    80








                         ISP POP (Point of Presence)
                    81




                         Home Networking
                    82








                         83   Network Characteristics




                          Under-provisioned Links
                    84




                              Branch                                  Branch








                         Growth Trends
                    85




                         Packet Loss vs. Jitter (vs. Delay?)
                    86








                         The Usual Suspects
                    87




                         Packet Bursts
                    88








                         What kind of Enterprise User?
                    89




                         How QoS can help
                    90








                         QoS helps inside and between branches!
                    91




                         Observation
                    92


                             IP-based communication in the enterprise is growing
                             Empirical results show poorer call quality for wireless and
                             VPN users
                             QoS (DiffServ) is both used and useful!








                         93     Available Bandwidth Estimation




                          What is Available Bandwidth (ABW)?
                    94


                              ABW is the left-over capacity along an Internet
                              path
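Under the usual formalization (an assumption here, since the slide gives only the informal definition), each link i has capacity C_i and utilization u_i, and the ABW of the path is the smallest left-over capacity C_i (1 - u_i) over its links. A minimal sketch with a hypothetical function name:

```python
def available_bandwidth(links):
    """ABW of a path given (capacity, utilization) pairs, one per link.

    Capacity is in any rate unit (for example Mbps); utilization is in [0, 1].
    The link with the least spare capacity limits the whole path.
    """
    return min(capacity * (1.0 - utilization) for capacity, utilization in links)
```

For a path with links of (100 Mbps, 20% used), (10 Mbps, 50% used) and (1000 Mbps, 90% used), the second link limits the path to about 5 Mbps.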








                         Why Is It Useful?
                             Maximizing QoE (Quality of Experience) in A/V
                             conferencing
                               Audio prefers minimum delay (high priority)
                               Video prefers maximum rate (low priority)




                   One Way Delay (OWD) = propagation delay (constant) + queuing delay (variable)

                             One solution: measure ABW, encode and send
                             video at the ABW rate




                         Typical Targeting Scenario




                             First hop is the bottleneck
                               Cable modem, DSL, high-speed link…
                             Timescale for the ABW estimation: 2-4 seconds








                         Why Is Measuring ABW Hard?
                             Available bandwidth changes over time
                                ABW measurements must be quick

                             Audio packets (along the same path) should
                             experience minimum delay
                                Measurement must be non-intrusive




                         Two Models
                             Probe Rate Model (PRM) based solutions
                               Pathload, TOPP, Pathchirp, Bfind, PTR …
                             Probe Gap Model (PGM) based solutions
                               Spruce, Delphi, IGI, Moseab …








                        Pathload (PRM) [Jain & Dovrolis]
                      Send probe trains at various rates
                      ABW is the probe rate at the transition point, above which OWD
                      starts increasing (queuing delay is observed)
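The probe rate idea can be sketched as a binary search over rates. This is a simplified illustration and not the actual Pathload tool: the trend test is reduced to counting rising delay pairs, and `simulated_path` is a hypothetical fluid model standing in for real probe trains.

```python
def owd_is_increasing(owds, threshold=0.66):
    """Crude trend test: fraction of consecutive OWD pairs that rise."""
    pairs = len(owds) - 1
    if pairs < 1:
        return False
    rising = sum(1 for a, b in zip(owds, owds[1:]) if b > a)
    return rising / pairs > threshold


def estimate_abw(send_train, low, high, resolution):
    """Binary search for the rate at which one-way delays start to grow.

    send_train(rate) must send a probe train at `rate` and return the list of
    measured one-way delays, one per probe packet.
    """
    while high - low > resolution:
        rate = (low + high) / 2.0
        if owd_is_increasing(send_train(rate)):
            high = rate   # probing faster than the ABW: the queue builds up
        else:
            low = rate    # probing at or below the ABW: delays stay flat
    return (low + high) / 2.0


def simulated_path(abw_mbps, base_delay_ms=20.0, train_len=50):
    """Hypothetical path: delay grows within a train only when rate > ABW."""
    def send_train(rate_mbps):
        growth = max(0.0, (rate_mbps - abw_mbps) / rate_mbps)
        return [base_delay_ms + i * growth for i in range(train_len)]
    return send_train
```

Each iteration costs one probe train, which is why this family of methods is slower than a single-shot gap measurement.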




                        Spruce (PGM) [Strauss et al.]
                             Send probe pairs/train at Ri (Ri > A), measure
                             sending gaps and receiving gaps
                             Compute A directly
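In the probe gap model, the widening of the gap between two probe packets measures the cross traffic that arrived between them. A minimal sketch of the commonly cited Spruce estimate, A = C (1 - (gap_out - gap_in) / gap_in), follows. It assumes a single bottleneck whose capacity C is already known, and the function names are hypothetical.

```python
def spruce_sample(capacity, gap_in, gap_out):
    """One ABW sample from one probe pair.

    capacity: bottleneck capacity (for example in Mbps), assumed known.
    gap_in:   time between the two probes at the sender.
    gap_out:  time between the two probes at the receiver (same unit).
    """
    return capacity * (1.0 - (gap_out - gap_in) / gap_in)


def spruce_estimate(capacity, gaps):
    """Average the samples of several (gap_in, gap_out) probe pairs."""
    samples = [spruce_sample(capacity, gap_in, gap_out) for gap_in, gap_out in gaps]
    return sum(samples) / len(samples)
```

With C = 10 Mbps, a sending gap of 1.2 ms and a receiving gap of 1.5 ms, one sample gives 10 × (1 - 0.3 / 1.2) = 7.5 Mbps.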








                         Advantages/Disadvantages of the Approaches

                                        Advantages                   Disadvantages
                         PGM-based      Fast estimation: can be      Assumptions are not easy
                         approaches     done in a single probe       to verify in practice
                         PRM-based      No assumptions               Slow estimation:
                         approaches                                  iterative probes




                      102    Forward Error Correction








                         Block Based Erasure Resilient Coding
                   103

                     Original data:    1    2    3               k     k messages

                     ERC:              1    2    3               k   k+1                    n

                     At a certain
                     instance          X         X                                  X      X
                                                            X         X




                         Some of the blocks may be lost in delivery. However, as long as there
                         are at least k blocks delivered, the original data can be reconstructed.




                         ERC in VoIP and Video Conferencing
                   104


                            VoIP
                              Mainly packet replication, due to small VoIP packet size
                              & low delay requirement
                            Video Conferencing
                              Packet loss protection (for I frame or P frame in HD)
                              Each frame is separated into k messages and protected by n-k
                              parity messages. As long as no more than n-k messages are
                              lost, the transmission succeeds
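To get a feel for how much protection n-k extra messages buy, the sketch below computes the probability that a frame survives. It assumes an MDS code and independent packet losses with probability p; real Internet losses are often bursty, so the result is optimistic. The function name is hypothetical.

```python
from math import comb


def frame_success_probability(n: int, k: int, p: float) -> float:
    """Probability that at most n-k of the n messages of a frame are lost."""
    return sum(comb(n, lost) * p ** lost * (1.0 - p) ** (n - lost)
               for lost in range(n - k + 1))
```

For example, with k = 2 messages, one parity message (n = 3) and 10% loss, the frame survives with probability 0.9^3 + 3 × 0.1 × 0.9^2 = 0.972, compared with 0.81 without protection.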








                           ERC Terms
                   105


                              Number of original blocks: k
                              Number of coded blocks: n
                              Rate of ERC:   k/n
                              MDS: Maximum Distance Separable
                                  Any k of the n coded blocks suffice to recover the original
                                  Theoretically optimal performance




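                           Erasure Encoding: Mathematics
                   106
                         Original data:   x_1, x_2, ..., x_k

                         Coded data:      y_1, y_2, ..., y_n

                         $[y_1, \dots, y_n]^T = G\,[x_1, \dots, x_k]^T$, where G is the n×k generator matrix

                         x_i, y_i: vectors over a Galois field.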








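                           Example: ERC of 10MB
                   107
                         Original data (10MB):  x_1, x_2, ..., x_k    k = 10, GF(2^8), each vector is 1MB.
                         Coded data (n = 30):   y_1, y_2, ..., y_n

                         [Figure: a 30×10 generator matrix multiplies the 10 × 1M original data to give the 30 × 1M coded data]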




                           Erasure Decoding: Mathematics
                   108
                         Original data:   x1   x2          xk

                         Coded data:      y1   y2                                         yn

                                                                              Available

                                                                                Code select








                           Erasure Decoding: Mathematics
                   109

                         Original data:          x1       x2                       xk

                         Coded data:             y1       y2                                              yn




                                Original data can be recovered if the sub-generator matrix
                                has a full rank k.
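The full-rank condition can be checked with a small end-to-end example over GF(2^8). The sketch below assumes a Vandermonde generator matrix and the field polynomial 0x11D; both are illustrative choices, not taken from the slides. All function names are hypothetical.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8) (shift-and-add, reduced by poly)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def gf_pow(a, e):
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    return gf_pow(a, 254)  # a^255 = 1 for every nonzero a in GF(2^8)


def generator_matrix(n, k):
    """n x k Vandermonde matrix; any k of its rows are linearly independent."""
    return [[gf_pow(i, j) for j in range(k)] for i in range(n)]


def encode(messages, n):
    """messages: k equal-length lists of byte values. Returns n coded blocks."""
    length = len(messages[0])
    coded = []
    for row in generator_matrix(n, len(messages)):
        block = [0] * length
        for coef, msg in zip(row, messages):
            for pos in range(length):
                block[pos] ^= gf_mul(coef, msg[pos])
        coded.append(block)
    return coded


def gf_solve(matrix, rhs):
    """Gauss-Jordan elimination over GF(2^8); addition and subtraction are XOR."""
    k = len(matrix)
    a = [row[:] + r[:] for row, r in zip(matrix, rhs)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        inv = gf_inv(a[col][col])
        a[col] = [gf_mul(inv, v) for v in a[col]]
        for r in range(k):
            if r != col and a[r][col]:
                factor = a[r][col]
                a[r] = [v ^ gf_mul(factor, p) for v, p in zip(a[r], a[col])]
    return [row[k:] for row in a]


def decode(received, k, n):
    """received: dict {coded block index: block} with at least k entries."""
    chosen = sorted(received)[:k]
    g = generator_matrix(n, k)
    sub_generator = [g[i] for i in chosen]   # the k x k sub-generator matrix
    return gf_solve(sub_generator, [received[i] for i in chosen])
```

With k = 3 and n = 6, any three surviving blocks (for example blocks 1, 4 and 5) select an invertible 3×3 sub-generator matrix, and `decode` returns the original messages.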




                   Systematic vs Non-Systematic ERC
                   110

                     Original data:          1        2        3               k         k messages

                     Non systematic          1        2        3               k        k+1           n
                     ERC:

                    Systematic               1        2        3              k         k+1           n
                    ERC:

                             Systematic ERC
                                 Slightly lower encoding & decoding complexity
                                 Even if the data cannot be fully recovered, the original messages that did arrive can still be used








                         Reed-Solomon
                   111


                             Has been around for decades
                             Has systematic form
                             Cauchy Reed-Solomon Code








                         Reed-Solomon Decoding



                                                  Inverse



                                      Receive




                                                                      112








                      113    Dejitter Buffer




                        Variable Delay & Dejitter Buffer
                                  Queuing      Queuing    Queuing
                                   Delay        Delay      Delay




                                                                    Dejitter
                                                                    Buffer



                                            Queuing delay
                                            Dejitter buffers
                                            Variable packet sizes








                        Fixed Dejitter Buffer – Budget For Worst Case

                         [Figure: Site A to Site B over a 128 kbps link]
                           Coder delay: 40 ms
                           Queuing delay: 4-50 ms
                           Propagation delay: 8 ms
                           Dejitter buffer: 50 ms




                             Total End-to-End Delay
                               Codec delay: 40ms
                               Propagation delay: 8ms
                               Dejitter buffer: 50ms
                                  To accommodate queuing delay: 0-50 ms
                               Total delay: 98ms
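The budget above is simple addition: the dejitter buffer is sized for the worst-case queuing delay, so queuing delay itself does not appear as a separate term. A small sketch with hypothetical function names, which also shows what happens when the buffer is too small:

```python
def total_delay_ms(codec_ms, propagation_ms, dejitter_ms):
    """End-to-end delay with a fixed dejitter buffer.

    Queuing delay is absorbed by the buffer, so it is not added again.
    """
    return codec_ms + propagation_ms + dejitter_ms


def late_loss_fraction(queuing_delays_ms, dejitter_ms):
    """Fraction of packets whose queuing delay exceeds the buffer (late loss)."""
    late = sum(1 for d in queuing_delays_ms if d > dejitter_ms)
    return late / len(queuing_delays_ms)
```

`total_delay_ms(40, 8, 50)` reproduces the 98 ms on the slide. A smaller buffer lowers the total delay but turns every packet queued for longer than the buffer into a late loss.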




                        Dejitter Buffer Size & Late Loss

                         Fixed playout deadline and jitter absorption:
                           The playout rate is constant
                           The tradeoff is between dejitter buffer size and late loss
                         [Figure: playout jitter, with delay plotted against packet loss; buffering delay and late loss are marked]








                         Adaptive Playout and Dejitter Buffer Adaptation




                         Adaptive playout and jitter adaptation
                           Voice/video packets are scaled in a highly dynamic way
                           The playout schedule is set according to the past delays recorded
                             Usually the dejitter buffer expands quickly on late packet arrival,
                             and shrinks slowly when jitter decreases
                           Improved tradeoff between buffering delay and late loss
                           Playout rate is not constant
                         [Figure: playout jitter, with delay plotted against packet loss; buffering delay is marked]
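The expand-quickly, shrink-slowly behavior can be sketched as a small state machine. This is a minimal illustration, not the algorithm of any particular product; the class name, the safety margin and the shrink factor are all assumptions.

```python
class AdaptivePlayoutDelay:
    """Tracks a target dejitter delay from observed per-packet network delays."""

    def __init__(self, initial_ms=20.0, margin_ms=5.0, shrink=0.02):
        self.target_ms = initial_ms
        self.margin_ms = margin_ms   # safety margin above the observed delay
        self.shrink = shrink         # fraction of the excess removed per packet

    def update(self, network_delay_ms):
        """Feed one observed delay; return True if the packet arrived late."""
        late = network_delay_ms > self.target_ms
        wanted = network_delay_ms + self.margin_ms
        if wanted > self.target_ms:
            # Expand at once, so the next packets are not lost as well.
            self.target_ms = wanted
        else:
            # Shrink slowly, so a brief calm period does not cause late loss.
            self.target_ms -= self.shrink * (self.target_ms - wanted)
        return late
```

Because the target moves during a call, the audio has to be stretched or compressed to follow it, which is why the playout rate is no longer constant.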




                         Adaptive Play Out
                   118



                                                          Audio Adaptive
                                                             Playout




                             Packets are pushed into the Adaptive Playout module
                             The renderer requests a new waveform segment for playout
                             The Playout module passes packets to the audio decoder








                      119     Packet Loss Concealment




                        Audio Packet Loss Concealment
                                             L                                 ∆L

                                      i-2   i-1   i lost        i+1    i+2

                         alignment found by correlation                             time

                                            i-2       i-1             i+1     i+2
                                                                                    time
                                                           2L
                                                                      1.3 L


                         The processing depends on whether the segment is voiced or unvoiced
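The correlation-based alignment can be illustrated for a voiced (roughly periodic) segment. The sketch below is a simplification of the stretching shown on the slide: it finds the lag at which the most recent samples best match the earlier signal, then fills the lost frame by repeating the signal one period back. Function names, the template length and the lag range are assumptions.

```python
def best_period(history, template_len, min_lag, max_lag):
    """Lag with the highest normalized correlation to the latest samples."""
    template = history[-template_len:]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - template_len - lag
        if start < 0:
            break
        candidate = history[start:start + template_len]
        num = sum(a * b for a, b in zip(template, candidate))
        den = (sum(a * a for a in template) * sum(b * b for b in candidate)) ** 0.5
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def conceal_lost_frame(history, frame_len, template_len=16, min_lag=4, max_lag=40):
    """Synthesize a lost frame by repeating the last detected period."""
    lag = best_period(history, template_len, min_lag, max_lag)
    out = []
    for n in range(frame_len):
        src = len(history) + n - lag   # sample one period earlier
        out.append(history[src] if src < len(history) else out[src - len(history)])
    return out
```

Unvoiced segments have no clear period, so the correlation peak is weak and a different fill strategy is needed for them.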








                         Voiced segments




                         Unvoiced segments








                         Concealment as (bi-directional)
                         stretching




                         Video Packet Loss Concealment
                   124


                             Spatial Concealment
                               Use spatial correlation
                                 E.g., bilinear interpolation
                                 Projection onto convex sets
                             Temporal Concealment
                               Use the correlation that exists between consecutive frames
                                 Temporal replacement
                                 Boundary matching
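Two of the simplest options listed above can be sketched directly on frames stored as lists of pixel rows. The function names are hypothetical. The spatial variant interpolates only vertically, between the rows just above and below the lost block, and assumes the block does not touch the frame border.

```python
def temporal_replacement(prev_frame, cur_frame, top, left, size):
    """Copy the co-located block from the previous frame into the lost block."""
    for r in range(top, top + size):
        cur_frame[r][left:left + size] = prev_frame[r][left:left + size]


def spatial_interpolation(frame, top, left, size):
    """Fill a lost block by interpolating between its upper and lower neighbors."""
    above = frame[top - 1][left:left + size]
    below = frame[top + size][left:left + size]
    for i in range(size):
        weight = (i + 1) / (size + 1)   # 0 near the upper row, 1 near the lower
        for c in range(size):
            frame[top + i][left + c] = round((1 - weight) * above[c] + weight * below[c])
```

Temporal replacement works well for static content, while spatial interpolation is the fallback when there is no usable previous frame, for example after a scene change.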








                          Spatial-Temporal Concealment
                   125




                         126   Summary








                         Summary
                   127

                             VoIP/Video Conference Systems
                               Infrastructure based
                               P2P based
                             Audio/Video Components
                               Audio codec
                               Video codec
                               Acoustic echo cancellation
                             Network components
                               Primer of the Internet
                               Network characteristics
                               Available bandwidth estimation
                               Forward error correction (FEC)
                               Dejitter buffer
                               Packet loss concealment




Jin Li, Microsoft Research                                                  64

More Related Content

What's hot

Mei Yick Offer MPLS
Mei Yick Offer MPLSMei Yick Offer MPLS
Mei Yick Offer MPLSTony Ma
 
Tips for fulfilling patent application
Tips for fulfilling patent applicationTips for fulfilling patent application
Tips for fulfilling patent applicationfungfung Chen
 
32983 hpn ucc update mar 2013
32983 hpn ucc update mar 201332983 hpn ucc update mar 2013
32983 hpn ucc update mar 2013gmazuel
 
Securing Unified Communications Systems
Securing Unified Communications SystemsSecuring Unified Communications Systems
Securing Unified Communications SystemsVoxeo Corp
 
H.320 Videoconferencing over Frame Relay for The World Bank
H.320 Videoconferencing over Frame Relay for The World BankH.320 Videoconferencing over Frame Relay for The World Bank
H.320 Videoconferencing over Frame Relay for The World BankVideoguy
 
Jonathan Beale, Sales Manager UK&I at Vidyo - HD Visual Communications Revolu...
Jonathan Beale, Sales Manager UK&I at Vidyo - HD Visual Communications Revolu...Jonathan Beale, Sales Manager UK&I at Vidyo - HD Visual Communications Revolu...
Jonathan Beale, Sales Manager UK&I at Vidyo - HD Visual Communications Revolu...Global Business Events
 
Preparing for 4G Video Services
Preparing for 4G Video ServicesPreparing for 4G Video Services
Preparing for 4G Video ServicesYankee Group
 
A New Approach to Video Conferencing
A New Approach to Video ConferencingA New Approach to Video Conferencing
A New Approach to Video ConferencingVideoguy
 
Presentación Collaboration Video Cablevisión Day 2010
Presentación Collaboration Video Cablevisión Day 2010Presentación Collaboration Video Cablevisión Day 2010
Presentación Collaboration Video Cablevisión Day 2010Logicalis Latam
 
Microsoft Word - Video Conferencing White Paper _FINAL 6_
Microsoft Word - Video Conferencing White Paper _FINAL 6_Microsoft Word - Video Conferencing White Paper _FINAL 6_
Microsoft Word - Video Conferencing White Paper _FINAL 6_Videoguy
 

What's hot (14)

NGN voice corporate seminar
NGN voice corporate seminarNGN voice corporate seminar
NGN voice corporate seminar
 
Mei Yick Offer MPLS
Microsoft PowerPoint - ccnc10_voip

  • 1. CCNC 2010 Tutorial: Towards Glitch Free VoIP and Video Conferencing 1/12/2010
     TOWARDS GLITCH-FREE VOIP AND VIDEO CONFERENCING
     JIN LI
     MICROSOFT RESEARCH
     Outline
         Introduction
         Anatomy of VoIP and Video Conferencing Systems
         Audio/Video Components
         Network Components
         Summary
     Jin Li, Microsoft Research
  • 2. Introduction
     Booming of IP Based Communication
         Advanced voice over IP (VoIP)
         Web-, audio-, video-conferencing
         Tele-presence
         Instant messaging
         Calendar and other PIM functions
         Email, fax and voice mail
  • 3. Worldwide VoIP subscribers
         Worldwide VoIP service revenue was $24.1B in 2007, up 52% over 2006.
         Worldwide VoIP service revenue is expected to more than double over the next 4 years, to $61.3B in 2011, an annual growth rate of 26%.
         Source: 2008 Infonetics Research Inc
     US Broadband Telephony Forecast, 2007-2013
         The VoIP subscriber base is predicted to double from 2007 to 2013.
         Source: Jupiter Research, US Broadband Telephony Forecast, 2008 to 2013
  • 4. VoIP Trend
         IP networks are the next-generation networks for all forms of communication.
         Broadband penetration is a key driver of VoIP expansion.
             Worldwide DSL subscriptions stood at 205.9M at the end of 2007, up 23% over the prior year, and are predicted to reach 363.6M in 2011.
             Cable subscriptions were up 15% annually to 68M at the end of 2007, climbing to 97.3M in 2011.
             Passive Optical Network (PON) subscribers were at 10.9M in 2007.
             Ethernet FTTH subscribers were at 1.7M in 2007.
         2004/2005 were breakthrough years for VoIP adoption.
     High End Systems – Tele-Presence
         Cisco Telepresence: $299K
         Tandberg Experia: $225K
         HP Halo: $425K + $18K/mo
         Polycom RPX210M: $269K + $18.5K/mo
  • 5. Worldwide Tele-presence Forecast (2006-2012)
         [Charts: number of end points; revenue forecast]
         Source: 2008 IDC Research
     Desktop Video Conferencing
         Multiple solutions, often acting as an add-on to VoIP
         Benefits
             See the faces of people you may not have met before
             See facial expressions and gestures
             Easier to follow a conversation
             More interactive than phone
             Get the general mood and ambience
             See and show documents/objects
         Drawbacks
             Difficult to set up and plan
             Network reliability
                 Without (or with poor) video, people talk; without (or with poor) audio, people walk.
             Interpersonal factors
  • 6. Anatomy of VoIP and Video Conferencing Systems
     Infrastructure vs. P2P
         Infrastructure based: Microsoft Unified Communication, Cisco
         P2P based: Skype, Gtalk
  • 7. Infrastructure Based VoIP: Microsoft Unified Communication
     Unified Communication: Architecture
  • 8. Unified Communication: P2P Call
     Key Steps
         Alice calls Bob
         Find Bob’s registered SIP endpoints
  • 9. Unified Communication: To VoiceMail
     Key Steps
         Alice calls Bob
         Find Bob’s registered SIP endpoints
         Bob doesn’t answer within a certain period, so the call is re-routed
         The voicemail system plays a greeting, records Alice’s message, sends the message to Bob’s email, and uses the speech server to transcribe the message
  • 10. Unified Communication: PSTN to UC
     Key Steps
         PSTN user Alice calls Bob
         The IP-PSTN gateway terminates the call
         MS/Gateway routes the call to the mediation server, which performs transcoding, ICE, etc.
         Through the director, the proper UC client is found
  • 11. P2P VoIP: Skype
     P2P VoIP: Skype
         Information
             Debut: 08/2003, by N. Zennstrom and J. Friis, who founded KaZaA
             A P2P overlay network for VoIP and other applications
             Free intra-net VoIP and fee-based SkypeOut/SkypeIn
  • 12. Skype Usage (Apr. 2008)
         11 million concurrent Skype users online at peak time (180,000+ simultaneous calls)
         309 million registered users worldwide, the largest registered user base within the eBay portfolio (33 million users added in Q1FY08)
         $126M revenue in Q1FY08 (61% YOY growth; 5.6 billion SkypeOut minutes in FY2007)
         100 billion cumulative Skype-to-Skype minutes
     Skype Share of International VoIP Traffic
  • 13. Skype Gadget
         IPDRUM mobile Skype Cable
         Motorola CN620 WiFi Cellphone
         IPEVO Free-1 USB Skype Phone
         Netgear Skype Wi-Fi Phone
         USB Mouse with Phone
         50 hardware partners, 150+ Skype-certified devices
     Skype vs. VoIP
         Public VoIP standards: H.323, SIP
         Skype is a proprietary VoIP solution
             Relies on a P2P network for the user directory
                 Scalable without costly infrastructure
             Routes calls through supernodes in Skype
                 Universal firewall/NAT traversal
             Encrypted traffic (but you have to trust eBay/Skype)
  • 14. Skype Ingredient (1)
         User retrieves ID from a Skype server
     Skype Network
         Skype server: authentication
         Supernode overlay: any computer with sufficient CPU, memory, and network bandwidth that is not behind a firewall
             Provides the distributed directory service
             Relays traffic for computers behind a NAT/firewall
  • 15. NAT Traversal (Skype)
         NAT/firewall detection
             Try a UDP connection
             Try a TCP connection (arbitrary port, 80 (HTTP), 443 (HTTPS))
         Traversal
             Direct connection if a) neither client is behind a NAT, or b) one client has no NAT and the other is behind a cone NAT
             Relay by supernode otherwise
         Since Skype doesn’t need to pay for the relay cost
             High-bitrate wideband voice codec (>24 kbps)
     Skype: Call Routing Through Supernode
         Skype server: authentication
         Supernode overlay: route calls through supernodes
         High-bitrate wideband voice codec (>24 kbps)
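The traversal rule on this page can be read as a small decision function. The sketch below is only an illustration of the two "direct" cases listed on the slide; the `Nat` enum and `connection_mode` name are hypothetical, and Skype's real logic is proprietary and not described in the source.

```python
from enum import Enum


class Nat(Enum):
    """Hypothetical classification of a client's network position."""
    NONE = "no NAT"
    CONE = "cone NAT"
    OTHER = "other NAT or firewall"  # e.g. a symmetric NAT


def connection_mode(a: Nat, b: Nat) -> str:
    """Return 'direct' or 'relay' following the two cases on the slide."""
    # Case a): neither client is behind a NAT.
    if a is Nat.NONE and b is Nat.NONE:
        return "direct"
    # Case b): one client has no NAT and the other sits behind a cone NAT.
    if {a, b} == {Nat.NONE, Nat.CONE}:
        return "direct"
    # Everything else is relayed through a supernode.
    return "relay"
```

For example, `connection_mode(Nat.CONE, Nat.CONE)` falls through to the relay case, because the slide only lists a direct path when at least one side has no NAT.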
  • 16. Skype Encryption
         [Diagram: Peer 1, Peer 2]
         256-bit AES over 128-bit data blocks
         1536/2048-bit RSA for key negotiation (2048/2048 for paid service)
     Skype: Complete Black Box (Security by Obfuscation)
         Almost everything is obfuscated
         Many protections, anti-debugging tricks, ciphered code
             Avoids static disassembly: XORs the binary with a hard-coded key, erases the beginning of the code, uses its own packer
             Code integrity check: uses checksums to detect breakpoints
             Anti-debugging techniques: anti-SoftICE, integrity check
             Code obfuscation
         Network obfuscation
  • 17. Audio/Video Component
     Audio/Video Component
         Audio Codec
         Video Codec
         Acoustic Echo Cancellation
  • 18. Audio Codec
     G.711 (PCM)
         Still widely used today: PSTN interface
         With uniform quantization: 12 bits * 8 k samples/sec = 96 kbps
         Non-uniform quantization
             64 kbps (DS0 rate)
             North America: µ-law
             Other countries: A-law
             MOS of about 4.3
         µ = 255, A = 87.6
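The bit-rate arithmetic and the µ-law curve on this page can be checked with a few lines of code. This is a minimal sketch that assumes the continuous µ-law formula with µ = 255; actual G.711 codecs use a piecewise-linear 8-bit approximation of this curve, which is not reproduced here. The function names are illustrative.

```python
import math

MU = 255.0  # mu-law parameter given on the slide


def bitrate_kbps(bits_per_sample: int, sample_rate_hz: int = 8000) -> float:
    """Bit rate of plain PCM at the given sample width."""
    return bits_per_sample * sample_rate_hz / 1000.0


def mu_law_compress(x: float, mu: float = MU) -> float:
    """Continuous mu-law curve for x in [-1, 1]; small inputs get finer resolution."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: float = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)
```

With these definitions, `bitrate_kbps(12)` gives the 96 kbps uniform-quantization figure and `bitrate_kbps(8)` gives the 64 kbps DS0 rate obtained after companding to 8 bits per sample.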
  • 19. G.722.1: Siren
         Audio bandwidth: 14 kHz
         Sample rate: 32 kHz
         Bit rate: 24, 32, and 48 kbit/s
         Algorithm: transform coding (Siren14™)
         Frame size: 20 ms
         Algorithmic delay: 40 ms
         Complexity: <11 WMOPS (enc/dec)
         Available on royalty-free licensing terms (from Polycom)
     Siren Encoder
  • 20. Siren Decoder
     Siren Codec
         Audio sampled at 32 kHz
         Operates on frames of 20 ms, corresponding to 640 samples
         Based on transform coding, using a Modulated Lapped Transform (MLT)
         A look-ahead of 20 ms due to 50% overlap between frames
         Total algorithmic delay of 40 ms
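The framing numbers on this page follow from simple arithmetic, sketched below. The helper names are illustrative, and the delay model (one frame plus the look-ahead implied by windows that span two frames) is a simplified reading of the slide rather than a statement about any particular implementation.

```python
SAMPLE_RATE_HZ = 32_000
FRAME_MS = 20


def samples_per_frame(sample_rate_hz: int = SAMPLE_RATE_HZ, frame_ms: int = FRAME_MS) -> int:
    """Number of new samples consumed per frame."""
    return sample_rate_hz * frame_ms // 1000


def bits_per_frame(bitrate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Bit budget for one frame at a given bit rate."""
    return bitrate_bps * frame_ms // 1000


def algorithmic_delay_ms(frame_ms: int = FRAME_MS) -> int:
    """Frame duration plus look-ahead for a 50%-overlapped lapped transform."""
    window_ms = 2 * frame_ms              # each transform window spans two frames
    look_ahead_ms = window_ms - frame_ms  # samples needed beyond the current frame
    return frame_ms + look_ahead_ms
```

At 24, 32, and 48 kbit/s the per-frame budget works out to 480, 640, and 960 bits respectively.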
  • 21. MLT - Modulated Lapped Transforms
         [Figures: spatial response; frequency domain]
     Categorization & SQVH
         Quantization used by SQVH
         Expected # of bits for each category
         Vector property used in SQVH
  • 22. AMR-WB Basics
         “Wideband coding of speech at around 16 kbit/s using adaptive multi-rate wideband (AMR-WB)”
         Adopted as ITU-T G.722.2, and also as 3GPP spec TS 26.190.
         “Foreseen applications are: VoIP and internet applications, Mobile Com., PSTN app, ISDN wideband telephony, ISDN videophone and videoconf.”
         Sampling rate: 16 kHz; bitrate: 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, and 23.85 kbit/s.
         20 ms frame.
         ACELP (algebraic code excited LPC).
     Pre-processing
         Sampling rate conversion: 16 to 12.8 kHz (a 20 ms frame now has 256 samples)
         High-pass filter (cut-off at 50 Hz)
         Pre-emphasis filter: $1 - 0.68 z^{-1}$
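The frame sizes and the pre-emphasis filter above can be illustrated with a short sketch. The function names are hypothetical, and the first-sample handling (treating the sample before the buffer as zero) is an assumption; a real codec carries filter state across frames.

```python
def samples_per_frame(sample_rate_hz: float, frame_ms: float = 20.0) -> int:
    """Samples in one frame at the given sampling rate."""
    return round(sample_rate_hz * frame_ms / 1000.0)


def bits_per_frame(bitrate_kbps: float, frame_ms: float = 20.0) -> int:
    """Bits available for one frame in a given AMR-WB mode."""
    return round(bitrate_kbps * frame_ms)


def pre_emphasis(x, alpha: float = 0.68):
    """Apply y[n] = x[n] - alpha * x[n-1], assuming x[-1] = 0."""
    y = []
    prev = 0.0
    for sample in x:
        y.append(sample - alpha * prev)
        prev = sample
    return y
```

Down-sampling from 16 kHz to 12.8 kHz shrinks the 20 ms frame from 320 to 256 samples, and the lowest and highest modes (6.60 and 23.85 kbit/s) carry 132 and 477 bits per frame.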
  • 23. LP analysis and Quant.
         One 30 ms asymmetric window
         5 ms look-ahead
         Obtain LPC coefficients:
             Compute the correlation;
             Multiply by a window (adds 60 Hz bandwidth expansion);
             $R(0) = 1.0001\,R(0)$ (adds a -40 dB noise floor);
             Levinson-Durbin to compute the LP coefficients.
         LP to ISP
         Quantize in the ISP q-domain.
     LP analysis and Quant. (2)
         Quantization bottom line:
             46 bits/frame in most modes;
             36 bits/frame in the 6.60 kbps mode;
             M.A. prediction with 1/3 gain;
             Quantizer: S-MSVQ (split multistage VQ)
         Both quantized and unquantized coefficients are used in the algorithm.
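The LPC step on this page (autocorrelation, a small boost of R(0), then Levinson-Durbin) can be sketched in plain Python. This is a textbook version of the recursion, not the AMR-WB reference code: the asymmetric analysis window and the 60 Hz bandwidth-expansion window are omitted, and the function names are illustrative.

```python
def autocorrelation(x, order):
    """r[k] = sum over n of x[n] * x[n-k], for k = 0..order."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(order + 1)]


def white_noise_correction(r, factor=1.0001):
    """Scale R(0) slightly, as on the slide, to keep the recursion well conditioned."""
    return [r[0] * factor] + list(r[1:])


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a1 z^-1 + ... + ap z^-p from autocorrelations r.

    Returns (a, err) where a[0] == 1 and err is the final prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]  # update lower-order coefficients
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err
```

For a first-order autoregressive correlation such as `r = [1.0, 0.5, 0.25]`, the recursion returns `a = [1, -0.5, 0]`, which is the expected one-tap predictor.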
  • 24. Subframes
         Each 20 ms frame (256 samples) is divided into 4 subframes (64 samples each).
         Interpolated LPC coefficients are obtained for each subframe
         Interpolation is done in the ISP q-domain
     Perceptual weighting
         The weighting filter is: $W(z) = A(z/\gamma_1)\, H_{\text{de-emph}}(z)$
         This helps solve the tilt problem, which is worse in WB speech.
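Two mechanical pieces of this page are easy to show in code: splitting a frame into subframes, and forming the coefficients of $A(z/\gamma_1)$, which simply scales the k-th LPC coefficient by $\gamma_1^k$. The sketch below is illustrative; the slide does not give a value for $\gamma_1$, so it is left as a parameter, and the de-emphasis factor of the weighting filter is not modeled.

```python
def split_subframes(frame, n_sub: int = 4):
    """Split a frame into n_sub equal, consecutive subframes."""
    size = len(frame) // n_sub
    return [frame[i * size:(i + 1) * size] for i in range(n_sub)]


def bandwidth_expand(a, gamma: float):
    """Coefficients of A(z/gamma): the k-th coefficient is scaled by gamma**k."""
    return [coef * gamma ** k for k, coef in enumerate(a)]
```

With a 256-sample frame, `split_subframes` yields four 64-sample subframes, each covering 5 ms at 12.8 kHz.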
  • 25. Excitation
         Searched for each 5 ms subframe.
         Two components:
             Adaptive codebook (past excitation)
             Algebraic codebook
         The “target” signal is obtained by filtering the LPC residual (for the subframe) through the synthesis LPC filter and the weighting filter.
     Adaptive codebook
         Start with “open loop” pitch estimation
             based on cross-correlation;
             low-value bias;
             “last value” bias (actually a 5-frame median), if voiced.
         Re-compute with “closed loop” search, around the initial value ±7, with up to ¼-sample precision.
             “Analysis by synthesis” based;
             Restricted to values allowed by the encoding step.
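The open-loop stage of the pitch search can be sketched as a normalized cross-correlation maximized over candidate lags. This is a simplified illustration, not the AMR-WB procedure: the weighting toward low lags, the 5-frame median bias, and the fractional closed-loop refinement are all omitted, and the function name and lag range are assumptions.

```python
import math


def open_loop_pitch(x, min_lag: int, max_lag: int) -> int:
    """Return the integer lag whose delayed copy best matches the signal."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlation between the signal and its copy delayed by `lag` samples.
        num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        # Normalize by the energy of the delayed segment (guard against zero).
        den = math.sqrt(sum(x[n - lag] ** 2 for n in range(lag, len(x)))) or 1.0
        score = num / den
        if score > best_score:  # strict comparison: ties keep the smaller lag
            best_lag, best_score = lag, score
    return best_lag
```

On a pure tone with a 40-sample period, the search over lags 20 to 70 settles on 40.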
  • 26. Algebraic codebook
         Remove the contribution of the (unquantized) prediction from the adaptive codebook from the “target signal” to obtain the new target.
         Divide the subframe into 4 alternating tracks.
     Algebraic codebook (2)
         Select the best pulses, for a total of 24 (6), 18 (5-4), 16 (4), 12 (3), 10 (3-2), 8 (2), 4 (1), or 2 (.5), depending on bitrate.
         Pulses + two filters:
             Periodicity enhancement: $1/(1 - 0.85\,z^{-T})$;
             Tilt: $1/(1 - \beta_1 z^{-1})$
         Tricks to save bits in encoding pulse positions;
         Tricks to save computation in the pulse search.
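The track structure of the algebraic codebook can be illustrated as follows. The sketch assumes the 64-sample subframe is split into 4 interleaved tracks (positions t, t+4, t+8, ...) and that a codevector is a sparse sum of signed unit pulses; the pulse search and the bit-saving position encoding mentioned on the slide are not shown, and the function names are hypothetical.

```python
def track_positions(subframe_len: int = 64, n_tracks: int = 4):
    """Positions available to each interleaved track."""
    return [list(range(t, subframe_len, n_tracks)) for t in range(n_tracks)]


def build_codevector(pulses, subframe_len: int = 64):
    """Build a sparse codevector from (position, sign) pairs, sign in {+1, -1}."""
    c = [0] * subframe_len
    for pos, sign in pulses:
        c[pos] += sign
    return c
```

Each track offers 16 candidate positions, so a pulse position within a track fits in 4 bits before any joint-coding tricks are applied.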
  • 27. Wrap up
         High-pass, de-emphasis;
         Upsample back to 16 kHz;
         Add high-frequency components.
     High Freq. Components
         Random noise is used as excitation
         The LP filter is extended to 8 kHz.
         Energy of the excitation is based on the energy of the base-band residual and voicing info, except in the highest bitrate mode.
         Extension of the LPC filter is equivalent to mapping 5.1 to 5.6 kHz onto 6.4 to 7.0 kHz;
         Band-pass filtered to 6-7 kHz, and added to the output signal.
  • 28. Video Codec
     H.264/AVC Encoder
  • 29. H.264/AVC Decoder
     Reference Picture Management
         Reference pictures are stored in the decoded picture buffer (DPB)
         Short/long-term reference pictures; a decoded frame may be marked as
             unused for reference
             short-term picture
             long-term picture
         “Sliding window” memory management
             Keep #(long_term_pic + short_term_pic)
             Remove a short-term picture if space runs out
         Adaptive memory control
             issued by the encoder
             changes the type of a reference frame
         IDR (Instantaneous Decoder Refresh)
             clears the reference buffer
             I frame
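The marking rules on this page can be modeled with a toy buffer. The class below is a hypothetical sketch of the behavior described on the slide (sliding-window eviction of the oldest short-term picture, explicit re-marking, and IDR flush); it ignores frame numbering, fields, and the actual memory management control syntax of H.264.

```python
from collections import deque


class DecodedPictureBuffer:
    """Toy model of reference picture marking (names are illustrative)."""

    def __init__(self, max_refs: int):
        self.max_refs = max_refs
        self.short_term = deque()  # oldest picture on the left
        self.long_term = set()

    def add_reference(self, pic_id):
        """Sliding window: evict the oldest short-term picture when full."""
        while (len(self.short_term) + len(self.long_term) >= self.max_refs
               and self.short_term):
            self.short_term.popleft()
        self.short_term.append(pic_id)

    def mark_long_term(self, pic_id):
        """Adaptive control: change a short-term picture to long-term."""
        self.short_term.remove(pic_id)
        self.long_term.add(pic_id)

    def mark_unused(self, pic_id):
        """Adaptive control: drop a picture from the reference set."""
        if pic_id in self.long_term:
            self.long_term.discard(pic_id)
        elif pic_id in self.short_term:
            self.short_term.remove(pic_id)

    def idr(self, pic_id):
        """IDR picture: clear all references, then store the IDR itself."""
        self.short_term.clear()
        self.long_term.clear()
        self.short_term.append(pic_id)
```

Note that long-term pictures are never evicted by the sliding window in this model; only an explicit `mark_unused` or an IDR removes them.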
  • 30. Slice Group
         Formerly called “FMO” (Flexible Macroblock Ordering)
         A slice group is a subset of the macroblocks and may contain one or more slices
         Error resilience
     Inter Prediction
         Variable block size
         ¼-pixel motion compensation
         Interpolation
  • 31. Motion Vector (MV) Prediction
         Efficiently encodes correlated MVs
         For partitions other than 16×8 and 8×16: $MV_p = \operatorname{median}(MV_A, MV_B, MV_C)$
         16×8: $MV_p$ of the upper partition $= MV_B$; $MV_p$ of the lower partition $= MV_A$
         8×16: $MV_p$ of the left partition $= MV_A$; $MV_p$ of the right partition $= MV_C$
         For skipped macroblocks, predict as in 16×16 Inter mode
     Intra Prediction
         For luma samples
             4×4 block: 9 prediction modes
             16×16 block: 4 modes
             I_PCM: transmit the samples directly, without prediction or transform
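The motion vector predictor rules can be sketched as a small function. This sketch uses the component-wise median that the H.264 standard specifies for the general case, and the directional rules for 16×8 and 8×16 partitions listed on the slide. Neighbor availability handling is left out, and the function and parameter names are illustrative.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def predict_mv(mv_a, mv_b, mv_c, partition: str = "other", index: int = 0):
    """Predict a motion vector from neighbors A (left), B (above), C (above-right).

    mv_* are (x, y) tuples; partition is '16x8', '8x16', or 'other';
    index 0 is the upper/left partition, index 1 the lower/right one.
    """
    if partition == "16x8":
        return mv_b if index == 0 else mv_a  # upper from B, lower from A
    if partition == "8x16":
        return mv_a if index == 0 else mv_c  # left from A, right from C
    # General case: component-wise median of the three neighbors.
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))
```

Only the difference between the actual vector and this predictor needs to be coded, which is what makes correlated motion fields cheap to transmit.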
  • 32. Prediction Modes: 4×4 luma; Intra 16×16; 8×8 chroma is similar to 16×16 luma intra (figures). Signaling of Intra Prediction Modes: Mode choices need to be signaled to the decoder, but compactly. The prediction mode for luma coded in Intra-16×16 mode, or for chroma coded in Intra mode, is signaled in the macroblock header. Intra modes for neighboring 4×4 blocks are often correlated. With A and B the neighboring blocks of the current block C, the predicted mode of C is: min(A, B) if A and B are both available; 2 (DC) if neither A nor B is available; otherwise the mode of whichever of A and B is available. The prev_intra4x4_pred_mode flag and rem_intra4x4_pred_mode are used to indicate the mode selected.
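The signaling scheme can be sketched as below. `predicted_mode` follows the rule as it is stated on this page (the standard's exact handling of unavailable neighbours may differ), and the function names are hypothetical. The point of the scheme is that the most probable mode costs a single flag, while the other eight modes fit in a 3-bit remainder.

```python
DC_MODE = 2  # mode number of DC prediction for 4x4 luma blocks


def predicted_mode(mode_a, mode_b):
    """Most probable mode from neighbours A and B (None = unavailable)."""
    if mode_a is not None and mode_b is not None:
        return min(mode_a, mode_b)
    if mode_a is None and mode_b is None:
        return DC_MODE
    return mode_a if mode_a is not None else mode_b


def encode_mode(mode, pred):
    """Return (prev_intra4x4_pred_mode, rem_intra4x4_pred_mode or None)."""
    if mode == pred:
        return 1, None                      # one flag is enough
    # Skip over the predicted mode so the remainder fits in 0..7.
    return 0, mode if mode < pred else mode - 1


def decode_mode(prev_flag, rem, pred):
    """Inverse of encode_mode."""
    if prev_flag:
        return pred
    return rem if rem < pred else rem + 1


pred = predicted_mode(3, 5)          # both neighbours available
flag, rem = encode_mode(7, pred)     # mode 7 differs from the prediction
```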
  • 33. Deblocking Filter: Filter 4 vertical/horizontal boundaries of luma and 2 vertical/horizontal boundaries of chroma. Affects up to 3 samples on either side of a boundary. The filter is stronger where significant blocking distortion is likely, such as the boundary of an intra-coded macroblock or a boundary between blocks that contain coded coefficients. Transform and Quantisation: 3 transforms. A DCT-based transform for all 4×4 residual blocks, with a = 1/2, b = (2/5)^(1/2), d = 1/2. A Hadamard transform for the 4×4 luma DC coefficients (in 16×16 intra mode). A Hadamard transform for the 2×2 chroma DC coefficients.
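As an illustration of the DCT-based 4×4 transform, the sketch below applies the integer core matrix commonly given for H.264, W = Cf · X · Cfᵀ. It assumes that the scaling by a and b is left out of the core and absorbed into quantization; the helper names are hypothetical.

```python
# Integer core matrix of the H.264 4x4 forward transform.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def core_transform(block):
    """Unscaled forward transform W = CF * X * CF^T, integers only."""
    return matmul(matmul(CF, block), transpose(CF))


flat = [[1] * 4 for _ in range(4)]   # a flat residual block
w = core_transform(flat)             # only the DC coefficient is nonzero
```

Everything stays in integer arithmetic, so encoder and decoder cannot drift apart through rounding differences.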
  • 34. Combine Quantization into Scaling of Transform: 4×4 DC intra luma: |ZD(i, j)| = (|YD(i, j)| · MF(0,0) + 2f) >> (qbits + 1); sign(ZD(i, j)) = sign(YD(i, j)). CAVLC: Context-Based Adaptive Variable Length Coding. Characteristics: run-level coding to compact strings of zeros; trailing ones (+1, −1 after 0); the number of nonzero coefficients in neighboring blocks is correlated; the choice of VLC lookup table for the level parameter depends on the level magnitude.
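A sketch of the DC quantization formula on this page. The slide does not define qbits, f, or the MF table; the definitions used here (qbits = 15 + floor(QP/6), f = 2^qbits/3 for intra blocks and 2^qbits/6 otherwise, and the MF(0,0) values indexed by QP mod 6) are the commonly published ones for H.264 and are assumptions as far as this deck is concerned.

```python
# MF(0,0) for QP % 6 = 0..5 (assumed from the usual H.264 description).
MF00 = [13107, 11916, 10082, 9362, 8192, 7282]


def quantize_dc(y, qp, intra=True):
    """|Z| = (|Y| * MF(0,0) + 2f) >> (qbits + 1), with sign(Z) = sign(Y)."""
    qbits = 15 + qp // 6
    f = (1 << qbits) // 3 if intra else (1 << qbits) // 6
    z = (abs(y) * MF00[qp % 6] + 2 * f) >> (qbits + 1)
    return -z if y < 0 else z


z4 = quantize_dc(64, qp=4)     # MF = 8192, qbits = 15
z10 = quantize_dc(64, qp=10)   # same MF, one more shift: step size doubles
```

The multiply-and-shift form avoids any division, and raising QP by 6 adds one to the shift, which halves the quantized value.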
  • 35. CAVLC Encoding: 1. Encode the number of coefficients and trailing ones (coeff_token). TotalCoeffs: 0 to 16. TrailingOnes: 0 to 3; if there are more than 3 trailing ones, only the last three are treated as "special cases". Four lookup tables: three variable-length, one fixed-length; the choice depends on neighboring blocks. 2. Encode the sign of each TrailingOne, in reverse order. 3. Encode the levels of the remaining nonzero coefficients (level_prefix, level_suffix). 4. Encode the total number of zeros before the last coefficient. 5. Encode each run of zeros; zero runs at the start of the array need not be encoded. If there are fewer than 3 TrailingOnes, the first nonzero coefficient is adjusted. Acoustic Echo Cancellation.
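The five steps above operate on a handful of symbols derived from the zigzag-ordered coefficients. The sketch below extracts those symbols only; it does not perform the table lookups or write bits, and the function name and dictionary keys are hypothetical.

```python
def cavlc_symbols(coeffs):
    """Derive CAVLC symbols from a zigzag-ordered coefficient list."""
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    reverse = list(reversed(nonzero))   # CAVLC scans from high frequency down

    # Steps 1-2: trailing +/-1 values at the high-frequency end, at most 3.
    trailing_signs = []
    for i in reverse:
        if abs(coeffs[i]) == 1 and len(trailing_signs) < 3:
            trailing_signs.append("+" if coeffs[i] > 0 else "-")
        else:
            break

    # Step 3: remaining levels, still in reverse order.
    levels = [coeffs[i] for i in reverse][len(trailing_signs):]

    # Step 4: zeros located before the last nonzero coefficient.
    total_zeros = nonzero[-1] + 1 - len(nonzero) if nonzero else 0

    # Step 5: zeros between consecutive nonzero coefficients; the run in
    # front of the lowest-frequency coefficient is implied by total_zeros.
    runs = [hi - lo - 1 for hi, lo in zip(reverse, reverse[1:])]

    return {"total_coeffs": len(nonzero),
            "trailing_ones": len(trailing_signs),
            "trailing_signs": trailing_signs,
            "levels": levels,
            "total_zeros": total_zeros,
            "runs": runs}


symbols = cavlc_symbols([0, 3, 0, 1, -1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0])
```

In this example the fourth ±1 (the +1 at index 3) is coded as an ordinary level, because only three trailing ones get the special treatment.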
  • 36. Acoustic Echo Cancellation (figure; labels: From Audio Decoder, To Audio Encoder, Acoustic Echo Cancellation). Acoustic Echo Cancellation Module (figure).
  • 37. Adaptive Transversal Filter: An FIR filter, which is inherently stable. The length of the filter affects other performance measures: convergence, goodness, and complexity. The filter introduces errors since it is trying to model an IIR response. Short filters, 128-256 coefficients (taps): faster convergence, but the final solution has more residual error; less complex, O(N). Long filters, 512-1024 taps: slower convergence, but the final solution has less error; more complex, as the algorithm can be O(N^2). Challenges: The dynamic range of the human ear is 120 dB, so even quiet echoes can be heard. Longer delays from satellite links (300-500 ms) and VoIP: the ear is more sensitive to longer delays; it is more difficult to find the beginning of the echo; long filters (~1000 taps) are needed (complexity and convergence). Near-end noise corrupts the echo, decreasing the canceller's ability to converge. Acoustic echo paths can change rapidly, making it more difficult for the AEC to remain converged. Nonlinear echo components: speakers driven beyond their linear region.
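To make the adaptive FIR filter concrete, here is a minimal echo canceller using the normalized LMS (NLMS) update. The deck does not say which adaptation algorithm it has in mind, so NLMS is an assumption chosen for its O(N) cost per sample. The tiny 8-tap filter, the step size `mu`, and the synthetic 4-tap room response are illustrative values only; real acoustic paths need the hundreds of taps mentioned above.

```python
import random


def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Adapt an FIR echo-path model and return (weights, residual signal)."""
    w = [0.0] * taps
    history = [0.0] * taps             # most recent far-end sample first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, history))
        e = d - echo_estimate          # what gets sent to the audio encoder
        norm = sum(xi * xi for xi in history) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, history)]
        residual.append(e)
    return w, residual


random.seed(0)
far = [random.uniform(-1.0, 1.0) for _ in range(3000)]   # loudspeaker signal
room = [0.6, -0.3, 0.1, 0.05]                            # hypothetical echo path
mic = [sum(room[k] * far[n - k] for k in range(len(room)) if n - k >= 0)
       for n in range(len(far))]                         # echo only, no near-end talk
w, residual = nlms_echo_canceller(far, mic)
```

With no near-end speech or noise the residual decays towards zero; adding noise to `mic` shows the slower, noisier convergence listed under the challenges.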
  • 38. Network Component. IP-based VoIP / Video Conference (figure).
  • 39. Internet Primer. Internet: Grand View (figure).
  • 40. Impact on ISPs: Economics of ISP relationships (figure). Sibling relationship: several ISPs belong to the same organization. Peering relationship: a mutually beneficial, free agreement (to a certain extent). Transit relationship: one ISP pays another. Inside ISP (figure).
  • 41. ISP POP (Point of Presence) (figure). Home Networking (figure).
  • 42. Network Characteristics. Under-provisioned Links (figure; labels: Branch, Branch).
  • 43. Growth Trends (figure). Packet Loss vs. Jitter (vs. Delay?) (figure).
  • 44. The Usual Suspects (figure). Packet Bursts (figure).
  • 45. What Kind of Enterprise User? (figure). How QoS Can Help (figure).
  • 46. QoS helps inside and between branches! (figure). Observation: IP-based communication in the enterprise is growing. Empirical results show poor calls for wireless and VPN users. QoS (DiffServ) is both used and useful!
  • 47. Available Bandwidth Estimation. What is Available Bandwidth (ABW)? ABW is the left-over capacity along an Internet path.
  • 48. Why Is It Useful? Maximizing QoE (Quality of Experience) in A/V conferencing: audio prefers minimum delay (high priority); video prefers maximum rate (low priority). One Way Delay (OWD) = propagation delay (constant) + queuing delay (variable). One solution: measure the ABW, then encode and send video at the ABW rate. Typical Targeting Scenario: the first hop is the bottleneck (cable modem, DSL, high-speed link…). Timescale for the ABW estimation: 2-4 seconds.
  • 49. Why Is Measuring ABW Hard? Available bandwidth changes over time, so ABW measurements must be quick. Audio packets (along the same path) should experience minimum delay, so the measurement must be non-intrusive. Two Models: Probe Rate Model (PRM) based solutions: Pathload, TOPP, Pathchirp, Bfind, PTR …; Probe Gap Model (PGM) based solutions: Spruce, Delphi, IGI, Moseab …
  • 50. Pathload (PRM) [Jain & Dovrolis]: Send probe trains at various rates. ABW is the probe rate at the transition where OWD starts increasing (queuing delay is observed). Spruce (PGM) [Strauss et al.]: Send probe pairs/trains at rate Ri (Ri > A), measure the sending gaps and the receiving gaps, and compute A directly.
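The "compute A directly" step of a probe gap model can be sketched with the gap formula published for Spruce, A = C · (1 − (Δout − Δin) / Δin), averaged over several probe pairs. It assumes the bottleneck capacity C is already known, which is one of the assumptions PGM methods rely on; the function name and the sample numbers are hypothetical.

```python
def spruce_estimate(capacity_bps, gap_in, gaps_out):
    """Available bandwidth from probe-pair gaps (all gaps in seconds).

    gap_in:   spacing of each probe pair at the sender.
    gaps_out: spacing of each pair measured at the receiver.
    """
    samples = [capacity_bps * (1.0 - (gap_out - gap_in) / gap_in)
               for gap_out in gaps_out]
    return sum(samples) / len(samples)


# 1 Mbps bottleneck; a 1500-byte probe takes 12 ms to serialize on it.
abw = spruce_estimate(1_000_000, 0.012, [0.015, 0.018])
```

Cross traffic that squeezes in between the two probes widens the receiving gap, so a larger gap means less available bandwidth.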
  • 51. Advantages/Disadvantages of the Approaches. PGM-based approaches: advantage: fast estimation, which can be done in a single probe; disadvantage: the assumptions are not easy to verify in practice. PRM-based approaches: advantage: no assumption; disadvantage: slow estimation, requiring iterative probes. Forward Error Correction.
  • 52. Block-Based Erasure Resilient Coding: Original data: blocks 1, 2, 3, …, k (k messages). ERC: blocks 1, 2, 3, …, k, k+1, …, n. At a certain instance, some of the blocks may be lost in delivery (marked X in the figure). However, as long as at least k blocks are delivered, the original data can be reconstructed. ERC in VoIP and Video Conferencing: VoIP: mainly packet replication, due to the small VoIP packet size and the low delay requirement. Video conferencing: packet loss protection (for I frames, or P frames in HD). Each frame is separated into k messages and protected by n−k messages. As long as no more than n−k messages are lost, the transmission succeeds.
  • 53. ERC Terms: Number of original blocks: k. Number of coded blocks: n. Rate of ERC: k/n. MDS (Maximum Distance Separable): any k of the n coded blocks can recover the original; this is the theoretically optimal performance. Erasure Encoding: Mathematics. Original data: x1, x2, …, xk. Coded data: y1, y2, …, yn. The xi and yi are vectors over a Galois field.
  • 54. Example: ERC of 10 MB. Original data (10 MB): x1, x2, …, xk, with k = 10 over GF(2^8); each vector is 1 MB. Coded data: y1, y2, …, yn (n = 30) (figure; dimensions marked: 30, 10, 1M, 1M). Erasure Decoding: Mathematics. Original data: x1, x2, …, xk. Coded data: y1, y2, …, yn. Select the available coded blocks (figure).
  • 55. Erasure Decoding: Mathematics. Original data: x1, x2, …, xk. Coded data: y1, y2, …, yn. The original data can be recovered if the sub-generator matrix has full rank k. Systematic vs. Non-Systematic ERC: Original data: 1, 2, 3, …, k (k messages). Non-systematic ERC: 1, 2, 3, …, k, k+1, …, n. Systematic ERC: 1, 2, 3, …, k, k+1, …, n. Systematic ERC: slightly lower encoding and decoding complexity; even when the data cannot be fully recovered, the original messages that did arrive can still be used.
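The "any k of n" property and the full-rank condition can be tried out with a small non-systematic code. For simplicity this sketch works over the prime field GF(257) rather than GF(2^8), and uses a Vandermonde generator matrix; both are assumptions made for illustration, and all names are hypothetical. Decoding selects the k rows of the generator matrix that correspond to the received blocks and solves the resulting linear system.

```python
P = 257  # prime modulus, standing in for the GF(2^8) arithmetic of the slides


def encode(data, n):
    """Coded symbol i (1..n) is sum_j data[j] * i**j mod P (Vandermonde rows)."""
    return [sum(x * pow(i, j, P) for j, x in enumerate(data)) % P
            for i in range(1, n + 1)]


def decode(received, k):
    """Recover k data symbols from at least k (index, value) pairs."""
    # Sub-generator matrix of the received blocks, augmented with the values.
    rows = [[pow(i, j, P) for j in range(k)] + [y] for i, y in received[:k]]
    for col in range(k):                      # Gauss-Jordan elimination mod P
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)   # inverse via Fermat's little theorem
        rows[col] = [v * inv % P for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % P
                           for a, b in zip(rows[r], rows[col])]
    return [rows[r][k] for r in range(k)]


data = [10, 200, 33, 7]                  # k = 4 original symbols
coded = encode(data, n=8)                # rate k/n = 1/2
survivors = [(i, coded[i - 1]) for i in (2, 5, 7, 8)]   # four blocks were lost
recovered = decode(survivors, k=4)
```

Any k distinct rows of a Vandermonde matrix are linearly independent, which is why every choice of k survivors works. A coded symbol here can take the value 256, which is one reason practical codes use GF(2^8) instead.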
  • 56. Reed-Solomon: Has been around for decades. Has a systematic form. See: Cauchy Reed-Solomon Code Tutorial, Jin Li. Reed-Solomon Decoding (figure; labels: Inverse, Receive).
  • 57. Dejitter Buffer. Variable Delay & Dejitter Buffer (figure: queuing delay at each hop, followed by the dejitter buffer). Queuing delay; dejitter buffers; variable packet sizes.
  • 58. Fixed Dejitter Buffer: Budget for the Worst Case (figure: Site A to Site B over a 128 kbps link; coder delay 40 ms, queuing delay 4-50 ms, dejitter buffer 50 ms, propagation delay 8 ms). Total end-to-end delay: codec delay 40 ms; propagation delay 8 ms; dejitter buffer 50 ms, to accommodate a queuing delay of 0-50 ms; total delay 98 ms. Dejitter Buffer Size & Late Loss: fixed playout deadline and jitter (figure: late loss, buffering delay). Jitter absorption: the playout rate is constant. The tradeoff is between dejitter buffer size (delay) and late loss (packet loss).
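The delay budget and the buffer-size versus late-loss tradeoff on this page amount to a few lines of arithmetic. The default values are the ones from the slide; the function names and the list of sample queuing delays are made up for illustration.

```python
def total_delay_ms(codec=40, propagation=8, dejitter=50):
    """End-to-end delay with a fixed dejitter buffer sized for the worst case."""
    return codec + propagation + dejitter


def late_loss_fraction(queuing_delays_ms, dejitter_ms):
    """Share of packets arriving after their fixed playout deadline."""
    late = sum(1 for d in queuing_delays_ms if d > dejitter_ms)
    return late / len(queuing_delays_ms)


budget = total_delay_ms()                                  # 40 + 8 + 50
loss_50 = late_loss_fraction([5, 20, 49, 60, 75], 50)      # two packets too late
loss_80 = late_loss_fraction([5, 20, 49, 60, 75], 80)      # bigger buffer, no loss
```

Growing the buffer from 50 ms to 80 ms removes the late loss in this sample, at the price of 30 ms more delay on every packet.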
  • 59. Adaptive Playout and Dejitter Buffer Adaptation (figure: buffering delay under adaptive playout and jitter adaptation). Scaling of voice/video packets in a highly dynamic way. The playout schedule is set according to the past delays recorded. Usually the dejitter buffer expands quickly in response to late packet arrivals and shrinks slowly when jitter reduces. Improved tradeoff between buffering delay (delay) and late loss (packet loss). The playout rate is not constant. Adaptive Play Out (figure). Audio Adaptive Playout: packets are pushed into the adaptive playout module; the renderer requests a new waveform segment for playout; the playout module passes the packet to the audio decoder.
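One way to get the "expand quickly, shrink slowly" behaviour is an asymmetric update of the target buffering delay, sketched below. The update rule, the `shrink` factor, and the safety `margin_ms` are assumptions chosen for illustration and are not the deck's algorithm.

```python
def adapt_buffer(delays_ms, initial_ms=20.0, shrink=0.02, margin_ms=5.0):
    """Track a target buffering delay from observed packet delays."""
    target = initial_ms
    history = []
    for delay in delays_ms:
        wanted = delay + margin_ms
        if wanted > target:
            target = wanted                        # late packet: expand at once
        else:
            target += shrink * (wanted - target)   # calm network: shrink slowly
        history.append(target)
    return history


targets = adapt_buffer([10, 60, 10, 10])   # one delay spike, then calm again
```

After the 60 ms spike the target jumps to 65 ms immediately, then gives back only about 1 ms per packet, which is where the audio time-scaling of the playout module comes in.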
  • 60. Packet Loss Concealment. Audio Packet Loss Concealment (figure: packets i−2, i−1, i (lost), i+1, i+2, each of length L, on a time axis; the alignment offset ∆L is found by correlation; lengths 2L and 1.3L are marked). The processing depends on whether the segment is voiced or unvoiced.
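The "alignment found by correlation" step can be illustrated with a normalized cross-correlation search: the most recent samples are compared with earlier parts of the received signal, and the lag with the highest score gives the point from which waveform can be copied to fill the gap. This is a generic sketch under that assumption, not the exact method in the figure; the function name, the lag range, and the test tone are hypothetical.

```python
import math


def best_alignment(history, template_len, min_lag, max_lag):
    """Lag (in samples) at which the latest samples best match the past."""
    template = history[-template_len:]
    template_energy = sum(s * s for s in template)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - template_len - lag
        candidate = history[start:start + template_len]
        num = sum(a * b for a, b in zip(template, candidate))
        den = math.sqrt(template_energy * sum(c * c for c in candidate)) or 1.0
        score = num / den                  # normalized correlation in [-1, 1]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# A voiced-like tone with a pitch period of 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
lag = best_alignment(signal, template_len=40, min_lag=10, max_lag=30)
```

For voiced speech the best lag tends to land on the pitch period, so repeating or stretching the signal at that lag keeps the waveform continuous.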
  • 61. Voiced segments (figure). Unvoiced segments (figure).
  • 62. Concealment as (bi-directional) stretching (figure). Video Packet Loss Concealment: Spatial concealment: uses spatial correlation, e.g., bilinear interpolation or projection onto convex sets. Temporal concealment: uses the correlation that exists between consecutive frames, e.g., temporal replacement or boundary matching.
  • 63. Spatial-Temporal Concealment (figure). Summary.
  • 64. Summary: VoIP/video conference systems: infrastructure based; P2P based. Audio/video components: audio codec; video codec; acoustic echo cancellation. Network components: primer of the Internet; network characteristics; available bandwidth estimation; forward error correction (FEC); dejitter buffer; packet loss concealment.