Microsoft PowerPoint - ccnc10_voip

  1. CCNC 2010 Tutorial: Towards Glitch Free VoIP and Video Conferencing, 1/12/2010. Jin Li, Microsoft Research. Outline: Introduction; Anatomy of VoIP and Video Conferencing Systems; Audio/Video Components; Network Components; Summary.
  2. Introduction. Booming of IP Based Communication: advanced voice over IP (VoIP); web-, audio-, and video-conferencing; tele-presence; instant messaging; calendar and other PIM functions; email, fax, and voice mail.
  3. Worldwide VoIP subscribers: worldwide VoIP service revenue was $24.1B in 2007, up 52% over 2006. Worldwide VoIP service revenue is expected to more than double over the next 4 years, to $61.3B in 2011, an annual growth rate of 26%. Source: 2008 Infonetics Research Inc. US Broadband Telephony Forecast, 2007-2013: the VoIP subscriber base is predicted to double from 2007 to 2013. Source: Jupiter Research, US Broadband Telephony Forecast, 2008 to 2013.
  4. VoIP Trend: IP networks are the next-generation networks for all forms of communication. Broadband penetration is a key driver of VoIP expansion. Worldwide DSL subscriptions were at 205.9M at the end of 2007, up 23% from 2006, and are predicted to increase to 363.6M in 2011. Cable subscriptions were up 15% annually to 68M at the end of 2007, climbing to 97.3M in 2011. Passive Optical Network (PON) subscribers were at 10.9M in 2007. Ethernet FTTH subscribers were at 1.7M in 2007. 2004/2005 were breakthrough years for VoIP adoption. High End Systems (Tele-Presence): Cisco Telepresence, $299K; Tandberg Experia, $225K; HP Halo, $425K + $18K/mo; Polycom RPX210M, $269K + $18.5K/mo.
  5. Worldwide Tele-presence Forecast (2006-2012): number of end points; revenue forecast. Source: 2008 IDC Research. Desktop Video Conferencing: multiple solutions, often acting as an add-on to VoIP. Benefits: see the faces of people you may not have met before; see facial expressions and gestures; easier to follow a conversation; more interactive than phone; get the general mood or ambience; see and show documents/objects. Drawbacks: difficult to set up and plan; network reliability; without (or with poor) video, people talk; without (or with poor) audio, people walk; interpersonal factors.
  6. Anatomy of VoIP and Video Conferencing Systems. Infrastructure vs. P2P: infrastructure based: Microsoft Unified Communication, Cisco; P2P based: Skype, Gtalk.
  7. Infrastructure Based VoIP: Microsoft Unified Communication. Unified Communication: Architecture.
  8. Unified Communication: P2P Call. Key steps: Alice calls Bob; find Bob’s registered SIP endpoints.
  9. Unified Communication: To VoiceMail. Key steps: Alice calls Bob; find Bob’s registered SIP endpoints; if Bob doesn’t answer after a certain period, the call re-routes; the voicemail system plays a greeting, records Alice’s message, sends the message to Bob’s email, and uses the speech server to transcribe the message.
  10. Unified Communication: PSTN to UC. Key steps: PSTN user Alice calls Bob; the IP-PSTN gateway terminates the call; the MS/Gateway routes the call to the mediation server, which performs transcoding, ICE, etc.; through the director, the proper UC client is found.
  11. P2P VoIP: Skype. Information: debuted 08/2003, by N. Zennstrom and J. Friis, who founded KaZaA; a P2P overlay network for VoIP and other applications; free intra-net VoIP and fee-based SkypeOut/SkypeIn.
  12. Skype Usage (Apr. 2008): 11 million concurrent Skype users online at peak time (180,000+ simultaneous calls); 309 million registered users worldwide, the largest registered user base within the eBay portfolio (33 million users added in Q1FY08); $126M revenue in Q1FY08 (61% YOY growth, 5.6 billion SkypeOut minutes in FY2007); 100 billion cumulative Skype-to-Skype minutes. Skype Share of International VoIP Traffic.
  13. Skype Gadget: IPDRUM mobile Skype Cable; Motorola CN620 WiFi Cellphone; IPEVO Free-1 USB Skype Phone; Netgear Skype Wi-Fi Phone; USB Mouse with Phone. 50 hardware partners, 150+ Skype certified devices. Skype vs. VoIP: public VoIP standards are H.323 and SIP; Skype is a proprietary VoIP solution. It relies on a P2P network for the user directory; is scalable without costly infrastructure; routes calls through supernodes in Skype; offers universal firewall/NAT traversal; and encrypts traffic (but you have to trust eBay/Skype).
  14. Skype Ingredient (1): user retrieves ID from a Skype server. Skype Network: Skype server (authentication); supernode overlay: any computer with sufficient CPU, memory, and network bandwidth that is not behind a firewall, used for distributed directory service and to relay traffic for computers behind NAT/firewall.
  15. NAT Traversal (Skype). NAT/firewall detection: try UDP connection; try TCP connection (arbitrary port, 80 (http), 443 (https)). Traversal: direct connection if a) both clients have no NAT, or b) one client has no NAT and the other is behind a cone NAT; relay by supernode otherwise. Since Skype doesn’t need to pay for relay cost, it can use a high-bitrate wideband voice codec (>24 kbps). Skype: Call Routing Through Supernode: Skype server (authentication); supernode overlay: route call through supernodes; high-bitrate wideband voice codec (>24 kbps).
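The traversal rule on the NAT Traversal slide can be sketched as a small decision function. This is only an illustration of the rule as stated there; the function name `choose_path` and the string labels `"none"`, `"cone"`, and `"symmetric"` are assumptions made for this example, not Skype identifiers.

```python
def choose_path(nat_a: str, nat_b: str) -> str:
    """Return "direct" or "relay" for two clients, following the slide's rule.

    nat_a / nat_b are hypothetical labels: "none" (no NAT), "cone", or any
    other string for a more restrictive NAT or firewall.
    """
    kinds = {nat_a, nat_b}
    if kinds == {"none"}:
        return "direct"          # a) both clients have no NAT
    if kinds == {"none", "cone"}:
        return "direct"          # b) one has no NAT, the other is behind a cone NAT
    return "relay"               # otherwise: relay through a supernode
```

For example, `choose_path("cone", "cone")` returns `"relay"`, because the slide only allows a direct connection when at least one side has no NAT.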
  16. Skype Encryption (Peer 1 to Peer 2): 256-bit AES over 128-bit data blocks; 1536/2048 RSA for key negotiation (2048/2048 for paid service). Skype: Complete Black Box (Security by Obfuscation): almost everything is obfuscated; many protections, anti-debugging tricks, ciphered code. Avoid static disassembly: XOR binary with a hard-coded key, erase the beginning of the code, own packer. Code integrity check: use checksums to detect breakpoints. Anti-debugging techniques: anti-SoftICE, integrity check. Code obfuscation. Network obfuscation.
  17. Audio/Video Component: audio codec; video codec; acoustic echo cancellation.
  18. Audio Codec: G.711 (PCM). Still widely used today (PSTN interface). With uniform quantization: 12 bits * 8 k/sec = 96 kbps. With non-uniform quantization: 8 bits * 8 k/sec = 64 kbps (DS0 rate). North America: µ-law; other countries: A-law. MOS of about 4.3. µ = 255, A = 87.6.
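The G.711 numbers above can be checked with a short sketch. It uses the continuous µ-law companding curve with µ = 255; real G.711 codecs use a piecewise-linear approximation of this curve, so treat this as an illustration rather than a bit-exact encoder. The function names are assumptions made for this example.

```python
import math

MU = 255  # mu-law parameter quoted on the slide

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to the companded domain [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def bit_rate_kbps(bits_per_sample: int, sample_rate_hz: int = 8000) -> float:
    """Bit rate of a PCM stream in kbit/s."""
    return bits_per_sample * sample_rate_hz / 1000
```

With 8 kHz sampling, `bit_rate_kbps(12)` gives 96.0 (uniform quantization) and `bit_rate_kbps(8)` gives 64.0 (companded, the DS0 rate). Companding spends more of the 8-bit range on small amplitudes, which is why 8 bits can stand in for 12.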
  19. G.722.1: Siren. Audio bandwidth: 14 kHz. Sample rate: 32 kHz. Bit rate: 24, 32, and 48 kbit/s. Algorithm: transform coding (Siren14™). Frame size: 20 ms. Algorithmic delay: 40 ms. Complexity: <11 WMOPS (enc/dec). Available on royalty-free licensing terms (from Polycom). Siren Encoder.
  20. Siren Decoder. Siren Codec: audio sampled at 32 kHz; operates on frames of 20 ms, corresponding to 640 samples; based on transform coding, using a Modulated Lapped Transform (MLT); a look-ahead of 20 ms due to 50% overlap between frames; total algorithmic delay of 40 ms.
  21. MLT (Modulated Lapped Transforms): spatial response; frequency domain. Categorization & SQVH: quantization used by SQVH; expected number of bits for each category; vector property used in SQVH.
  22. AMR-WB Basics: “Wideband coding of speech at around 16 kbit/s using adaptive multi-rate wideband (AMR-WB)”. Adopted as ITU-T G.722.2, and also as 3GPP spec TS 26.190. “Foreseen applications are: VoIP and internet applications, Mobile Com., PSTN app, ISDN wideband telephony, ISDN videophone and videoconf.” Sampling rate: 16 kHz. Bitrate: 6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, and 23.85 kbit/s. 20 ms frame. ACELP (algebraic code excited LPC). Pre-processing: sampling rate conversion from 16 to 12.8 kHz (now a 20 ms frame has 256 samples); HP filter (cut-off at 50 Hz); pre-emphasis filter $(1 - 0.68 z^{-1})$.
  23. LP analysis and Quant.: one 30 ms asymmetric window; 5 ms look-ahead. Obtain LPC coefficients: compute correlation; multiply by window (adds 60 Hz bandwidth expansion); R(0) = 1.0001 R(0) (adds a 40 dB noise floor); Levinson-Durbin to compute LP coefficients. LP to ISP. Quantize in ISP q-domain. LP analysis and Quant. (2): quantization bottom line: 46 bits/frame on most modes; 36 bits/frame on the 6.60 kbps mode; M.A. prediction with 1/3 gain; quantizer: S-MSVQ (split multistage VQ). Both quantized and unquantized coefficients will be used in the algorithm.
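The "compute correlation, scale R(0), run Levinson-Durbin" chain on the LP analysis slide can be sketched in plain Python. This is a minimal illustration, not the AMR-WB reference code: the 60 Hz lag window is omitted, and the function names are assumptions made for this example.

```python
def autocorrelation(x, order):
    """Autocorrelation lags R(0)..R(order) of a (windowed) frame x."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(order + 1)]

def add_noise_floor(r, factor=1.0001):
    """Scale R(0) by 1.0001, the white-noise correction named on the slide."""
    return [r[0] * factor] + list(r[1:])

def levinson_durbin(r, order):
    """Solve for A(z) = 1 + sum_k a[k] z^-k; returns (a, prediction_error)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]  # update lower-order coefficients
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # remaining prediction error
    return a, err
```

For the autocorrelation of a first-order process, `levinson_durbin([1.0, 0.5, 0.25], 2)` returns coefficients close to `[1.0, -0.5, 0.0]` with error 0.75: the second coefficient is zero because one tap already explains the signal.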
  24. Subframes: each 20 ms (256 samples) frame is divided into 4 sub-frames (64 samples each); interpolated LPC coefficients are obtained for each sub-frame; interpolation is done in the ISP q-domain. Perceptual weighting: the weighting filter is $W(z) = A(z/\gamma_1) \cdot H_{\text{de-emph}}(z)$. This helps solve the tilt problem, which is worse in WB speech.
  25. Excitation: searched for each 5 ms sub-frame. Two components: adaptive codebook (past excitation) and algebraic codebook. The “target” signal is obtained by filtering the LPC residual (for the sub-frame) through the synthesis LPC filter and weighting filter. Adaptive codebook: start with “open loop” pitch estimation based on cross correlation; low-value bias; ‘last value’ bias (actually 5-frame median), if voiced. Re-compute with “closed loop”, around initial value ±7, and up to ¼ sample precision; “analysis by synthesis” based; restrict to values allowed by the encoding step.
  26. Algebraic codebook: remove the contribution of the (unquantized) prediction from the adaptive codebook from the “target signal” to obtain a new target. Divide the sub-frame into 4 alternating tracks. Algebraic codebook (2): select the best pulses, for a total of 24 (6), 18 (5-4), 16 (4), 12 (3), 10 (3-2), 8 (2), 4 (1), 2 (.5), depending on bitrate. Pulses + two filters: periodicity enhancement $1/(1 - 0.85 z^{-T})$; tilt $1/(1 - \beta_1 z^{-1})$. Tricks to save bits in encoding pulse position; tricks to save computation on pulse search.
  27. Wrap up: high pass, de-emphasis; upsample back to 16 kHz; add high frequency components. High Freq. Components: random noise is used as excitation; the LP filter is extended to 8 kHz; energy of the excitation is based on the energy of the base-band residual and voicing info, except in the highest bitrate mode. Extension of the LPC filter is equivalent to mapping 5.1 to 5.6 kHz onto 6.4 to 7.0 kHz; the result is band-pass filtered to 6-7 kHz and added to the output signal.
  28. Video Codec. H.264/AVC Encoder.
  29. H.264/AVC Decoder. Reference Picture Management: reference pictures are stored in the decoded picture buffer (DPB). Short/long term reference pictures: a decoded frame may be marked as unused for reference, short term picture, or long term picture. “Sliding window” memory management: keep #(long_term_pic + short_term_pic); remove a short term picture if space is lacking. Adaptive memory control: issued by the encoder to change the type of the reference frame. IDR (Instantaneous Decoder Refresh): an I frame that clears the reference buffer.
  30. Slice Group: formerly called “FMO” (Flexible Macroblock Ordering); a subset of the macroblocks that may contain one or more slices; error resilience. Inter Prediction: variable block size; ¼ pixel motion compensation; interpolation.
  31. Motion Vector (MV) Prediction: efficiently encode correlated MVs. Other than 16×8 and 8×16: MVp = median(MVA, MVB, MVC). 16×8: MVp of the upper = MVB; MVp of the lower = MVA. 8×16: MVp of the left = MVA; MVp of the right = MVC. For skipped macroblocks, do as in 16×16 Inter mode. Intra Prediction, for luma samples: 4×4 block: 9 prediction modes; 16×16 block: 4 modes; I_PCM: transmit the samples without prediction and transform.
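The MV prediction rules can be written out as a small function. The sketch assumes the component-wise median of the three neighbours for the general case, which is what H.264 specifies, and the partition rules listed on the slide; availability handling of neighbours is left out. The names `predict_mv`, `partition`, and `part_index` are assumptions made for this example.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]

def predict_mv(mv_a, mv_b, mv_c, partition="other", part_index=0):
    """Predicted motion vector MVp from neighbours A (left), B (above), C (above-right).

    Each mv_* is an (x, y) tuple. partition is "16x8", "8x16", or anything else;
    part_index 0 means the upper/left partition, 1 the lower/right one.
    """
    if partition == "16x8":
        return mv_b if part_index == 0 else mv_a   # upper uses B, lower uses A
    if partition == "8x16":
        return mv_a if part_index == 0 else mv_c   # left uses A, right uses C
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))
```

The encoder then transmits only the difference between the actual motion vector and MVp, which is small when neighbouring blocks move together.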
  32. Prediction Modes: 4×4 luma; intra 16×16; 8×8 chroma is similar to 16×16 luma intra. Signaling of Intra Prediction Modes: mode choices need to be signaled to the decoder, but compactly. The prediction mode for luma coded in Intra-16×16 mode or chroma coded in Intra mode is signaled in the macroblock header. Intra modes for neighboring 4×4 blocks are often correlated (B above, A to the left of the current block C). If A and B are available, C = min(A, B); else if neither A nor B is available, C = 2 (DC); else C = available(A, B). Use the prev_intra4x4_pred_mode flag and rem_intra4x4_pred_mode to indicate the mode selected.
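The mode prediction rule on this slide, together with the two syntax elements it names, can be sketched as follows. The prediction function follows the rule exactly as the slide states it; the mapping from the actual mode to `rem_intra4x4_pred_mode` (skip over the predicted mode so that 8 remaining values fit in 3 bits) is how H.264 codes it, but the function names here are assumptions made for this example.

```python
DC_MODE = 2  # mode index the slide assigns to DC prediction

def predict_intra4x4_mode(mode_a, mode_b):
    """Predicted mode for block C from neighbours A and B (None = unavailable)."""
    if mode_a is not None and mode_b is not None:
        return min(mode_a, mode_b)
    if mode_a is None and mode_b is None:
        return DC_MODE
    return mode_a if mode_a is not None else mode_b

def signal_mode(actual, predicted):
    """Return (prev_intra4x4_pred_mode, rem_intra4x4_pred_mode or None)."""
    if actual == predicted:
        return 1, None                       # one flag bit, nothing else
    rem = actual if actual < predicted else actual - 1
    return 0, rem                            # flag 0 plus a value in 0..7
```

When the prediction is right, a single flag bit is sent; otherwise the flag is followed by one of the eight remaining modes.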
  33. Deblocking filter: filter 4 vertical/horizontal boundaries of luma; filter 2 vertical/horizontal boundaries of chroma; affects up to 3 samples on either side. The filter is stronger at places where there is likely to be significant blocking distortion, such as the boundary of an intra coded macroblock or a boundary between blocks that contain coded coefficients. Transform and Quantisation, 3 transforms: DCT-based transform for all 4×4 residual blocks ($a = 1/2$, $b = \sqrt{2/5}$, $d = 1/2$); Hadamard transform for 4×4 luma DC coefficients (in 16×16 intra); Hadamard transform for 2×2 chroma DC coefficients.
  34. Combine Quantization into Scaling of Transform. 4×4 DC intra luma: $|Z_D(i,j)| = (|Y_D(i,j)| \cdot MF(0,0) + 2f) \gg (qbits + 1)$, $\operatorname{sign}(Z_D(i,j)) = \operatorname{sign}(Y_D(i,j))$. CAVLC: Context-Based Adaptive Variable Length Coding. Characteristics: run-level coding to compact zero strings; trailing ones (+1, -1 after 0); the number of nonzero coefficients in neighboring blocks is correlated; choice of VLC lookup table for the level parameter depends on level magnitude.
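The DC quantization formula on this slide uses only integer multiply, add, and shift, which the following sketch makes concrete. The function name and the numeric values in the usage note (`mf00 = 13107`, `qbits = 15`, `f = 2**15 // 3`) are illustrative assumptions, not values taken from the slides.

```python
def quantize_dc(y_dc: int, mf00: int, qbits: int, f: int) -> int:
    """Quantize one transformed DC coefficient Y_D(i, j).

    Implements |Z| = (|Y| * MF(0,0) + 2f) >> (qbits + 1) with sign(Z) = sign(Y).
    """
    magnitude = (abs(y_dc) * mf00 + 2 * f) >> (qbits + 1)
    return magnitude if y_dc >= 0 else -magnitude
```

With the illustrative values above, `quantize_dc(100, 13107, 15, 2**15 // 3)` evaluates to 20 and `quantize_dc(-100, ...)` to -20; the right shift replaces a division, so no floating-point arithmetic is needed.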
  35. CAVLC Encoding: 1. Encode the number of coefficients and trailing ones (coeff_token): TotalCoeffs: 0 ~ 16; TrailingOnes: 0 ~ 3; if there are more than 3 trailing ones, only the last three are treated as ‘special cases’; four lookup tables (three variable-length, one fixed-length); the choice depends on neighboring blocks. 2. Encode the sign of each TrailingOne, in reverse order. 3. Encode the levels of the remaining nonzero coefficients (level_prefix, level_suffix); if there are fewer than 3 TrailingOnes, the first nonzero coefficient is adjusted. 4. Encode the total number of zeros before the last coefficient. 5. Encode each run of zeros; zero-runs at the start of the array need not be encoded. Acoustic Echo Cancellation.
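The five CAVLC steps operate on a handful of symbols derived from the zigzag-ordered coefficient array. The sketch below only extracts those symbols; it does not perform the table lookups that turn them into bits. The function name and the dictionary keys are assumptions made for this example.

```python
def cavlc_symbols(coeffs):
    """Derive the CAVLC symbols for one block of zigzag-ordered coefficients."""
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    total_coeffs = len(nonzero)

    # Step 1: count trailing +/-1 values from the high-frequency end (at most 3).
    trailing_ones = 0
    for i in reversed(nonzero):
        if abs(coeffs[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    reversed_values = [coeffs[i] for i in reversed(nonzero)]
    # Step 2: signs of the trailing ones, in reverse order (0 = +, 1 = -).
    t1_signs = [0 if v > 0 else 1 for v in reversed_values[:trailing_ones]]
    # Step 3: remaining levels, also in reverse order.
    levels = reversed_values[trailing_ones:]
    # Step 4: zeros located before the last nonzero coefficient.
    total_zeros = (nonzero[-1] + 1 - total_coeffs) if nonzero else 0
    # Step 5: zero run before each coefficient, in reverse order; the run
    # before the first (lowest-frequency) coefficient is implied.
    runs = [nonzero[j] - nonzero[j - 1] - 1 for j in range(len(nonzero) - 1, 0, -1)]

    return {"total_coeffs": total_coeffs, "trailing_ones": trailing_ones,
            "t1_signs": t1_signs, "levels": levels,
            "total_zeros": total_zeros, "run_before": runs}
```

For the array `[0, 3, 0, 1, -1, -1, 0, 1, 0, ...]` this yields 5 coefficients, 3 trailing ones with signs `+ - -`, levels `[1, 3]`, 3 total zeros, and runs `[1, 0, 0, 1]`.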
  36. Acoustic Echo Cancellation: sits between the path from the audio decoder and the path to the audio encoder. Acoustic Echo Cancellation Module.
  37. Adaptive Transversal Filter: FIR filter, inherently stable. The length of the filter affects convergence, goodness of the solution, and complexity. The filter introduces errors since it is trying to model an IIR response. Short filters (128-256 coefficients, or taps): faster convergence, but the final solution has more residual error; less complex, O(N). Long filters (512-1024 taps): slower convergence, but the final solution has less error; more complex, as the algorithm can be O(N²). Challenges: the dynamic range of the human ear is 120 dB, so even quiet echoes can be heard. Longer delays from satellite (300-500 ms) and VoIP: the ear is more sensitive to longer delays; it is more difficult to find the beginning of the echo; long filters (~1000 taps) are needed (complexity and convergence). Near-end noise corrupts the echo, decreasing the canceller’s ability to converge. Acoustic echo paths can change rapidly, making it more difficult for the AEC to remain converged. Nonlinear echo components: speakers driven beyond their linear region.
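An adaptive transversal (FIR) echo canceller can be sketched in a few lines. The slides do not name an adaptation rule; normalized LMS (NLMS) is used here as a common O(N) choice, and that choice, along with all names and default values, is an assumption made for this example. The simulation helper models the echo path as a short FIR filter with no near-end speech or noise.

```python
import random

def simulate_echo(far_end, echo_path):
    """Microphone signal: far-end signal convolved with a (hypothetical) echo path."""
    return [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
            for n in range(len(far_end))]

def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return (residual, weights) after adapting an FIR filter with NLMS."""
    w = [0.0] * taps
    history = [0.0] * taps            # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(wi * hi for wi, hi in zip(w, history))
        e = d - echo_estimate         # what is sent on to the audio encoder
        norm = sum(h * h for h in history) + eps
        w = [wi + mu * e * hi / norm for wi, hi in zip(w, history)]
        residual.append(e)
    return residual, w
```

Feeding it white noise through a three-tap path such as `[0.5, -0.3, 0.1]` drives the first three weights toward those values and the residual toward zero. Raising `taps` slows convergence, which is the trade-off the slide describes for long filters.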
  38. Network Component. IP-based VoIP / Video Conference.
  39. Internet Primer. Internet: Grand View.
  40. Impact on ISPs. Economics of ISP relationships: sibling relationship (several ISPs belong to the same organization); peering relationship (mutually beneficial free agreement, to a certain extent); transit relationship (one ISP pays another). Inside ISP.
  41. ISP POP (Point of Presence). Home Networking.
  42. Network Characteristics. Under-provisioned Links (figure: links between branches).
  43. Growth Trends. Packet Loss vs. Jitter (vs. Delay?).
  44. The Usual Suspects. Packet Bursts.
  45. What kind of Enterprise User? How QoS can help.
  46. QoS helps inside and between branches! Observation: IP-based communication in the enterprise is growing; empirical results show poor calls for wireless and VPN users; QoS (DiffServ) is both used and useful!
  47. Available Bandwidth Estimation. What is Available Bandwidth (ABW)? ABW is the left-over capacity along an Internet path.
  48. Why Is It Useful? Maximizing QoE (Quality of Experience) in A/V conferencing: audio prefers minimum delay (high priority); video prefers maximum rate (low priority). One Way Delay (OWD) = propagation delay (constant) + queuing delay (variable). One solution: measure ABW, then encode and send video at the ABW rate. Typical Targeting Scenario: the first hop is the bottleneck (cable modem, DSL, high-speed link…); timescale for the ABW estimation: 2-4 seconds.
  49. Why Is Measuring ABW Hard? Available bandwidth changes over time, so ABW measurements must be quick. Audio packets (along the same path) should experience minimum delay, so measurement must be non-intrusive. Two Models: Probe Rate Model (PRM) based solutions: Pathload, TOPP, Pathchirp, Bfind, PTR …; Probe Gap Model (PGM) based solutions: Spruce, Delphi, IGI, Moseab …
  50. Pathload (PRM) [Jain & Dovrolis]: send probe trains at various rates; ABW is the probe rate at the transition point where OWD starts increasing (queuing delay is observed). Spruce (PGM) [Jacob et al.]: send probe pairs/trains at rate Ri (Ri > A), measure sending gaps and receiving gaps, and compute A directly.
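The "compute A directly" step of a probe-gap method can be sketched as follows. The formula is the one given in the published description of Spruce, $A = C \cdot (1 - (\Delta_{out} - \Delta_{in}) / \Delta_{in})$; it assumes a single bottleneck whose capacity C is already known, which is an assumption the slides do not spell out. Function and parameter names are made up for this example.

```python
def probe_gap_abw(capacity_bps: float, gap_in_s: float, gap_out_s: float) -> float:
    """Available bandwidth from one probe pair (probe-gap model).

    gap_in_s:  spacing of the two probes at the sender
    gap_out_s: spacing measured at the receiver; extra spacing is caused by
               cross traffic queued between the probes at the bottleneck
    """
    cross_traffic_share = (gap_out_s - gap_in_s) / gap_in_s
    return capacity_bps * (1.0 - cross_traffic_share)

def average_abw(capacity_bps, gap_pairs):
    """Average the estimate over several (gap_in_s, gap_out_s) pairs."""
    estimates = [probe_gap_abw(capacity_bps, g_in, g_out) for g_in, g_out in gap_pairs]
    return sum(estimates) / len(estimates)
```

On a 10 Mbps bottleneck, a pair sent 1.2 ms apart and received 1.5 ms apart gives roughly 7.5 Mbps of available bandwidth. A single pair is noisy, which is why the estimate is usually averaged over many pairs.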
  51. Advantages/Disadvantages of the Approaches. PGM based approaches: advantage: fast estimation (estimation can be done in a single probe); disadvantage: assumptions are not easy to verify in practice. PRM based approaches: advantage: no assumption; disadvantage: slow estimation (iterative probes). Forward Error Correction.
  52. Block Based Erasure Resilient Coding. Original data: k messages (1, 2, 3, …, k). ERC: n coded blocks (1, 2, 3, …, k, k+1, …, n). At a certain instance, some of the blocks may be lost in delivery. However, as long as at least k blocks are delivered, the original data can be reconstructed. ERC in VoIP and Video Conferencing. VoIP: mainly packet replication, due to small VoIP packet size and low delay requirement. Video conferencing: packet loss protection (for I frames or P frames in HD); each frame is separated into k messages and protected by n-k messages; as long as no more than n-k messages are lost, the transmission succeeds.
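The "any k of n blocks suffice" property is easiest to see in the smallest systematic erasure code: k data blocks plus one XOR parity block (n = k + 1). This is a deliberately simple stand-in chosen for this example, not the Reed-Solomon construction used for larger n - k; all function names are assumptions.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def encode(data_blocks):
    """Systematic (k+1, k) code: the k originals followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def decode(received, k):
    """Recover the k originals; received holds n = k+1 blocks, None marks a loss."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("more than n - k = 1 block lost; cannot recover")
    blocks = list(received)
    if missing and missing[0] < k:
        present = [b for b in blocks if b is not None]
        blocks[missing[0]] = xor_blocks(present)   # XOR of the rest restores it
    return blocks[:k]
```

Losing any single block, data or parity, still leaves k blocks from which the originals are rebuilt; losing two exceeds n - k and decoding fails.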
  53. ERC Terms: number of original blocks: k; number of coded blocks: n; rate of ERC: k/n. MDS (Maximum Distance Separable): any k of the n coded blocks may recover the original; the theoretically optimal performance. Erasure Encoding: Mathematics. Original data: $x_1, x_2, \ldots, x_k$. Coded data: $y_1, y_2, \ldots, y_n$. All are vectors on a Galois field.
  54. Example: ERC of 10MB. Original data (10MB): $x_1, x_2, \ldots, x_k$, with k = 10, GF($2^8$), each vector 1MB. Coded data: $y_1, y_2, \ldots, y_n$ (n = 30). (Figure dimension labels: 30, 10, 1M, 1M.) Erasure Decoding: Mathematics. Original data: $x_1, x_2, \ldots, x_k$. Coded data: $y_1, y_2, \ldots, y_n$. Select the available coded blocks.
  55. Erasure Decoding: Mathematics. Original data: $x_1, x_2, \ldots, x_k$. Coded data: $y_1, y_2, \ldots, y_n$. The original data can be recovered if the sub-generator matrix has full rank k. Systematic vs. Non-Systematic ERC. Original data: k messages (1, 2, 3, …, k). Non-systematic ERC: n coded blocks (1, 2, 3, …, k, k+1, …, n). Systematic ERC: n coded blocks (1, 2, 3, …, k, k+1, …, n). Systematic ERC: slightly lower encoding and decoding complexity; even if recovery fails, we can still use some of the original messages.
  56. Reed-Solomon: has been around for decades; has a systematic form; Cauchy Reed-Solomon code (tutorial, Jin Li). Reed-Solomon Decoding: receive, then inverse.
  57. Dejitter Buffer. Variable Delay & Dejitter Buffer (figure: queuing delay at successive hops, followed by the dejitter buffer): queuing delay; dejitter buffers; variable packet sizes.
  58. Fixed Dejitter Buffer: Budget For Worst Case (figure: Site A to Site B over 128 kbps bandwidth; coder delay 40 ms, queuing delay 4-50 ms, propagation delay 8 ms, dejitter buffer 50 ms). Total end-to-end delay: codec delay: 40 ms; propagation delay: 8 ms; dejitter buffer: 50 ms, to accommodate queuing delay of 0-50 ms; total delay: 98 ms. Dejitter Buffer Size & Late Loss (figure: playout, jitter, buffering delay, late loss; delay vs. packet loss): fixed playout deadline and jitter absorption; the playout rate is constant; the tradeoff is between dejitter buffer size and late loss.
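The worst-case budget and the buffer-size versus late-loss trade-off can be reproduced with a few lines of arithmetic. The sketch assumes that a packet counts as a late loss when its queuing delay exceeds the dejitter buffer size; the function names and the sample delay list in the usage note are made up for this example.

```python
def end_to_end_delay_ms(codec_ms: float, propagation_ms: float, dejitter_ms: float) -> float:
    """Total delay with a fixed dejitter buffer sized for the worst queuing delay."""
    return codec_ms + propagation_ms + dejitter_ms

def late_loss_fraction(queuing_delays_ms, dejitter_ms: float) -> float:
    """Fraction of packets that miss the fixed playout deadline."""
    late = sum(1 for d in queuing_delays_ms if d > dejitter_ms)
    return late / len(queuing_delays_ms)
```

`end_to_end_delay_ms(40, 8, 50)` gives the 98 ms total from the slide. For queuing delays of `[4, 10, 50, 60, 80]` ms, a 50 ms buffer loses 40% of the packets as late, while an 80 ms buffer loses none at the cost of 30 ms more delay.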
  59. Adaptive Playout and Dejitter Buffer Adaptation (figure: playout, jitter, buffering delay; delay vs. packet loss): adaptive playout and jitter adaptation; scaling of voice/video packets in a highly dynamic way; playout schedule set according to past recorded delays; usually the dejitter buffer size expands quickly on late packet arrival and shrinks slowly when jitter reduces; improved tradeoff between buffering delay and late loss; playout rate is not constant. Adaptive Play Out. Audio Adaptive Playout: packets are pushed into the adaptive playout module; the renderer requests a new waveform segment for playout; the playout module passes the packet to the audio decoder.
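The "expand quickly, shrink slowly" behaviour can be sketched with an asymmetric update rule. The specific rule below (jump to the observed delay when a packet is late, otherwise decay a small fraction of the gap) and its parameters are assumptions chosen for illustration; the slides do not give the actual update formula.

```python
def adapt_buffer(delays_ms, initial_ms=20.0, shrink=0.02):
    """Track a dejitter buffer size (ms) over a sequence of observed packet delays."""
    size = initial_ms
    sizes = []
    for delay in delays_ms:
        if delay > size:
            size = delay                      # late packet: expand immediately
        else:
            size -= shrink * (size - delay)   # jitter receded: shrink slowly
        sizes.append(size)
    return sizes
```

With `shrink=0.1`, the delays `[10, 60, 10, 10]` produce sizes close to `[19, 60, 55, 50.5]`: one late packet triples the buffer at once, while it takes many on-time packets to bring it back down.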
  60. Packet Loss Concealment. Audio Packet Loss Concealment (figure: frames i-2, i-1, i, i+1, i+2 of length L, with frame i lost; the neighboring frames are stretched over the gap, with alignment found by correlation; labels ∆L, 2L, and 1.3 L). Depends on voiced and unvoiced segments.
  61. Voiced segments. Unvoiced segments.
  62. Concealment as (bi-directional) stretching. Video Packet Loss Concealment. Spatial concealment: uses spatial correlation, e.g., bilinear interpolation or projection onto convex sets. Temporal concealment: uses the correlation that exists between consecutive frames; temporal replacement; boundary matching.
  63. Spatial-Temporal Concealment. Summary.
  64. Summary. VoIP/Video Conference Systems: infrastructure based; P2P based. Audio/Video Components: audio codec; video codec; acoustic echo cancellation. Network Components: primer of the Internet; network characteristics; available bandwidth estimation; forward error correction (FEC); dejitter buffer; packet loss concealment.
