MODELING THE QOE OF
RATE CHANGES IN
SKYPE/SILK VOIP CALLS
CHIEN-NAN CHEN
UNIVERSITY OF ILLINOIS, URBANA-CHAMPAIGN

CING-YU CHU, SU-LING YEH, HAO-HUA CHU, POLLY HUANG
NATIONAL TAIWAN UNIVERSITY

                                      1
OUTLINE
• Motivation
• Preliminary Experiment
• Proposed Model
• Large-Scale Experiment
• Evaluation
• Conclusion



                           2
Conclusion      Evaluation   Large Exp.   Proposed Model   Pre. Exp.   Motivation




VOICE OVER IP

[Figure: VoIP users exchanging voice packets through the Internet]

 Delay        Jitter    Packet Loss                Bandwidth Fluctuation


                                                                                     3




RATE ADAPTATION
• Available bandwidth ↑
                    → Ramping up the sending rate

     Is the quality improved proportionally?

• Available bandwidth ↓
                  → Tuning down the sending rate

          Rate change → Disturbing users?

                                                                                   4




GOAL
• Investigating the relationship of
       Sending rate vs. Perceived quality
• To explore the influence of
      Rate change magnitude/frequency


• Methodology
  • Synthesized VoIP calls
  • User study experiments
                                                                                   5




CONTRIBUTION
• Sending bitrate vs. user perception
                → Logarithmic Relationship
• Frequency of rate change
                → Logarithmic Relationship
• Magnitude of rate change
          → Complicated, but Interesting
• Closed-form models to predict user
  perception under bandwidth fluctuation
                                                                                     6




PRELIMINARY EXPERIMENT
• To confirm the influence of
  • sending bitrate
  • rate change magnitude
  • rate change frequency


• 5-level MOS (Mean Opinion Score)
• 14 participants


                                                                                   7




AUDIO TRACK PRODUCTION
• Skype/SILK audio codec
   • 30s audio track
   • sentences without contextual connection
• Fixed-rate tracks (bitrate in kbps)
               5.6    9.5   13.3 17.2 21.1 25.0 28.9 32.8 36.6 40.6

• Variable-rate tracks




                                                                                      8
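
A variable-rate track can be pictured as a schedule that alternates between a high rate and a low rate every ΔT seconds over the 30 s track. The following is a minimal sketch of that idea, not the authors' tooling: the function names, the choice to start at the high rate, and the ΔT value of 5 s in the usage note are assumptions made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float, float]  # (start_s, end_s, bitrate_kbps)


def rate_schedule(high_kbps: float, low_kbps: float, delta_t: float,
                  duration: float = 30.0,
                  start_high: bool = True) -> List[Segment]:
    """Alternate between two coding bitrates every delta_t seconds.

    Starting at the high rate is an assumption; the slides do not say
    which rate a track begins with.
    """
    if delta_t <= 0 or duration <= 0:
        raise ValueError("delta_t and duration must be positive")
    segments: List[Segment] = []
    i = 0
    use_high = start_high
    # Index-based start times avoid accumulating floating-point drift.
    while i * delta_t < duration:
        start = i * delta_t
        end = min(start + delta_t, duration)  # last segment may be shorter
        segments.append((start, end, high_kbps if use_high else low_kbps))
        use_high = not use_high
        i += 1
    return segments


def average_bitrate(segments: List[Segment]) -> float:
    """Time-weighted average bitrate of a schedule, in kbps."""
    total = sum(end - start for start, end, _ in segments)
    return sum((end - start) * rate for start, end, rate in segments) / total
```

For example, `rate_schedule(40.6, 17.2, 5.0)` yields six 5 s segments whose time-weighted average is about 28.9 kbps, which is one of the fixed rates listed on this slide.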


RESULT
FIXED-RATE
• MOS vs. sending bitrate


User Variation

 Logarithmic
   Trend




                                                                                   9


RESULT
VARIABLE-RATE
• MOS - ΔT plot




                                                 Rate change
                                                   matters!




                                                                              10


EFFECT OF RATE CHANGE
FREQUENCY
• When ΔT varies…



                                                         Logarithmic
                                                           Trend




                                                                               11


EFFECT OF RATE CHANGE
MAGNITUDE
• When sharing the same average bitrate…



                                                          Magnitude ↑

                                                               MOS ↓




                                                                              12


EFFECT OF RATE CHANGE
MAGNITUDE
• However, with the same magnitude…


                                             Higher (hr + lr)




                Lower (hr + lr)




                                                                              13




SHORT SUMMARY
• Fixed-rate
  • MOS – bitrate → logarithmic


• Variable-rate
  • MOS – ΔT → logarithmic
  • MOS – (hr, lr)
          • hr - lr up → MOS down
          • hr + lr up → MOS up

                                                                              14




PROPOSED MODELS
• Fixed-rate model



• Variable-rate model




  Massive Data                                 Numerical Fitting

                                                                              15




LARGE-SCALE EXPERIMENT
• Same methodology


• 127 participants
   • Each track is scored by 30 participants


• Rate selection (bitrate in kbps)
  Rate      r9    r8    r7    r6    r5     r4     r3     r2     r1
  Bitrate   5.6   6.1   7.1   8.5   10.7   14.1   19.4   27.7   40.6

                                                                                 16




SIGNIFICANCE OF FACTORS
• ANOVA tests
  • MOS – sending bitrate
    Significant
  • Interaction between ΔT and (hr, lr)
    Significant
  • MOS - ΔT
               Test    p-value       Test     p-value     Test     p-value
               r1r2       .31        r6r7       .31       r7r8        .26
               r3r4       .42        r6r8       .11       r7r9        .34
               r4r5       .31        r6r9       .09       r8r9        .32
                                                                                     17


MODEL SPECIFICS
FIXED-RATE MODEL
• α=4.091, β=1.515, and γ=1.000


  • with R-square = 0.96

               Lower bound of user perception (?)




                  close to the lowest bitrate of SILK
                                                                                18


MODEL SPECIFICS
VARIABLE-RATE MODEL
• Logarithmic regression on each (hr, lr)
  pair

                      (r1, r2): p12 x ln(ΔT) + q12

                      (r1, r3): p13 x ln(ΔT) + q13

                      (r1, r4): p14 x ln(ΔT) + q14

                      (r1, r5): p15 x ln(ΔT) + q15
                                  :
                      SCALE()     :            SHIFT()
                                  :
                                                                              19
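
Each (hr, lr) pair gets its own curve of the form p × ln(ΔT) + q. A minimal sketch of such a fit, using ordinary least squares on ln(ΔT), might look like the following. The helper names are hypothetical, and the data points are synthetic values generated from p = 0.4 and q = 2.5, not measurements from the study. The slide layout suggests that the fitted p and q values are what SCALE() and SHIFT() then model.

```python
import math
from typing import Sequence, Tuple


def fit_log_trend(delta_ts: Sequence[float],
                  mos: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of mos ~ p * ln(delta_t) + q; returns (p, q)."""
    if len(delta_ts) != len(mos) or len(delta_ts) < 2:
        raise ValueError("need at least two (delta_t, mos) pairs")
    xs = [math.log(t) for t in delta_ts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(mos) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("delta_t values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mos))
    p = sxy / sxx
    q = mean_y - p * mean_x
    return p, q


def predict_mos(p: float, q: float, delta_t: float) -> float:
    """Evaluate the fitted curve at a given delta_t (seconds)."""
    return p * math.log(delta_t) + q


# Synthetic, noise-free points (NOT study data) to show the fit recovers p, q.
delta_ts = [1.0, 2.0, 5.0, 10.0, 15.0]
synthetic_mos = [0.4 * math.log(t) + 2.5 for t in delta_ts]
p, q = fit_log_trend(delta_ts, synthetic_mos)
```

With real scores the fit would not be exact, and the per-pair R-square would indicate how well the logarithmic form describes each curve.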


MODEL SPECIFICS
SCALE()
• Polynomial regression
  • x = hr – lr , y = hr + lr




                                                                              20

MODEL SPECIFICS
SHIFT()
• Independent of ΔT
• Basic idea
  • ΔT approaches the track duration
  • Fluctuation diminishes




                                                                                      21


EVALUATION
GOODNESS OF FIT
• Training data
  • R-square = 0.86




                                                                              22


EVALUATION
ACCURACY OF PREDICTION
• 2 datasets independent of the training data
  • Dataset I: Preliminary experiment
  • Dataset II: Additional (New) experiment




                                                                              23




PESQ
• Perceptual Evaluation of Speech Quality
• Limited spectrum
  • Narrow-band: 8 kHz
  • Wide-band: 16 kHz
  (SILK: 8, 12, 16 and 24 kHz)

• Requires both original and degraded
  audio files

                                                                              24


  COMPARISON WITH
  PESQ – FIXED RATE




 Model              Proposed     PESQ
 R-square           0.9601       0.7841
 Avg. Err. Ratio    3.68%        14.59%

                                                                                   25


  COMPARISON WITH
  PESQ – VARIABLE RATE




 Model              Proposed     PESQ
 R-square           0.2512       -0.3491
 Avg. Err. Ratio    8.03%        12.60%

                                                                                   26
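
The slides report R-square and an average error ratio without defining them. One common reading, assumed here, is the coefficient of determination 1 − SS_res/SS_tot and the mean of |predicted − observed| / observed. Under that definition, R-square becomes negative when the predictions fit worse than simply using the mean observed score, which would be consistent with the negative PESQ value in the variable-rate comparison. The scores used to exercise the functions are made up.

```python
from typing import Sequence


def r_square(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed definition; the slides do not spell out the formula.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values must not all be equal")
    return 1.0 - ss_res / ss_tot


def average_error_ratio(observed: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Mean of |predicted - observed| / observed (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) / o for o, p in zip(observed, predicted)) / len(observed)


# Made-up scores: a perfect prediction and a reversed one.
observed_mos = [1.0, 2.0, 3.0]
perfect = r_square(observed_mos, [1.0, 2.0, 3.0])   # best possible fit
reversed_fit = r_square(observed_mos, [3.0, 2.0, 1.0])  # worse than the mean
```

Because MOS values lie between 1 and 5, the division in `average_error_ratio` is safe for real opinion scores.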


COMPARISON ON
AMR-WB
• AMR-WB audio codec
  • Older Codec
  • Widely used in 3G network
  • 9 different coding bitrates
• User study experiment
  • Same methodology
  • 14 participants



                                                                              27


 COMPARISON ON
 AMR-WB




 [Figures: proposed model vs. PESQ]

 Model              Proposed     PESQ
 R-square           0.7878       0.6289
 Avg. Err. Ratio    2.18%        2.86%

                                                                                      28




CONCLUSION
• The logarithmic relationship (Weber-Fechner
  Law) is observed in the MOS-bitrate
  relationship of Skype/SILK
• Rate change frequency (W-F Law) and
  magnitude (complicated) have significant
  influence on perceived quality
• We have established both fixed- (SIGCOMM’12
  W-MUST) and variable-rate models
• User-centric rate adaptation for VoIP
  applications (coming next)
                                                                              29
Q&A




      30
MODEL SPECIFICS
SHIFT()
• Dominant quality
  • Expected quality when fluctuation diminishes
      • ΔT approaches the track length
      [Figure: MOS vs. ΔT (second), with the dominant quality marked]

                                                   31
MODEL SPECIFICS
SHIFT()
• Dominant quality: D()
                      hr = 14.1 kbps




                                       D() = MOSh




                                                32
MODEL SPECIFICS
SHIFT()
• hr > 14.1 kbps
  • normalized y-axis




                        33
MODEL SPECIFICS
SHIFT()
• Linear to the MOS difference (hr > 14.1 kbps)




                                                  34


Editor's Notes

  1. Hello everyone, my name is Cing-Yu Chu. Today, on behalf of our research group, I am going to introduce our paper "Modeling the QoE of Rate Changes in Skype/SILK VoIP Calls". It is joint work with Chien-Nan Chen, who is now at UIUC, and Su-Ling Yeh, Hao-Hua Chu and Polly Huang from National Taiwan University.
  2. Here is the outline of my presentation today. I will start with our motivation and then introduce the preliminary experiment we used to propose the models for quality prediction. We then determined the coefficients of these models using a large-scale experiment, and evaluated the derived models with both the training data and other datasets that are independent of the training data.
  3. So let's begin with what a Voice over IP, or VoIP, application is. As shown in this slide, VoIP applications allow users to send their voice data, or voice packets, into the Internet. The Internet then forwards these packets to the receiver, the other end user. In this process, we can easily see that the quality of VoIP applications is influenced by network conditions such as delay, jitter and packet loss. And all of these can actually be attributed to bandwidth fluctuation.
  4. Therefore, in order to deal with bandwidth fluctuation, rate adaptation has long been a classical topic in VoIP applications. Traditionally, to fully utilize the network resource, we ramp up the sending rate when there is extra bandwidth available. We believe this can improve the service quality. However, our concern here is whether the improvement in quality is proportional to the increase in sending rate. On the other hand, when the available bandwidth is insufficient, we have to tune down the sending rate to avoid congestion or packet loss. By ramping up and tuning down the sending rate, we actually introduce quality fluctuation, which we call a rate change event. So our second concern is whether such rate changes disturb users and make them unhappy.
  5. So, in this work, we tried to understand the relationship between the sending bitrate and the perceived quality. We are also interested in how rate change, in both magnitude and frequency, influences the perceived quality. We answer these questions by synthesizing VoIP calls with different sending bitrates and different rate change magnitudes and frequencies. We then conducted a series of user study experiments to see how real humans perceive these VoIP calls.
  6. At the end of this work, we found that the relationship between the sending bitrate and the perceived quality is logarithmic. The same logarithmic relationship can also be observed in the influence of rate change frequency. As for the rate change magnitude, we found it to be kind of complicated, but interesting, and we will talk about the details later. With these findings, we were able to derive closed-form models to predict the perceived quality.
  7. All of our observations are based on a preliminary experiment. The purpose of this preliminary experiment is to confirm whether the sending bitrate and the rate change magnitude and frequency really influence the perceived quality and, if so, how. In our work, we adopted the 5-level Mean Opinion Score (MOS) recommended by the ITU, where 5 is the best quality and 1 is the worst. We use MOS to represent the perceived quality throughout our work. In this experiment, a total of 14 participants were recruited.
  8. Here we have to explain how we produced the synthesized VoIP calls. Since Skype might be the most popular VoIP application, we chose it as our research target; SILK is the audio codec adopted by Skype in its latest version. The most desirable property of SILK is that it allows arbitrary coding bitrates from 5 kbps to 40 kbps, which makes it very suitable for investigating the influence of sending rate and rate change. We used SILK to encode and decode an original audio track. This audio track is 30 seconds long and is composed of several simple and meaningful sentences. After processing by SILK, we obtained the degraded audio tracks, which fall into two categories. The first is the fixed-rate tracks: we chose 10 evenly spaced rates between the minimum and maximum of SILK's coding bitrate, and used each of them to encode and decode the audio file. The second is the variable-rate tracks. A variable-rate track is defined by three parameters: high rate, low rate and delta T. We switched the coding bitrate between the high rate and the low rate every delta T to introduce quality fluctuation, or rate change.
  9. Here is the result of the fixed-rate case. The figure is the MOS-bitrate plot, where the x-axis is the bitrate and the y-axis is the MOS. From this figure, we can see that there is variation among users, which is indicated by the error bars of one standard deviation. However, we can still observe a trend if we look at the average user score, which is indicated by the red points. The trend we found here is a logarithmic relationship between the bitrate and the perceived quality.
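A logarithmic trend of this kind can be fitted with ordinary least squares on the logarithm of the bitrate. The sketch below assumes the natural logarithm and the form MOS = a * ln(bitrate) + b; the coefficient names `a` and `b` and the function name are illustrative, and the numbers in the demo are synthetic values generated from made-up coefficients, not results from the experiment.

```python
import math


def fit_log_model(bitrates_kbps, mos_scores):
    """Least-squares fit of mos = a * ln(bitrate) + b; returns (a, b)."""
    xs = [math.log(r) for r in bitrates_kbps]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(mos_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mos_scores))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


if __name__ == "__main__":
    # Synthetic, noise-free points from made-up coefficients (not study data).
    demo_rates = [5.0, 10.0, 20.0, 40.0]
    demo_mos = [1.5 * math.log(r) - 1.0 for r in demo_rates]
    print(fit_log_model(demo_rates, demo_mos))
```

With real ratings, the inputs would be the tested bitrates and the average user score for each of them.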
  10. As for the variable-rate case, the MOS-delta T plot here tells us how the rate change influences the perceived quality. Again, the red points are the average user scores, and they are the result of variable-rate tracks whose high rate is 40.6 kbps and low rate is 17.2 kbps. We can clearly observe that the perceived quality changes when we vary delta T, and that it becomes better when delta T is bigger. Furthermore, the three horizontal lines in this figure are the quality of the fixed-rate case with the bitrate equal to the high rate, the low rate and the average of the two. So it tells us that the quality of a variable-rate track is different from the quality at its average bitrate. Also, even though the bitrate of this variable-rate track is never below the low rate, its quality can be worse than that of the low rate when the rate change is rapid. So we can conclude that the rate change plays an important role!
  11. Then, let's look at the influence of frequency and magnitude separately. Here is the result of a few variable-rate tracks. Because the logarithmic regression fits all of these curves well, we simply conclude that the influence of rate change frequency is logarithmic.
  12. As for the rate change magnitude, here is the result of variable-rate tracks that share the same average bitrate. The upper curve has the smaller rate change magnitude, and its quality is always better than that of the lower one. This suggests that when the rate change magnitude is bigger, the quality is worse. However, does it mean the difference between the high rate and the low rate is the only factor that influences the quality? The answer is NO.
  13. Here are the cases where delta T is 3 seconds. This time the x-axis is the difference between the high rate and the low rate, while the y-axis is still the MOS. The red line is the result of tracks that have a higher high rate, and the blue line is the result of tracks that have a lower high rate. From this figure, we can see that even when the rate difference is the same, tracks whose high rate and low rate are both higher have better quality. So we can conclude that the quality is determined not only by the rate difference but also by the level of both the high rate and the low rate.
  14. Here is a short summary of all the findings from the preliminary experiment. Based on these findings, we then proposed 2 models for quality prediction.
  15. Because we have identified that the relationship between the bitrate and the perceived quality is logarithmic, we simply proposed a logarithmic model for the fixed-rate case. As for the variable-rate case, because we have found that the influence of rate change frequency is logarithmic, we proposed a logarithmic model in this form and split the impact of the high rate and the low rate into 2 components: SCALE and SHIFT. So what we have to do next is to collect a lot of data and use it to find all the coefficients in the proposed models.
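The slide formulas are not part of this transcript, so the following is only a reconstruction of the two model forms from what is said in the talk: a logarithmic fixed-rate model, and a variable-rate model in which SCALE multiplies the delta T term while SHIFT does not interact with delta T. The symbols a, b, r, r_H and r_L, and the use of the natural logarithm, are notation assumed here.

```latex
% Fixed-rate model: r is the coding bitrate in kbps; a and b are fitted coefficients.
\mathrm{MOS}_{\text{fixed}}(r) = a \ln r + b

% Variable-rate model: r_H and r_L are the high and low rates,
% \Delta T is the switching period.
\mathrm{MOS}_{\text{var}}(r_H, r_L, \Delta T)
  = \mathrm{SCALE}(r_H, r_L) \cdot \ln \Delta T + \mathrm{SHIFT}(r_H, r_L)
```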
  16. So, we conducted a large-scale experiment. The methodology is similar to the preliminary experiment, but this time we recruited 127 participants in total, so that each audio track is scored by 30 participants and the result is more reliable. Also, unlike in the preliminary experiment, this time we chose the coding bitrates based on the perceived quality. Since we already have the result from the preliminary experiment, we can divide the perceived quality evenly along the curve and find the corresponding bitrates. This helps us collect more data points in the region where the MOS changes fast. It also allows us to examine the result from a quality perspective instead of only from a bitrate perspective.
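Choosing bitrates by perceived quality amounts to inverting the fitted curve. The sketch below assumes the fixed-rate curve has the form MOS = a * ln(bitrate) + b; the function names are hypothetical and the coefficients in the demo are made up, since the fitted values are not given in this transcript.

```python
import math


def bitrate_for_mos(target_mos, a, b):
    """Invert mos = a * ln(rate) + b to get the rate in kbps."""
    return math.exp((target_mos - b) / a)


def quality_spaced_bitrates(a, b, min_kbps, max_kbps, count):
    """Bitrates whose predicted MOS values are evenly spaced."""
    mos_low = a * math.log(min_kbps) + b
    mos_high = a * math.log(max_kbps) + b
    step = (mos_high - mos_low) / (count - 1)
    return [bitrate_for_mos(mos_low + i * step, a, b) for i in range(count)]


if __name__ == "__main__":
    # Made-up coefficients, for illustration only.
    print(quality_spaced_bitrates(a=1.5, b=-1.0, min_kbps=5.0, max_kbps=40.0, count=4))
```

Under a logarithmic curve, even steps in MOS become geometric steps in bitrate, so more of the chosen rates land at the low end, where the MOS changes fastest.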
  17. After the data collection, we applied ANOVA tests to check whether the influence of each factor is significant. The result suggested that the sending rate has a significant influence. The interaction between delta T and the high rate and low rate is also significant, which supports multiplying delta T by SCALE. As for the influence of delta T itself, we found it to be significant in most cases. However, when the quality of the high rate and the low rate are very close to each other, the influence is not significant, because users seem to be unable to tell whether there is a rate change or quality fluctuation.
  18. OK, so, with the collected data, we found the coefficients of the fixed-rate model with a high r-square value. With this model, we found something interesting. If we set the MOS to 1, which means the worst quality, and use our model to find the corresponding bitrate, that bitrate is close to 5 kbps, which is also the minimum of SILK's coding rate. So we wonder whether this means Skype is actually aware of where the worst quality that users perceive lies. But of course, we don't know the answer.
  19. As for the variable-rate case, we first need the ground truth of both SCALE and SHIFT. We grouped all the variable-rate tracks by their high rate and low rate, so each group contains 5 different delta T values. We then applied logarithmic regression to each group, and the results of these regressions form the ground truth of SCALE and SHIFT.
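The grouping step can be sketched as one small regression per (high rate, low rate) pair. This assumes each group is fitted as MOS = SCALE * ln(delta T) + SHIFT with the natural logarithm; the function names and the tuple layout are hypothetical, and the delta T values and scores in the demo are made up rather than taken from the experiment.

```python
import math
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_scale_shift(tracks):
    """tracks: iterable of (high_kbps, low_kbps, delta_t_s, mos).

    Returns {(high_kbps, low_kbps): (scale, shift)}, one fit per group.
    """
    groups = defaultdict(list)
    for high, low, delta_t, mos in tracks:
        groups[(high, low)].append((math.log(delta_t), mos))
    result = {}
    for key, points in groups.items():
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        result[key] = fit_line(xs, ys)
    return result


if __name__ == "__main__":
    # Made-up delta T values and scores for a single group (not study data).
    demo = [(40.6, 17.2, dt, 0.4 * math.log(dt) + 2.5) for dt in (1, 2, 3, 5, 10)]
    print(fit_scale_shift(demo))
```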
  20. With this ground truth, we can now explore how SCALE and SHIFT interact with the high rate and low rate. As I mentioned earlier, the SCALE is determined not only by the rate difference but also by the level of both the high rate and the low rate. So we used 2 variables, x and y, to represent the difference and the level of the high rate and low rate. We then applied a polynomial regression to the collected data and got this 3D plot. With this 3D plot, we are able to observe how the SCALE changes with different high rates and low rates. To make this figure easier to explain and the trend clearer, we convert the 3D plot into a contour plot. Here, we can clearly see that the SCALE becomes larger when the rate difference increases. Actually, we would like to interpret the SCALE as the sensitivity to rate change. So a larger SCALE amplifies the influence of delta T. On the other hand, a smaller SCALE means the sensitivity to rate change is low, so users cannot perceive that there is a rate change. From this contour plot, we can also observe that when the level of both the high rate and the low rate is higher, the SCALE is smaller. This is because, in such cases, the quality of the high rate and the low rate is quite close, so users cannot really tell the difference, which leads to a lower sensitivity.
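A polynomial surface over the two variables can be fitted with plain least squares. The sketch below is not the regression from the paper: it assumes x is the rate difference, y is the mean of the two rates (the talk only says "level"), and a full quadratic surface (the degree is not stated in the talk). All function names are hypothetical, and the normal equations are solved with a small Gaussian elimination so the example needs no external library.

```python
def rate_features(high_kbps, low_kbps):
    """x = rate difference, y = rate level (taken as the mean; an assumption)."""
    return high_kbps - low_kbps, (high_kbps + low_kbps) / 2.0


def design_row(x, y):
    """Terms of a full quadratic surface in x and y (degree is an assumption)."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_scale_surface(samples):
    """samples: list of (high_kbps, low_kbps, scale). Returns 6 coefficients."""
    rows = [design_row(*rate_features(h, l)) for h, l, _ in samples]
    targets = [s for _, _, s in samples]
    k = 6
    # Normal equations: (A^T A) coeffs = A^T targets.
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    atb = [sum(row[i] * t for row, t in zip(rows, targets)) for i in range(k)]
    return solve(ata, atb)


def predict_scale(coeffs, high_kbps, low_kbps):
    """Evaluate the fitted surface at one (high rate, low rate) pair."""
    terms = design_row(*rate_features(high_kbps, low_kbps))
    return sum(c * t for c, t in zip(coeffs, terms))
```

The input samples would be the per-group SCALE values obtained from the regressions on each (high rate, low rate) group; at least six groups that do not all lie on one conic in (x, y) are needed for the quadratic fit to be determined.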
  21. And the last part of the variable-rate model is SHIFT. It is a term that doesn't interact with delta T. Because the derivation of SHIFT is more like a pure numerical fitting task, we would like to suggest that people who are interested in this part can refer to our paper for more detail.
  22. To evaluate our model, we first check whether the derived model captures the training data properly. Here the x-axis is the predicted score and the y-axis is the average user score, so the diagonal black line represents perfect prediction. As we can see from this figure, the points cluster tightly around the perfect-prediction line, which is also supported by a high r-square value. This means our model is able to capture the user perception well.
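For reference, an r-square value of this kind can be computed from paired observed and predicted scores as follows. This uses the common definition 1 - SS_res / SS_tot, which is an assumption; the talk does not state which variant was used, and the function name is illustrative.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A value of 1 means every point lies on the perfect-prediction diagonal, and a value of 0 means the model does no better than always predicting the mean observed score.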
  23. To see whether our model is still applicable when the content or the participants are different, we used 2 datasets that are independent of the training dataset. The first one is the data collected in the preliminary experiment, and the second one comes from an additional experiment whose audio content is different from that of the preliminary and large-scale experiments. The prediction accuracy is illustrated in this figure, and it suggests that our model is robust when the participants or the audio content are different.
  24. So, at the end of my presentation, I would like to conclude our work. We have identified that the relationship between the bitrate and the perceived quality is logarithmic, which echoes the well-known psychophysical law called the Weber-Fechner law. This law describes the relationship between the intensity of a stimulus and human perception as logarithmic. So, in our case, we can regard the bitrate as a kind of stimulus. We further explored the influence of rate change, in both frequency and magnitude. We then derived closed-form models for quality prediction. We are now interested in whether we can use these models to design a user-centric rate adaptation mechanism for VoIP applications, and that is what we are working on.
  25. Thanks for your attention. I am happy to take questions.