Quantifying Skype User Satisfaction
The success of Skype has inspired a generation of peer-to-peer-based solutions for satisfactory real-time multimedia services over the Internet. However, fundamental questions, such as whether VoIP services like Skype are good enough in terms of user satisfaction, have not been formally addressed. One of the major challenges lies in the lack of an easily accessible and objective index to quantify the degree of user satisfaction.

In this work, we propose a model, geared to Skype but generalizable to other VoIP services, to quantify VoIP user satisfaction based on a rigorous analysis of the call duration from actual Skype traces. The User Satisfaction Index (USI) derived from the model is unique in that 1) it is composed of objective source- and network-level metrics, such as the bit rate, bit rate jitter, and round-trip time; 2) unlike speech quality measures based on voice signals, such as the PESQ model standardized by ITU-T, the metrics are easily accessible and computable for real-time adaptation; and 3) the model development requires only network measurements, i.e., no user surveys or voice signals are necessary. Our model is validated by an independent set of metrics that quantifies the degree of user interaction from the actual traces.

Transcript

  • 1. Quantifying User Satisfaction
    Sheng‐Wei (Kuan‐Ta) Chen, Institute of Information Science, Academia Sinica
    Collaborators: Chun‐Ying Huang, Polly Huang, Chin‐Laung Lei (National Taiwan University)
    2008/10/30
  • 2. Motivation
    Are users satisfied with our system? We can ask via a user survey, observe the market response, or compute a user satisfaction metric.
    To make a system self‐adaptable in real time for a better user experience, we need such a metric: a Quality‐of‐Experience (QoE) metric!
    Kuan‐Ta Chen  / Quantifying Skype User Satisfaction (MRA 2008)
  • 3. QoE metrics
    FTP applications: data throughput rate
    Web applications: response time and page load time
    VoIP applications: voice quality (fidelity, loudness, noise), conversational delay, echo
    Online games: interactivity, responsiveness, consistency, fairness
    QoE is multi‐dimensional, especially for real‐time interactive applications!
  • 4. What path should Skype choose?
    Which path through the Internet is "the best"?
    path | avail. bandwidth | loss rate | delay
    1    | 10 Kbps          | 2%        | 100 ms
    2    | 20 Kbps          | 1%        | 300 ms
    3    | 30 Kbps          | 3%        | 500 ms
  • 5. QoS and QoE
    QoS (Quality of Service): the quality level of a "native" performance metric.
      Communication networks: delay, loss rate
      Voice/audio codecs: fidelity
      DBMS: query completion time
    QoE (Quality of Experience): how users "feel" about a service.
      Usually multi‐dimensional, with tradeoffs between dimensions (download time vs. video quality, responsiveness vs. smoothness).
      However, a unified (scalar) index is normally desired!
  • 6. A typical relationship between QoS and QoE
    (Figure: QoE vs. QoS, e.g., network bandwidth.) At the low end it is hard to tell "very bad" from "extremely bad"; at the high end the marginal benefit of additional QoS is small.
  • 7. Mapping between QoS and QoE
    Which QoS metric is most influential on users' perceptions (QoE)? Source rate? Loss? Delay? Jitter? A combination of the above?
  • 8. How to measure QoE: A quick review
    Subjective evaluation procedures: human studies; not scalable; costly!
    Objective evaluation procedures: statistical models based on subjective evaluation results.
      Pros: computation without human involvement.
      Cons: (over‐)simplifications of model parameters, e.g., using a single "loss rate" to capture the packet loss process, or assuming every voice/video packet is equally important; external effects such as loudness and handset quality are not considered.
  • 9. Subjective Evaluation Procedures
    Single Stimulus Method (SSM)
    Single Stimulus Continuous Quality Evaluation (SSCQE)
    Double Stimulus Continuous Quality Scale (DSCQS)
    Double Stimulus Impairment Scale (DSIS)
  • 10. Objective Evaluation Methods
    Referenced models (original signals required):
      Speech‐layer model: PESQ (ITU‐T P.862) compares original and degraded signals.
    Unreferenced models (no original signals required):
      Speech‐layer model: P.VTQ (ITU‐T P.563) detects unnatural voices, noise, and mutes/interruptions in degraded signals.
      Network‐layer model: the E‐model (ITU‐T G.107), a regression model based on delay, loss rate, and 20+ variables. The equations are over‐complex for physical interpretation, e.g.,
        Is = 20 · { [1 + (Xolr / 8)^8]^(1/8) − Xolr / 8 },  where Xolr = OLR + 0.2 (64 + No − RLR)
  • 11. Our goals
    An objective QoE assessment framework:
      passive measurement (thus scalable)
      easy to construct models (for your own application)
      easy‐to‐access input parameters
      easy to compute in real time
  • 12. Our contributions
    USI = 2.15 × log(bit rate) − 1.55 × log(jitter) − 0.36 × RTT
      bit rate: data rate of voice packets
      jitter: receiving‐rate jitter (level of network congestion)
      RTT: round‐trip time between the two parties
    An index for Skype user satisfaction that is derived from real‐life Skype call sessions, verified by users' speech interactivity in calls, and accessible and computable in real time.
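The USI formula can be checked numerically. A minimal sketch, assuming natural logarithms, bit rate and jitter in Kbps, and RTT in seconds; these units reproduce the USI values given for the multi‐path scenario later in the talk:

```python
import math

def usi(bit_rate_kbps, jitter_kbps, rtt_s):
    """User Satisfaction Index from slide 12.

    Assumed units: natural logs, bit rate and jitter in Kbps, RTT in
    seconds (these reproduce the path values on slide 35).
    """
    return (2.15 * math.log(bit_rate_kbps)
            - 1.55 * math.log(jitter_kbps)
            - 0.36 * rtt_s)

# The three candidate paths from the multi-path scenario slide:
print(round(usi(10, 2, 0.1), 2))  # 3.84
print(round(usi(20, 1, 0.3), 2))  # 6.33
print(round(usi(30, 3, 0.5), 2))  # 5.43
```

That the three tabulated USI values come out exactly confirms the assumed units.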
  • 13. Talk outline
    The Question, Measurement, Modeling, Validation, Significance
  • 14. Setting things up
    (Figure: an L3 switch with uplink port mirroring feeds a traffic monitor; a dedicated Skype node carries relayed traffic.)
  • 15. Capturing Skype traffic
    1. Identify Skype hosts and ports: track hosts sending HTTP to "ui.skype.com", then track the ports on which they send UDP within 10 seconds, yielding (host, port) pairs; also track the other parties that communicate with the discovered host‐port pairs.
    2. Record packets whose source or destination is among these (host, port) pairs. This reduces the number of traced packets to 1‐2%.
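The two‐step capture heuristic above can be sketched as follows. The `Pkt` record and its field names are hypothetical stand‐ins; the real system classified live packets from a mirrored switch port:

```python
# Sketch of the two-step Skype-capture heuristic (slide 15).
# Pkt is a hypothetical packet record, not the actual trace format.
from collections import namedtuple

Pkt = namedtuple("Pkt", "time proto src sport dst dport http_host")

def skype_endpoints(packets, window=10.0):
    """Step 1: find (host, port) pairs likely used by Skype."""
    login_time = {}   # host -> time of its HTTP request to ui.skype.com
    endpoints = set()
    for p in packets:
        if p.proto == "http" and p.http_host == "ui.skype.com":
            login_time[p.src] = p.time
        elif p.proto == "udp" and p.src in login_time:
            if 0 <= p.time - login_time[p.src] <= window:
                endpoints.add((p.src, p.sport))
    # Also keep the remote parties that talk to discovered endpoints.
    for p in packets:
        if p.proto == "udp" and (p.dst, p.dport) in endpoints:
            endpoints.add((p.src, p.sport))
    return endpoints

def record(packets, endpoints):
    """Step 2: keep only packets touching a discovered endpoint."""
    return [p for p in packets
            if (p.src, p.sport) in endpoints or (p.dst, p.dport) in endpoints]
```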
  • 16. Extracting Skype calls
    1. Take sessions whose average packet rate is within (10, 100) pkt/sec, whose average packet size is within (30, 300) bytes, and which last longer than 10 seconds.
    2. Merge two sessions into one relayed session if they share a common relay node, their start and finish times are within 30 seconds of each other, and their packet‐rate series are correlated.
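The merge rule in step 2 could be sketched as below. The session representation is hypothetical, and the 0.5 correlation cutoff is an assumed illustration (the slide states the 30‐second limit but no correlation threshold):

```python
# Sketch of the relay-session merge rule (slide 16, step 2).
# Sessions are hypothetical dicts; min_corr=0.5 is an assumed threshold.
def correlation(xs, ys):
    """Pearson correlation of two equal-length-truncated rate series."""
    n = min(len(xs), len(ys))
    xs, ys = xs[:n], ys[:n]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def should_merge(s1, s2, max_gap=30.0, min_corr=0.5):
    """Merge two sessions into one relayed session?"""
    return (s1["relay"] == s2["relay"]
            and abs(s1["start"] - s2["start"]) <= max_gap
            and abs(s1["finish"] - s2["finish"]) <= max_gap
            and correlation(s1["rates"], s2["rates"]) >= min_corr)
```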
  • 17. Probing RTTs
    While taking traces, we send ICMP pings, application‐level pings, and traceroutes at exponentially distributed intervals.
  • 18. Trace Summary
    Category | Calls | Hosts | Avg. Time
    Direct   | 253   | 240   | 29 min
    Relayed  | 209   | 369   | 18 min
    Total    | 462   | 570   | 24 min
    (Figure: direct sessions go from the campus network to the Internet; relayed sessions pass through the dedicated Skype node behind the port‐mirrored L3 switch and traffic monitor.)
  • 19. Talk outline
    The Question, Measurement, Modeling, Validation, Significance
  • 20. The intuition behind our analysis
    The conversation quality (i.e., QoE) perceived by the call parties is more or less related to the call duration, while the network conditions of a VoIP call are independent of the importance of the talk content, the call parties' schedules, the call parties' talkativeness, and other incentives to talk (e.g., free of charge).
  • 21. First, getting a better sense
    Which factors are correlated with call duration (our QoE proxy)?
    Service level: source rate, relayed?, TCP / UDP?
    Network quality (QoS factors): jitter, RTT
  • 22. Is call duration related to each factor?
    For each factor: draw a scatter plot of the factor against call duration to see whether they are positively, negatively, or not correlated; then use hypothesis tests to confirm whether they are indeed so.
  • 23. Call duration vs. jitter (std. dev. of received bytes/sec)
    (Figure: average call duration (min) vs. average jitter (Kbps), with a 95% confidence band of the average.) There are short calls with low jitter, but the average shows a negative correlation between the two variables.
  • 24. Effect of Jitter – Hypothesis Testing
    Null hypothesis: all the survival curves (the probability distributions of hanging up a call) are equivalent. Log‐rank test: P < 1e‐20, so we have > 99.999% confidence claiming that jitter is correlated with call duration.
  • 25. Effect of Source Rate (the bandwidth Skype intended to use)
    (Figure: average session time (min) vs. source rate.)
  • 26. The better sense
    factor                      | correlated with call duration?
    source rate (service level) | positive
    relayed?                    | negative
    TCP / UDP?                  | none
    jitter (network quality)    | negative
    RTT                         | negative (non‐significant)
  • 27. Linear regression? No!
    Reasons: the assumptions no longer hold (errors are not independent, not normally distributed, and their variance is not constant), and the data are censored: some calls had already been going on for a while, and some had not yet finished by the time we terminated tracing. We cannot simply discard these calls; otherwise we end up with a biased set of calls with limited call duration.
  • 28. Cox regression modeling
    The Cox regression model provides a good fit for, e.g., the effect of a treatment on patients' survival time. The log‐hazard function is proportional to a weighted sum of the factors:
      log h(t | Z) ∝ βᵀZ
    where Z is the vector of factors (bit rate = x, jitter = y, RTT = z, …) and β their weights.
    The hazard function (conditional failure rate) is the instantaneous rate at which failures occur among observations that have survived to time t:
      h(t) = lim_{Δt → 0} Pr[t ≤ T < t + Δt | T ≥ t] / Δt
  • 29. Functional Form Checks
    The assumption h(t | Z) ∝ exp(βᵀZ) must be verified, so we explore the "true" functional forms of the factors with generalized additive models. Human beings are known to be sensitive to the scale of a physical quantity rather than its magnitude, e.g., sound (decibels vs. intensity), the musical staff (distance vs. frequency), and star magnitudes (magnitude vs. brightness). Hence bit rate and jitter enter the model in log scale.
  • 30. The Logarithm Fits Better (Bit rate)
    (Figure: the fitted functional form after taking the logarithm.)
  • 31. The Logarithm Fits Better (Jitter)
    (Figure: the fitted functional form after taking the logarithm.)
  • 32. Final model & interpretation
    variable      | coef  | std. err. | signif.
    log(bit rate) | −2.15 | 0.13      | < 1e−20
    log(jitter)   | 1.55  | 0.09      | < 1e−20
    RTT           | 0.36  | 0.18      | 4.3e−02
    Interpretation: let A have bit rate = 20 Kbps and B have bit rate = 15 Kbps, with all other factors the same as A. The hazard ratio between B and A is exp((log(15) − log(20)) × −2.15) ≈ 1.86, i.e., at any instant the probability that B hangs up is 1.86 times the probability that A does.
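The hazard‐ratio arithmetic in the interpretation can be verified directly (natural logarithms assumed):

```python
import math

# Hazard ratio between caller B (bit rate 15 Kbps) and caller A (20 Kbps),
# using the log(bit rate) coefficient -2.15 from the final Cox model.
coef_log_bitrate = -2.15
hazard_ratio = math.exp((math.log(15) - math.log(20)) * coef_log_bitrate)
print(round(hazard_ratio, 2))  # 1.86
```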
  • 33. Hang‐up rate and USI
    Hang‐up rate (log‐hazard) = −2.15 × log(bit rate) + 1.55 × log(jitter) + 0.36 × RTT
    User Satisfaction Index (USI) = −Hang‐up rate = 2.15 × log(bit rate) − 1.55 × log(jitter) − 0.36 × RTT
  • 34. Actual and Predicted Time vs. USI
    (Figure: average session time (min) against USI, actual vs. predicted.)
  • 35. The multi‐path scenario
    path | avail. bandwidth | jitter | RTT    | USI
    1    | 10 Kbps          | 2 Kbps | 100 ms | 3.84
    2    | 20 Kbps          | 1 Kbps | 300 ms | 6.33
    3    | 30 Kbps          | 3 Kbps | 500 ms | 5.43
    BUT, is call hang‐up rate a good indication of user satisfaction?
  • 36. Talk outline
    The Question, Measurement, Modeling, Validation, Significance
  • 37. User satisfaction: Validation
    Intuition: call duration <-> satisfaction. Not confirmed yet.
  • 38. User satisfaction: One step further
    Now we check the intuition via speech interactivity: are speech activities interactive and tight in a cheerful conversation?
  • 39. Identifying talk bursts
    The problem: every voice packet is encrypted with 256‐bit AES (Advanced Encryption Standard).
    Possible solutions: packet rate (but Skype performs no silence suppression) or packet size (our choice).
  • 40. What we need to achieve
    Input: a time series of packet sizes. Output: estimated ON/OFF periods (ON = talk, OFF = silence).
  • 41. Speech activity detection
    1. Wavelet de‐noising: remove high‐frequency fluctuations.
    2. Detect peaks and dips.
    3. Dynamic thresholding: decide the beginning/end of each talk burst.
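A toy illustration of this pipeline, not the authors' algorithm: the sketch substitutes a moving average for wavelet de‐noising and a single midpoint threshold for peak/dip detection with dynamic thresholds, so it only mimics the overall ON/OFF segmentation:

```python
# Toy ON/OFF detection on a packet-size series (cf. slide 41).
# Moving average and a static midpoint threshold stand in for the talk's
# wavelet de-noising and dynamic thresholding.

def moving_average(xs, w=5):
    """Crude smoother standing in for wavelet de-noising."""
    half = w // 2
    out = []
    for i in range(len(xs)):
        window = xs[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def detect_on_off(sizes, w=5):
    """Return (start, end, is_on) segments over sample indices."""
    smooth = moving_average(sizes, w)
    thresh = (max(smooth) + min(smooth)) / 2  # static stand-in threshold
    flags = [s >= thresh for s in smooth]
    segments, start = [], 0
    for i in range(1, len(flags) + 1):
        if i == len(flags) or flags[i] != flags[start]:
            segments.append((start, i, flags[start]))
            start = i
    return segments

# Toy series: silence (~40-byte packets), a talk burst (~120 bytes), silence.
series = [40] * 20 + [120] * 20 + [40] * 20
print(detect_on_off(series))
```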
  • 42. Speech detection algorithm: Validation
    The speech detection algorithm is validated with synthesized sine waves (500 Hz – 2000 Hz) and real speech recordings. To force the packet‐size processes to be contaminated by serious network impairment (delay and loss), we play the sound through a relay node chosen by Skype (average RTT: 350 ms, jitter: 5.1 Kbps) and capture the resulting packet‐size processes.
  • 43. Validation with synthesized sine waves
    (Figure: true vs. estimated ON periods.) 3 runs for each of 10 test cases; correctness (ratio of matched 0.1‐second periods): 0.73 – 0.92.
  • 44. Validation with speech recordings
    (Figure: true vs. estimated ON periods.) 3 runs for each of 3 test cases; correctness (ratio of matched 0.1‐second periods): 0.71 – 0.85.
  • 45. Speech interactivity analysis
    Responsiveness: whether the other party responds.
    Response delay: how long before the other party responds.
    Burst length: how long a speech burst lasts.
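These three metrics could be computed from the detected talk bursts roughly as follows. The burst representation and the 3‐second response window are hypothetical assumptions, not taken from the talk:

```python
# Hypothetical sketch of the three interactivity metrics (slide 45).
# Bursts are (start, end) intervals in seconds; "responds" means the other
# party starts a burst within an assumed response window after ours ends.

def interactivity(bursts_a, bursts_b, window=3.0):
    """Return (responsiveness, avg. response delay, avg. burst length)."""
    responded, delays = 0, []
    for _, end in bursts_a:
        # First burst by B starting after A's burst ends, within the window.
        starts = [s for s, _ in bursts_b if end <= s <= end + window]
        if starts:
            responded += 1
            delays.append(min(starts) - end)
    responsiveness = responded / len(bursts_a) if bursts_a else 0.0
    avg_delay = sum(delays) / len(delays) if delays else None
    all_bursts = bursts_a + bursts_b
    avg_burst = sum(e - s for s, e in all_bursts) / len(all_bursts)
    return responsiveness, avg_delay, avg_burst

a = [(0.0, 2.0), (6.0, 8.0)]
b = [(2.5, 5.0), (9.0, 10.0)]
print(interactivity(a, b))
```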
  • 46. USI vs. Speech interactivity
    Higher USI goes with higher responsiveness, shorter response delay, and shorter burst length. All are statistically significant (at the 0.01 significance level). Speech interactivity in conversation supports the proposed USI.
  • 47. Talk outline
    The Question, Measurement, Modeling, Validation, Significance
  • 48. Implications
    We should pay more attention to delay jitter (rather than focusing on network delay only) and to the encoding bit rate!
  • 49. Significance
    QoE‐aware systems can optimize user experience at run time. Is it worth sacrificing 20 ms of latency to reduce jitter by 10 ms (say, with a de‐jitter buffer)? Pick the most appropriate parameters at run time: playout scheduling (buffer time), coding scheme (& rate), source rate, data path (overlay routing), transmission scheme (redundancy, erasure coding, …).
  • 50. Future work (1)
    Measurement: larger data sets (p2p traffic is hard to collect), diverse locations.
    Validation: user studies, comparison with existing models (PESQ, etc.).
  • 51. Future work (2)
    Beyond "call duration": call behavior (who hangs up a call? call disconnect‐and‐reconnect behavior) and more sophisticated modeling (voice codec, pricing effect, time‐of‐day effect, time‐dependent impact).
  • 52. Thank You!
    Sheng‐Wei (Kuan‐Ta) Chen, http://www.iis.sinica.edu.tw/~swc