Quality Prediction for Speech-based Telecommunication Services

Published in: Technology

  1. 1. Quality Prediction for Speech-based Telecommunication Services. Sebastian Möller, Stefan Hillmann, Klaus-Peter Engelbrecht, Florian Hinterleitner, Friedemann Köster, Florian Kretzschmar, Matthias Schulz, Stefan Schaffer. Quality and Usability Lab, Telekom Innovation Laboratories, TU Berlin.
  2. 2. Agenda: Overview Quality & Usability Lab; Motivation; Speech Quality Prediction (Transmitted-speech Quality Prediction, TTS Quality Prediction, Dimension-based Quality Prediction); Spoken-dialogue Quality Prediction (Approach, Example, New Developments); Other Application Examples; Conclusions.
  3. 3. Motivation: Quality of Service (QoS) vs. Quality of Experience (QoE).
     System developer’s point of view:
       • Performance: “The ability of a unit to provide the function it has been designed for.” (Möller, 2005)
       • Quality of Service (QoS): “The collective effect of service performance which determines the degree of satisfaction of the user of the service.” (ITU-T Rec. E.800, 1994) Includes service support, service operability, serveability, and service security.
     User’s point of view:
       • Quality: “Result of appraisal of the perceived composition of the service with respect to its desired composition.” (ITU-T Rec. P.851, 2003, following Jekosch, 2000, 2005)
       • Quality of Experience (QoE): “The overall acceptability of an application or service, as perceived subjectively by the end user.” (ITU-T Rec. P.10, 2007) Includes the complete end-to-end system effects.
  4. 4. Motivation: Quality of Service (QoS) vs. Quality of Experience (QoE).
     Qualinet White Paper on Definitions of Quality of Experience (2012):
       • “Quality of Experience (QoE) is the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and / or enjoyment of the application or service in the light of the user’s personality and current state.”
       • Service: “An event in which an entity takes the responsibility that something desirable happens on the behalf of another entity.” (Dagstuhl Seminar 09192, May 2009)
       • Acceptability: “Acceptability is the outcome of a decision which is partially based on the Quality of Experience.” (Dagstuhl Seminar 09192, May 2009)
  5. 5. Motivation: Quality of Service (QoS) vs. Quality of Experience (QoE). [Figure: the service links the service provider and the user. Provider side: service design, quality elements, Quality of Service (QoS). User side: service perception, quality features, Quality of Experience (QoE).]
  6. 6. Motivation: Quality perception and judgment processes. [Figure: block diagram with the elements Physical Signal (Physical Nature), Response-Modifying Factors, Adjustment, Perception, Perceived Nature, Desired Nature, Anticipation, Reflexion, Perceived Quality Features, Desired Quality Features, Comparison and Judgment, Perceived Quality, Encoding, and User Quality Rating (Description).] (Jekosch, 2004; Raake, 2006)
  7. 7. Quality of Service (QoS) vs. Quality of Experience (QoE): Taxonomy. [Figure: three-layer taxonomy.
     Influencing factors: User (static factors, dynamic factors), Context (environmental factors, service factors), System (agent factors, functional factors).
     Quality of Service (QoS), interaction performance: output modality appropriateness, form appropriateness, contextual appropriateness, perceptual effort, cognitive workload, physical response effort, dialog management performance, input performance, interpretation performance, input modality appropriateness.
     Quality of Experience (QoE): interaction quality (output quality, input quality, cooperativity), system personality, aesthetics, appeal, learnability, intuitivity, efficiency, effectiveness, joy of use, ease of use, utility, usability, usefulness, hedonic and pragmatic aspects, acceptability.] (Möller et al., 2009)
  8. 8. Agenda: Overview Quality & Usability Lab; Motivation; Speech Quality Prediction (Transmitted-speech Quality Prediction, TTS Quality Prediction, Dimension-based Quality Prediction); Spoken-dialogue Quality Prediction (Approach, Example, New Developments); Other Application Examples; Conclusions.
  9. 9. Speech Quality Prediction: Transmitted-speech quality prediction. Approaches: [Figure: user factors (linguistic background, attitude, emotions, experience, motivation and goals) influence the subjective quality judgment of the transmission system; speech signals and system parameters feed a model that outputs an estimated quality index.]
  10. 10. Speech Quality Prediction: Transmitted-speech quality prediction. Signal-comparison approach: [Figure: natural speech passes through the transmission system, giving the degraded signal y(k). Reference and degraded signal are each pre-processed and mapped to internal representations (labelled x'(k) and y'(k)); the distance between them is averaged and transformed into a MOS estimate.] (Hauenstein, 1997)
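As a rough illustration of the signal-comparison chain on slide 10 (pre-processing, internal representation, distance, averaging, transformation to MOS), the following Python sketch uses frame-wise log energy as a stand-in internal representation. The function names, the frame length, and the exponential mapping onto the 1 to 5 MOS range are all assumptions made for this example; they are not taken from Hauenstein (1997) or from any standardized model.

```python
import math


def log_energy_frames(signal, frame_len=160):
    """Hypothetical 'internal representation': log energy of each frame."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return [math.log10(sum(s * s for s in frame) / frame_len + 1e-12)
            for frame in frames]


def predict_mos(reference, degraded, frame_len=160, slope=1.0):
    """Toy full-reference estimate: distance -> average -> transform to MOS."""
    x_rep = log_energy_frames(reference, frame_len)
    y_rep = log_energy_frames(degraded, frame_len)
    n = min(len(x_rep), len(y_rep))
    if n == 0:
        raise ValueError("signals are shorter than one frame")
    # Per-frame distance between the two internal representations.
    distances = [abs(a - b) for a, b in zip(x_rep, y_rep)]
    # Average the distance over time.
    avg_distance = sum(distances) / n
    # Illustrative transformation: zero distance maps to 5, large distance to 1.
    return 1.0 + 4.0 * math.exp(-slope * avg_distance)
```

Calling `predict_mos(ref, ref)` on identical signals gives the top of the scale, and any level or spectral change that alters the frame energies lowers the estimate. Real signal-comparison models use perceptually motivated representations rather than plain frame energy.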
  11. 11. Speech Quality Prediction: Transmitted-speech quality prediction. No-reference approach: [Figure: only the degraded signal y(k) from the transmission system is available. A reference generation stage produces x'(k); both signals are mapped to internal representations, whose distance is averaged and transformed into a MOS estimate. A parametric analysis additionally covers high additional noises, time-varying characteristics, and unnatural voice.] (ITU-T Rec. P.563, 2004)
  12. 12. Speech Quality Prediction: Transmitted-speech quality prediction. No-reference approach, reference generation: [Figure: LPC analysis of y(k) yields LPC coefficients and a residual signal; a vocal tract model turns the LPC coefficients into modified LPC coefficients; LPC synthesis produces x'(k).] (ITU-T Rec. P.563, 2004)
  13. 13. Speech Quality Prediction: TTS quality prediction. Signal-comparison approach for TTS quality prediction: [Figure: speech from the natural speech inventory and the synthesizer output y(k) are each pre-processed and mapped to internal representations (labelled x'(k) and y'(k)); the distance between them is averaged and transformed into a MOS estimate.] (Cernak & Rusko, 2005)
  14. 14. Speech Quality Prediction: TTS quality prediction. Parametric approach for TTS quality prediction: [Figure: text is fed to the synthesizer, giving y(k). A reference generation stage produces x'(k); both signals are mapped to internal representations, whose distance is averaged and transformed into a MOS estimate. A parametric analysis additionally covers high additional noises, time-varying characteristics, and unnatural voice.] (ITU-T Rec. P.563, 2004)
  15. 15. Speech Quality Prediction: Dimension-based quality prediction. [Figure: the signals before and after the transmission system are pre-processed into internal representations; comparison, integration, and transformation yield a MOS estimate, together with four dimension indicators: discontinuity (I_dis), noisiness (I_noi), coloration (I_col), and loudness (I_lou).] (… 2008; Wältermann et al., 2008b,c)
  16. 16. Speech Quality Prediction: Dimension-based quality prediction. [Figure: dimension analysis and integration framework. Processing blocks: listening, talking, conversational, and communication dimension analysis and integration, plus temporal integration. Dimensions: loudness, noisiness, discontinuity, coloration, echo degradation, sidetone degradation, double-talk degradation, interactivity, call set-up degradation, call completion degradation. Resulting quality levels: listening quality, talking quality, conversational quality, call quality, communication quality, service quality.]
  17. 17. Speech Quality Prediction: Dimension-based quality prediction. Listening dimension models:
       • Overall quality: ITU-T Rec. P.863, ITU-T Rec. G.107
       • Diagnostic Listening Quality Assessment (Coté, 2011)
     [Figure: dimension analysis and integration framework, as on slide 16.]
  18. 18. Speech Quality Prediction: Dimension-based quality prediction. Talking dimension models:
       • PESQM (Appel & Beerends, 2002)
       • Double-talk capabilities (ITU-T Rec. P.340): full duplex, partial duplex, no duplex
     [Figure: dimension analysis and integration framework, as on slide 16.]
  19. 19. Speech Quality Prediction: Dimension-based quality prediction. Conversation dimension models:
       • Gueguin et al. (2008)
     [Figure: dimension analysis and integration framework, as on slide 16.]
  20. 20. Speech Quality Prediction: Dimension-based quality prediction. Stability dimension models: call quality models (Weiss et al., 2009):
       • Averaging
       • Higher weight for negative events
       • Higher weight for close-to-call-final judgments
     [Figure: dimension analysis and integration framework, as on slide 16.]
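The three weighting ideas named on slide 20 (averaging, extra weight for negative events, extra weight for judgments near the end of the call) can be sketched as a weighted mean over per-segment quality ratings. This is a minimal sketch under stated assumptions, not the model of Weiss et al. (2009): the function name, the threshold that defines a "negative" segment, the linear recency ramp, and both weight values are hypothetical.

```python
def integrate_call_quality(ratings, negative_weight=2.0, recency_weight=2.0,
                           threshold=3.0):
    """Weighted mean of per-segment ratings (e.g. on a 1-5 scale).

    Segments rated below `threshold` count `negative_weight` times as much,
    and weights ramp linearly from 1 (first segment) to `recency_weight`
    (last segment). All parameter values are illustrative assumptions.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    n = len(ratings)
    weights = []
    for idx, rating in enumerate(ratings):
        weight = 1.0
        if rating < threshold:
            weight *= negative_weight       # negative events weigh more
        if n > 1:
            # later segments weigh more (recency effect)
            weight *= 1.0 + (recency_weight - 1.0) * idx / (n - 1)
        weights.append(weight)
    return sum(w * r for w, r in zip(weights, ratings)) / sum(weights)
```

With these settings a poor segment at the end of a call pulls the integrated score down more than the same segment at the start, which is the qualitative behavior the slide describes.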
  21. 21. Speech Quality Prediction: Dimension-based quality prediction.
     Communication dimension models: Kort (1983), predicting probabilities for
       • abandoning before dial tone
       • abandoning while dialing
       • abandoning before network response
       • terminating early
       • re-dialing
       • operator complaints
     Situational dimension models: Advantage factor in the E-model (ITU-T Rec. G.107).
     Service dimension models: to be developed, similar to call-quality models.
     [Figure: dimension analysis and integration framework, as on slide 16.]
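To show where the advantage factor mentioned on slide 21 enters, the sketch below follows the general structure of the E-model in ITU-T Rec. G.107: the transmission rating is R = Ro - Is - Id - Ie,eff + A, and R is then mapped to an estimated MOS. The function and argument names are chosen for this example, and the impairment values in any call are placeholders rather than values computed by the full model.

```python
def r_factor(ro, i_s, i_d, ie_eff, advantage=0.0):
    """Transmission rating R; the advantage factor A is simply added."""
    return ro - i_s - i_d - ie_eff + advantage


def mos_from_r(r):
    """Map a transmission rating R to an estimated MOS (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

Because A is added after all impairments are subtracted, it raises R without changing the impairment terms themselves.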
  22. 22. Speech Quality Prediction: Dimension-based quality prediction. Quality integration models:
       • p: n-dimensional vector of the perceptual event
       • q: n-dimensional vector of the desired event
       • Quality Q is derived from the weighted distance between the two vectors, with one weight $\alpha_i$ per dimension: $d(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{N} \alpha_i \, (p_i - q_i)^2$
     [Figure: dimension analysis and integration framework, as on slide 16.]
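Assuming the integration formula on slide 22 is a weighted sum of squared differences between the perceived vector p and the desired vector q, it can be written in a few lines of Python. The function name and the default of unit weights are assumptions for illustration; the slide does not say how the weights are obtained or how the distance is finally mapped to a quality score.

```python
def weighted_distance(p, q, weights=None):
    """Weighted squared distance between perceived (p) and desired (q) features."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of dimensions")
    if weights is None:
        weights = [1.0] * len(p)            # assumption: equal weights
    if len(weights) != len(p):
        raise ValueError("one weight per dimension is required")
    return sum(w * (pi - qi) ** 2 for w, pi, qi in zip(weights, p, q))
```

A distance of zero means the perceived features match the desired ones exactly; larger weights make deviations in that dimension count more.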
  23. 23. Speech Quality Prediction: Dimension-based quality prediction. [Figure: dimension analysis and integration framework, as on slide 16.]
  24. 24. Agenda: Overview Quality & Usability Lab; Motivation; Speech Quality Prediction (Transmitted-speech Quality Prediction, TTS Quality Prediction, Dimension-based Quality Prediction); Spoken-dialogue Quality Prediction (Approach, Example, New Developments); Other Application Examples; Conclusions.
  25. 25. Spoken-dialogue Quality Prediction: Principle. Approaches: [Figure: user factors (linguistic background, experience, task knowledge, flexibility, attitude, emotions, motivation and goals) influence the subjective quality judgment of the dialog system; speech signals, system parameters, and interaction parameters feed a model that outputs an estimated quality index.]
  26. 26. Spoken-dialogue Quality Prediction: MeMo Workbench. Idea:
       • Make assumptions about (models of) the behavior of user and application
       • Partially replace the user in initial evaluations by a user model
       • Set up a workbench for automated testing and usability prediction
     [Figure: System Behavior Model and User’s Mental Model feed the Simulation Engine Control Unit, which supports Automated Testing and Usability Prediction.]
  27. 27. Spoken-dialogue Quality Prediction: MeMo Workbench. For usable applications, three different world descriptions have to match:
       • User’s Mental Model: image the user has of the application (tasks to carry out, i.e. the user task model, and how to reach the task goal, i.e. the user interaction model)
       • System Interaction Model: model underlying the interaction, coded in the application
       • System Task Model: model of the task a user can perform with the help of the application
     [Figure: on the user side, the user’s mental model of the system (user task model, user interaction model); on the system side, the system task model and system interaction model.]
  28. 28. Spoken-dialogue Quality Prediction: MeMo Workbench. Workbench set-up:
       • Step 1: Model acquisition
       • Step 2: Workbench set-up
       • Step 3: Prediction algorithm derivation
       • Step 4: Interaction simulation & problem detection
     [Figure: System Task Model, System Interaction Model, User Interaction Model, User Task Model, and User Behavior Model feed the Simulation Engine Control Unit, followed by Problem Identification & Weighting and Usability Prediction (usability profile); further labels: Test User Training, Automatic Testing.]
  29. 29. Spoken-dialogue Quality Prediction: Example.
  30. 30. Spoken-dialogue Quality Prediction: Example.
       Parameter       Experiment, mean (σ)    Simulation, mean (σ)
       Turns           10.45 (3.11)            10.37 (1.87)
       Concepts        1.53 (0.27)             1.46 (0.11)
       Duration        208.68 (74.82)          200.69 (43.13)
       Deletions       1.06 (1.46)             1.38 (1.17)
       Insertions      0.29 (0.59)             0.09 (0.32)
       Substitutions   0.35 (1.02)             0.06 (0.07)
       CER             0.1 (0.12)              0.09 (0.07)
       No Match        1.16 (1.27)             1.84 (0.98)
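One simple way to read the comparison on slide 30 is the relative deviation of each simulated mean from the experimental mean. The sketch below copies a subset of the means from the slide; the dictionary layout and the function name are choices made for this example, and the slides do not state that the authors used this particular measure.

```python
# (experiment mean, simulation mean), copied from slide 30
MEANS = {
    "Turns": (10.45, 10.37),
    "Duration": (208.68, 200.69),
    "Deletions": (1.06, 1.38),
    "No Match": (1.16, 1.84),
}


def relative_deviation(experiment_mean, simulation_mean):
    """Signed deviation of the simulation from the experiment, as a fraction."""
    return (simulation_mean - experiment_mean) / experiment_mean


if __name__ == "__main__":
    for name, (exp_mean, sim_mean) in MEANS.items():
        print(f"{name}: {relative_deviation(exp_mean, sim_mean):+.1%}")
```

On these numbers the simulation tracks turn count and duration closely, while error-related counts such as deletions and no-match events deviate more.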
  31. 31. Spoken-dialogue Quality Prediction: New developments. Modality selection:
       • System model with multiple (serial) input modalities: which modality should be used for interaction?
       • Various factors influence users’ modality selection:
           • User side: familiarity/expertise, static/dynamic user attributes, cognitive workload
           • System side: errors (e.g. ASR), number of interaction steps
           • Task: complexity, dual-task, time
           • Environment: home, public
       • The user model needs a mechanism to adjust interaction probabilities
       • Study: investigating efficiency- and effectiveness-guided modality selection
  32. 32. Spoken-dialogue Quality Prediction: New developments. Modality selection: study & model data.
     [Figure: four panels, “Baseline Experiment, 0% ASR errors”, “Predicted Data, 20% ASR errors”, “Experiment 2, 10% ASR errors”, and “Experiment 2, 30% ASR errors”, with curves labelled Model and Human.]
       • x-axis: speech shortcut [interaction steps]
       • y-axis: speech usage [%]
       • Model input parameters: no. of interaction steps, ASR error rate
       • Mechanism will be integrated
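The slides name the two inputs of the modality-selection mechanism (number of interaction steps saved by the speech shortcut, ASR error rate) but not its form. Purely as an illustration of how such a mechanism could adjust interaction probabilities, the sketch below uses a logistic function; the functional form, the function name, and every coefficient are hypothetical and are not fitted to the study data.

```python
import math


def speech_usage_probability(steps_saved, asr_error_rate,
                             step_gain=1.0, error_cost=8.0, bias=-1.0):
    """Hypothetical probability that a simulated user picks speech input.

    More saved interaction steps raise the probability (efficiency),
    a higher ASR error rate lowers it (effectiveness).
    """
    utility = bias + step_gain * steps_saved - error_cost * asr_error_rate
    return 1.0 / (1.0 + math.exp(-utility))
```

A user model could draw the modality for each turn from this probability, so that speech becomes more likely as the shortcut grows and less likely as recognition errors increase.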
  33. 33. Agenda: Overview Quality & Usability Lab; Motivation; Speech Quality Prediction (Transmitted-speech Quality Prediction, Dimension-based Quality Prediction, TTS Quality Prediction); Spoken-dialogue Quality Prediction (Approach, Example, New Developments); Other Application Examples; Conclusions.
  34. 34. Other Application Examples: Tradeoff between usability and security.
       • A modified Tetris game was used to evaluate the tradeoff between usability and security.
       • The game was attacked by viruses that stole the user’s rows (rows could be saved to turn them into actual money).
       • Users could choose the security level:
           • High level: many false alarms, but a warning before every attack
           • Low level: fewer false alarms, but some attacks are not announced
       • Parameters such as security level changes and the number of collected rows were analyzed.
  35. 35. Other Application Examples: Results.
     Results of the user test:
       • More security level changes in the high attack likelihood condition
       • Earlier saving of rows in the high attack likelihood condition
       • Higher average security level in the high attack likelihood condition
     Simulation of the user behavior:
       • MeMo Workbench
       • Probabilistic and rule-based modeling
       • Good qualitative prediction of security level changes for both conditions
       • Good prediction of clearing rows up to seven rows
       • Prediction of other aspects needs improvement
       • Extension of the approach to more realistic scenarios (mWallet, others)
  36. 36. Quality Prediction of Speech-based Services: Conclusions. Modelling the human user: [Figure: the user interacting with the dialog system is described by a Perception Model, a Judgment Model, and a Description Model (leading to the subjective quality judgment), and by an Action Model and a Behavior Model, supported by a Model of References, a Model of Goals, and a Model of Experiences.]
  37. 37. Thank you for your attention! Visit www.qu.tu-berlin.de for more information.
