Assessing 3DTV QoE and beyond a look on testing methodologies



  1. 1. Colloquium on Quality of Experience in Multimedia Systems and Services, Klagenfurt. Assessing 3DTV QoE and beyond: a look on testing methodologies. Patrick Le Callet, November 2012.
  2. 2. Stereo images QA: preliminary approach. Depth => disparity; coding/decoding.
  3. 3. Stereo images QA
  4. 4. S-3D Objective Metric: a first approach. "Quality of stereoscopic images", A. Benoit, P. Le Callet, P. Campisi and R. Cousseau, EURASIP Journal on Image and Video Processing, special issue on 3D Image and Video Processing, vol. 2008, doi:10.1155/2008/659024, 2008.
  5. 5. Subjective assessment. Viewing conditions: ITU Recommendation. Score: quality scale, SAMVIQ.
  6. 6. Performance? 2D metrics are able to estimate the visual quality of 3D content ... if the latter is measured subjectively with the usual protocols! => Need to define 3D QoE.
  7. 7. S-3DTV quality: new issues. Quality & 3D => what should be measured? S-3D needs to be accepted by the end user ... some have already announced its death:
      - R. Ebert, “Why i hate 3-d (and you should too),” Newsweek, May 2010. [Online]. Available: hate-3-d-and-you-should-too.html
      - M. Kermode, “Come in number 3d, your time is up,” BBC News, December 2009. [Online]. Available: in number 3d your time is.html
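A minimal sketch of how ratings collected on a five-grade quality scale are commonly summarised as a mean opinion score (MOS) with an approximate 95% confidence interval. The label-to-number mapping, the sample ratings and the normal-approximation interval are illustrative assumptions, not values from the slides.

```python
import math
from statistics import mean, stdev

# Five-grade quality labels mapped to integers (assumed mapping for illustration).
SCALE = {"Bad": 1, "Poor": 2, "Fair": 3, "Good": 4, "Excellent": 5}


def mos_with_ci(labels, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    scores = [SCALE[label] for label in labels]
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two ratings")
    half_width = z * stdev(scores) / math.sqrt(n)
    return mean(scores), half_width


# Hypothetical ratings from six observers for one test condition.
ratings = ["Good", "Fair", "Good", "Excellent", "Fair", "Good"]
mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With few observers, a Student-t quantile would be a more careful choice than the fixed z value used here.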
  8. 8. Depth: S-3D is cheating our perception
  9. 9. S-3D principle: playing with binocular depth
  10. 10. Depth: S-3D is cheating our perception. Binocular cues: binocular disparity and vergence ... but also monocular cues: accommodation, blur, texture gradient, shadows, perspective, relative size, motion parallax.
  11. 11. Depth cues ... sensitivity. Figure adapted from Cutting & Vishton 1995.
  12. 12. S-3D is cheating our perception. Depth cues combination: correlation vs ambiguity ... dominance, dissociation, reinterpretation. Question: cue enhancement without reliability with the others => cognitive load?
  13. 13. S-3DTV quality: new issues. Quality & 3D => what should be measured? S-3D needs to be accepted by the end user:
      - Comfortable viewing experience
      - Added value compared to 2D services => enhanced experience (immersiveness, naturalness)
      => Moving from visual quality evaluation to Quality of Experience evaluation.
  14. 14. Definition of Quality of Experience. Quality of Experience (QoE) is the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and / or enjoyment of the application or service in the light of the user’s personality and current state. [Qualinet White Paper on Definitions of Quality of Experience (2012). European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003).]
  15. 15. S-3DTV: from Visual Quality to visual Quality of Experience. Quality & 3D => multidimensional. 3D visual experience (Seuntiëns 2006) + visual fatigue? Towards QoE of 3DTV. Step 1: how to measure it with observers? Step 2: objective metric.
  16. 16. Measuring 3D QoE
  17. 17. Measuring QoE: explorative studies. Explorative studies (focus groups): feelings and reactions towards 3D services are explored. Attributes assessed in the literature (attribute: method; scale; references):
      - Texture quality and sharpness: Single Stimulus; ITU-R quality scale with or without adjectives; (Seuntiëns, 2006), (Seuntiëns et al., 2006), (Lambooij et al., 2011)
      - Amount of depth: Single Stimulus; numerical scale (0-5); (Seuntiëns, 2006), (Lambooij et al., 2011), (Strohmeier et al., 2010)
      - Quality of depth: Single Stimulus, Pair Comparison; numerical scale (0-10); (IJsselsteijn et al., 2000), (Barkowsky et al., 2009)
      - Visual comfort, eye strain and visual annoyance: Single Stimulus, SSCQE, DSIS; ITU impairment and quality scale, impairment scale adapted from ITU; (Wöpking, 1992), (Yano et al., 2002), (Kooi and Toet, 2004), (Seuntiëns et al., 2006)
  18. 18. Measuring QoE: explorative studies (continued; attribute: method; scale; references):
      - Visual fatigue: questionnaire, objective measurement (e.g. EEG); (Yano et al., 2004), (Hyung-Chul et al., 2008), (Li et al., 2008), (Emoto et al., 2004)
      - Viewing experience, overall image quality, visual experience: Single Stimulus; ITU quality scale; (Seuntiens et al., 2005), (Seuntiëns et al., 2006), (Seuntiëns, 2006), (Lambooij et al., 2011), (Goldmann et al., 2010b), (Goldmann et al., 2010a), (Strohmeier et al., 2010)
      - Naturalness: Single Stimulus; numerical scale (0-10), ITU quality scale; (IJsselsteijn et al., 2000), (Seuntiens et al., 2005), (Seuntiëns, 2006), (Lambooij et al., 2011)
      - Presence and enjoyment: Single Stimulus; ITU quality scale; (Seuntiëns, 2006)
  19. 19. Measuring QoE: ITU status. Video: ITU-R BT.1438 Recommendation => lack of specification of new characteristics for assessing S-3DTV. ITU-R WP6 and ITU-T SG9 have addressed Questions Q.2 and Q.12 and are making progress on the (draft) recommendations (recommendation: title; content):
      - ITU-R BT.[3DTV SubMEth]: Subjective Methods for the Assessment of Stereoscopic Three-Dimensional Television (3DTV) systems; recommendation covering subjective assessment methods for 3DTV
      - ITU-T P.3D-sam: Subjective assessment methods for 3D video quality; recommendation regarding 3D assessment methods for the current 3D environment
      - ITU-T J.3D-fatigue: Assessment methods of visual fatigue and safety guideline for 3D video; visual fatigue and safety assessment guideline for 3D video
      - ITU-T J.3D-disp-req: Display requirements for 3D video quality assessment; requirements for displays used for 3D assessment testing
  20. 20. Test methods: scales?
  21. 21. Multidimension: measuring on different scales. Multidimensional: assessment of several attributes by the end user. (Figure labels: visual quality rated Excellent / Good / Fair / Poor / Bad; visual comfort: eye strain, headache, etc.; 3D perception: depth, naturalness, presence, visual experience, etc.)
      • Several scales => several interpretations. Inter-observer variability?
      • Image quality measurement may be copied from traditional methods
      • New scale attributes have to be developed for the other scales
      * "New requirement of subjective video quality assessment methodologies for 3DTV", VPQM 2010, Wei Chen, Jérôme Fournier, Marcus Barkowsky, Patrick Le Callet, Orange Labs R&D - IRCCyN
  22. 22. Choosing attributes.
      User-centered approach: cf. OPQ (Open Profiling of Quality) [Strohmeier, D., Jumisko-Pyykko, S., Kunze, K., & Bici, M. O. (2011). The Extended-OPQ Method for User-Centered Quality of Experience Evaluation: A Study for Mobile 3D Video Broadcasting over DVB-H. EURASIP Journal on Image and Video Processing, 2011(1)]
      Fixed attributes: visual experience, naturalness, depth rendering, 2D image quality, depth quantity, visual comfort [P. Seuntiëns, "Visual experience of 3D TV," PhD Thesis, Eindhoven University of Technology, 2006] [Chen, W., Fournier, J., Barkowsky, M., & Le Callet, P. (2012). Exploration of Quality of Experience of Stereoscopic Images: Binocular Depth. VPQM]
  23. 23. Scale interpretation & observer variability. Joint quality and comfort ratings for 4 observers.
      * "Towards a Framework of Inter-Observer Analysis in Multimedia Quality Assessment", QoMEX 2011, Ulrich Engelke, Yohann Pitrey, Patrick Le Callet
  24. 24. Scales vs pair comparison
      • Observers are always capable of indicating a preference when confronted with two samples, while they may not be able to project their decision onto scale values
      • Conversion to scale values is possible using Bradley-Terry or Thurstone-Mosteller models
      • Paired comparison experiments may be conducted as:
        - Time-sequential presentation: difficult for observers if conditions are close
        - Side-by-side presentation on two displays: exact calibration of screens and temporal synchronization of playback is required
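The conversion from pair-comparison counts to scale values mentioned on this slide can be sketched with the Bradley-Terry model, fitted here by the classic iterative (minorization-maximization) update. The win counts are hypothetical, and the sketch assumes every condition wins at least once; it is an illustration, not the analysis used in the cited studies.

```python
import math


def bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths from a win-count matrix.

    wins[i][j] is the number of times condition i was preferred over j.
    Assumes every condition has at least one win, so strengths stay positive.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(iterations):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        norm = sum(new_p)
        p = [value / norm for value in new_p]
    return p


# Hypothetical preference counts for three conditions A, B, C (10 trials per pair).
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = bradley_terry(wins)
# Log-strengths give an interval-like scale; only differences are meaningful.
scale_values = [math.log(s) for s in strengths]
print(scale_values)
```

The Thurstone-Mosteller model differs mainly in replacing the logistic link by a Gaussian one.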
  25. 25. Pair comparison
      • Large number of comparisons required: N(N-1)/2
      • Suitable mostly for obtaining ground-truth data, e.g. the 3DTV effort of the Video Quality Experts Group (VQEG)
      • ... but subset selection algorithms exist to reduce the number of pairs from O(N²) to O(N)
      • Dykstra's square design method was evaluated recently. An optimal selection criterion for the construction of the square was developed and evaluated (see Jing Li et al., ICIP 2012)
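A small sketch of the pair counts involved. The full design enumerates all N(N-1)/2 pairs. The square design is illustrated here under the assumption that the N stimuli are placed in a square array and only pairs sharing a row or a column are compared; the optimal placement criterion of the cited ICIP 2012 paper is not reproduced.

```python
from itertools import combinations


def full_design(n):
    """All unordered pairs of n stimuli: N(N-1)/2 comparisons."""
    return list(combinations(range(n), 2))


def square_design(side):
    """Pairs of stimuli that share a row or a column of a side x side array."""

    def index(row, col):
        return row * side + col

    pairs = set()
    for line in range(side):
        for first, second in combinations(range(side), 2):
            pairs.add((index(line, first), index(line, second)))  # same row
            pairs.add((index(first, line), index(second, line)))  # same column
    return sorted(pairs)


# 16 stimuli: every pair versus row/column pairs of a 4 x 4 arrangement.
print(len(full_design(16)), len(square_design(4)))
```

For 16 stimuli this gives 48 pairs instead of 120. Which stimuli end up in the same row or column matters in practice, since those are the only pairs observed directly.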
  26. 26. Challenge: measuring long-term QoE attributes.
      Short term: visual experience, naturalness, depth rendering, 2D image quality, depth quantity, visual comfort.
      Long term: visual fatigue, a decrease in performance of the visual system. A measurable criterion that is of particular value for ascertaining long-term adaptive processes of the visual system.
      * "New requirement of subjective video quality assessment methodologies for 3DTV", VPQM 2010, Wei Chen, Jérôme Fournier, Marcus Barkowsky, Patrick Le Callet, Orange Labs R&D - IRCCyN
  27. 27. Challenge: measuring long-term QoE factors. Measurement tools: questionnaires, optometry, EEG - EMG, eye tracking. (Figure labels: end user; visual quality rated Excellent / Good / Fair / Poor / Bad; visual comfort: eye strain, headache, etc.; 3D perception: naturalness, visual experience, presence, etc.)
      * "New requirement of subjective video quality assessment methodologies for 3DTV", VPQM 2010, Wei Chen, Jérôme Fournier, Marcus Barkowsky, Patrick Le Callet, Orange Labs R&D - IRCCyN
  28. 28. Long-term visual fatigue measurement by EEG signal. In [Li 2008], the authors found that in most of the channels, the power of high frequencies (higher than 12 Hz) is stronger in 3D conditions than in 2D conditions, and that it tends to increase as presentation duration increases. But ... the conclusion might differ with the “quality” of the content. (Figure panels: 2D vs 3D; before vs after.)
      [Li 2008]: Li et al., “Measurement of 3D Visual Fatigue Using Event-Related Potential (ERP): 3D Oddball Paradigm”, in 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, pp. 213-216
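To make "power of high frequencies (higher than 12 Hz)" concrete, here is a minimal band-power sketch using a plain discrete Fourier transform. The 12 Hz threshold comes from the slide; the 40 Hz upper limit, the sample rate and the synthetic signal are assumptions for illustration, and this is not the processing chain of [Li 2008].

```python
import cmath
import math


def band_power(signal, sample_rate, f_low, f_high):
    """Sum of squared, length-normalised DFT magnitudes for f_low <= f < f_high."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if f_low <= freq < f_high:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            power += abs(coeff) ** 2 / n ** 2
    return power


# Synthetic one-second "channel": a 10 Hz component plus a weaker 20 Hz component.
rate = 128
samples = [
    math.sin(2 * math.pi * 10 * t / rate) + 0.5 * math.sin(2 * math.pi * 20 * t / rate)
    for t in range(rate)
]
low = band_power(samples, rate, 1, 12)    # below the 12 Hz threshold
high = band_power(samples, rate, 12, 40)  # above the 12 Hz threshold
print(low, high)
```

Comparing `high` between a 2D and a 3D session, or before and after viewing, would mirror the comparisons shown in the figure panels; real EEG data would also need artifact rejection and windowing.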
  29. 29. Impact on task performance (pre and post 3D). Performance measurement: Q & A => eye-tracking measurement. "Influence of autostereoscopic 3D displays on subsequent task performance", M. Barkowsky, P. Le Callet, SPIE Stereoscopic Displays and Applications 2010.
  30. 30. Impact on performance (pre and post 3D). Performance measurement: psychophysics + optometry. Performance is better even though observers report discomfort (questionnaire). "Is visual fatigue changing the perceived depth accuracy on an autostereoscopic display?", M. Barkowsky, R. Cousseau, P. Le Callet, in SPIE Stereoscopic Displays and Applications 2011.
  31. 31. test conditions: the display
  32. 32. Test conditions: the display. In 2D, displays are transparent (or almost: LCD) ... This is far from being the case in S-3D! Issues: luminance rendering and depth rendering. "New requirement of subjective video quality assessment methodologies for 3DTV", VPQM 2010, W. Chen, J. Fournier, M. Barkowsky, P. Le Callet, Orange Labs R&D - IRCCyN.
  33. 33. Luminance rendering: perceived crosstalk with autostereoscopic displays.
      * "Subjective Crosstalk Assessment Methodology for Auto-Stereoscopic Displays", IEEE ICME 2012, L. Xing, J. Xu, K. Skildheim, A. Perkis, T. Ebrahimi
  34. 34. Depth rendering and source content
      • Strong relation between shooting parameters and viewing configuration
        - Shooting parameters: focal length (f), inter-camera baseline (b), convergence distance (d)
        - Visualisation parameters: screen distance (D), screen size (M), inter-ocular distance (B)
        Restituted space = f(shooting parameters, visualisation parameters)
      • Conformity rules have to be defined between the real and the restituted spaces
      • Visual comfort has to be considered with regard to the human visual system: is it comfortable?
      (Figure: real object, cameras with f, b, d; screen of size M; eyes at distance D with separation B.)
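The relation "restituted space = f(shooting parameters, visualisation parameters)" can be sketched with the usual simplified geometry for parallel cameras converged by sensor shift. The function below is an illustrative assumption, not the model of the cited papers; in particular `magnification` (screen width divided by sensor width) is a hypothetical stand-in for the screen size M of the slide.

```python
def restituted_depth(z, f, b, d, magnification, D, B):
    """Perceived distance (metres) of a point shot at distance z (metres).

    f: focal length, b: camera baseline, d: convergence distance,
    D: viewing distance, B: inter-ocular distance, all in metres.
    magnification: screen width / sensor width (hypothetical parameter).
    """
    # Disparity on the sensor, zero for objects at the convergence distance.
    sensor_disparity = f * b * (1.0 / d - 1.0 / z)
    # Parallax on the screen; positive values place the point behind the screen.
    screen_parallax = magnification * sensor_disparity
    if screen_parallax >= B:
        raise ValueError("parallax reaches the eye separation: eyes would diverge")
    return D * B / (B - screen_parallax)


# Hypothetical setup: 50 mm lens, 65 mm baseline, convergence at 3 m,
# screen 40 times wider than the sensor, viewer at 2 m with 65 mm eye separation.
setup = dict(f=0.05, b=0.065, d=3.0, magnification=40.0, D=2.0, B=0.065)
for real_depth in (1.5, 3.0, 6.0):
    print(real_depth, restituted_depth(real_depth, **setup))
```

Objects at the convergence distance land on the screen plane, nearer objects appear in front of it and farther ones behind it. Changing only the baseline `b` stretches or compresses the reproduced depth range, which is the kind of distortion the next slide illustrates.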
  35. 35. Depth distortion and shape distortion: example. Only the camera baseline is changed in a stereoscopic system. (Figure panels: optimal camera baseline; 0.5x camera baseline; 1.5x camera baseline.) See for details: "New stereoscopic video shooting rule based on stereoscopic distortion parameters and comfortable viewing zone", SPIE EI/SDA 2012, W. Chen, J. Fournier, M. Barkowsky, P. Le Callet.
  36. 36. Conformity of stereoscopic images
      • A compliant image
        - Looks natural
        - Avoids (or minimizes) visual fatigue and visual annoyance of observers
      • 3 types of conformity can be defined
        - Total conformity: of shapes and dimensions
        - Relative conformity: of shapes without dimensions
        - Partial conformity: limited to a slice of space
      (Figure: three plots of restituted depth (m) against real depth (m), with curves for focal lengths of 100 mm and 300 mm.)
  37. 37. Beyond 3D ...
  38. 38. High Dynamic Range
      • HDR is filling the gap between the abilities of the human visual system ... and capture and display technologies
      => More realistic visual experience compared to current Low Dynamic Range (LDR) imaging
  39. 39. HDR and visual quality: new issues
      Image capture: almost no native HDR sensor exists, so the capture step is multi-phased. => Many quality issues can arise from such processing (geometric distortions, ghosting, noise, etc.).
      HDR content delivery: need for efficient compression techniques to store HDR data => impact on QoE?
  40. 40. HDR and visual quality: new issues
      HDR and LDR technologies will have to coexist for some time! => HDR to LDR operations (tone mapping, TMO) and LDR to HDR operations (inverse tone mapping, iTMO) will need to be used.
      Impact of TMO and iTMO on QoE? They can even change the original artistic intention. (Figure panels: simple contrast reduction; with local tone mapper.)
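As a rough illustration of what an HDR to LDR operation does, the sketch below contrasts plain linear scaling with a simplified global operator of the L / (1 + L) form associated with Reinhard's tone mapper. The luminance values and the `key` parameter are assumptions for illustration; the local tone mappers shown on the slide are more elaborate.

```python
import math


def linear_tmo(luminances):
    """Scale luminances linearly so that the maximum maps to 1.0."""
    peak = max(luminances)
    return [value / peak for value in luminances]


def reinhard_global_tmo(luminances, key=0.18):
    """Simplified global operator: scale by key / log-average, then L / (1 + L)."""
    log_average = math.exp(
        sum(math.log(1e-6 + value) for value in luminances) / len(luminances)
    )
    scaled = [key * value / log_average for value in luminances]
    return [value / (1.0 + value) for value in scaled]


# Hypothetical scene luminances (cd/m^2) spanning four orders of magnitude.
scene = [1.0, 10.0, 100.0, 10000.0]
print(linear_tmo(scene))
print(reinhard_global_tmo(scene))
```

Linear scaling pushes the dark values close to zero, while the compressive operator keeps them further apart at the cost of flattening highlights. Either choice alters the rendered contrast, which is why the effect on QoE and on artistic intent is an open question here.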
  41. 41. HDR and visual quality: new issues. TMO perceptual evaluation so far: quality and aesthetic appeal studies
      - TMO comparison
        • Rating (features) [Drago2003]: contrast, detail, naturalness
        • Paired comparison (preference) [Kuang2007]
      - Features study [Kuang2007]
        • Highlight details, shadow details, overall contrast, sharpness, colorfulness, artifacts
        • One feature rating can predict the overall rating
      - Comparison with “real life” scenes [Yoshida2005] or HDR display [Ledda2005]
        • Naturalness, overall contrast, overall brightness, detail in dark and bright regions
        • Differences in rating depending on whether the reference is the real scene or an HDR image [Ashikimin2006]
  42. 42. TMO and QoE: visual attention (VA) as an indicator of visual experience
      • VA has many technological applications: video coding / compression, quality estimation, computer vision, etc.
      • Also a great ‘tool’ for artists: guide the user’s discovery of a scene, highlight / hide parts of a scene, convey a message or emotions
      • TMOs should preserve VA behavior: need to study the effect of TMOs on visual attention deployment
      (Figure labels: bottom-up “intention”; top-down “intention”.)
  43. 43. TMO and QoE: a recent study
      • Eye-tracking experiments on 88 images (tone mapped by 11 TMOs)
      • Analysis* shows that different TMOs modify VA to different extents (figure panels: Tumblin, iCam, Linear)
      * M. Narwaria, M. Silva, P. Le Callet and R. Pepion, “Effect of Tone Mapping on Visual Attention Deployment”, SPIE Conference on Applications of Digital Image Processing XXVII (Special Session on High Dynamic Range Imaging), vol. 8499, 2012
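One common way to quantify how much a TMO changes attention deployment is to correlate the fixation density map of the tone-mapped image with that of a reference. The sketch below uses the Pearson correlation coefficient on two small hypothetical maps; it is not claimed to be the analysis of the cited study.

```python
import math


def pearson_cc(map_a, map_b):
    """Pearson correlation between two equally sized 2D maps (lists of rows)."""
    a = [value for row in map_a for value in row]
    b = [value for row in map_b for value in row]
    if len(a) != len(b):
        raise ValueError("maps must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical 2 x 2 fixation density maps: reference and one tone-mapped version.
reference_map = [[0.1, 0.4], [0.3, 0.9]]
tone_mapped_map = [[0.2, 0.5], [0.2, 0.7]]
print(pearson_cc(reference_map, tone_mapped_map))
```

A value near 1 suggests the TMO left the spatial distribution of attention largely intact; lower values point to a redistribution of fixations.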
  44. 44. Take-away messages
      • Quality of (Visual) Experience is not only video quality
      • The environment for subjective experiments needs to be redefined
      • Subjective measurement methods need to be refined
      • Multiscale methods need to be validated
      • New test methodologies are required: continuous measurements, non-intrusive measurements, ...
      • Objectively measuring the observer’s response with psychophysical devices may be required
  45. 45. VQEG (Video Quality Experts Group)
      3DTV Group activities:
        - Impact of viewing conditions on quality
        - Methodologies for subjective QA
      HDR Group, one mission: develop methods for assessing the quality of HDR video.
  46. 46. Next challenges: Ultra HD
      Higher resolutions, Ultra-HD 4K/8K, “retina” displays:
      • Exceeding the visual acuity of standard observers
      • Higher frame rates leading to fluent motion reconstruction
      How to measure content quality / added value?
      Loss of reference system in reality: how to avoid simulator sickness?