A Metric for No-Reference Video Quality Assessment for HD TV Delivery Based on Saliency Maps
Presentation Transcript

  • 1. A METRIC FOR NO-REFERENCE VIDEO QUALITY ASSESSMENT FOR HD TV DELIVERY BASED ON SALIENCY MAPS
    H. BOUJUT*, J. BENOIS-PINEAU*, T. AHMED*, O. HADAR** & P. BONNET***
    *LaBRI UMR CNRS 5800, University of Bordeaux, France
    **Communication Systems Engineering Dept., Ben Gurion University of the Negev, Israel
    ***Audemat WorldCast Systems Group, France
    ICME 2011 – Workshop on Hot Topics in Multimedia Delivery (HotMD’11)
    2011-07-11
  • 2. Overview
    Introduction
    Focus Of Attention and Saliency Maps
    Our approach: the Weighted Macro-Block Error Rate (WMBER), a no-reference video quality metric based on saliency maps
    Prediction of subjective quality metrics from objective quality metrics
    Evaluation and results
    Conclusion and future work
  • 3. Introduction
    Motivation
    VQA for HD broadcast applications
    Measure the influence of transmission loss on perceived quality
    Video quality assessment protocol
    Full Reference (FR)
    SSIM (Z. Wang, A. Bovik)
    A novel perceptual metric for video compression (A. Bhat, I. Richardson) PCS’09
    Evaluation of temporal variation of video quality in packet loss networks (C. Yim, A. C. Bovik, 2011) Image Communication 26 (2011)
    Reduced Reference (RR)
    A Convolutional Neural Network Approach for Objective Video Quality Assessment (P. Le Callet, C. Viard-Gaudin, D. Barba) IEEE Transactions on Neural Networks 17.
    No Reference (NR)
    No-reference image and video quality estimation: Applications and human-motivated design (S. Hemami, A. Reibman) Image Communication 25 (2010)
    In this work:
    NR VQA with visual saliency in the H.264/AVC framework
    Contributions:
    Visual saliency map extraction during the compression process
    WMBER NR quality metric
    Prediction of subjective quality metrics from objective quality metrics
  • 4. Focus of Attention and Saliency maps
    FOA is mostly attracted by salient areas which stand out from the visual scene.
    FOA moves sequentially over the salient areas.
    Salient stimuli are mainly due to:
    High color contrast
    Motion
    Edge orientation
    [Figure: an original frame and its saliency map, Tractor sequence (TUM/VQEG)]
  • 5. Saliency maps (1/2)
    Several methods for saliency map extraction already exist in the literature.
    All methods work in the same way [O. Brouard, V. Ricordel and D. Barba, 2009], [S. Marat, et al., 2009]:
    Extraction of the spatial saliency map (static pathway)
    Extraction of the temporal saliency map (dynamic pathway)
    Fusion of the spatial and the temporal saliency maps (fusion)
    [Figure: temporal, spatial, and fused spatio-temporal saliency maps]
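    A minimal sketch of this two-pathway organisation follows. The feature extraction in the cited models is far richer (retinal filtering, Gabor banks, global-motion compensation); the gradient magnitude and a precomputed motion-magnitude field used here are purely illustrative stand-ins.

```python
import numpy as np

def spatial_saliency(frame_gray):
    """Static pathway (illustrative): gradient magnitude as a crude
    stand-in for the contrast/orientation features of the cited models."""
    gy, gx = np.gradient(frame_gray.astype(np.float64))
    s = np.hypot(gx, gy)
    return s / (s.max() + 1e-12)          # normalise to [0, 1]

def temporal_saliency(motion_mag):
    """Dynamic pathway (illustrative): magnitude of the residual motion
    after global-motion compensation, normalised to [0, 1]."""
    return motion_mag / (motion_mag.max() + 1e-12)

# The fusion step combining both pathways is sketched after slide 8.
```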
  • 6. Saliency maps (2/2)
    In this work we re-used the saliency map extraction method published at IS&T Electronic Imaging 2011:
    Based on the saliency map model of O. Brouard, V. Ricordel and D. Barba.
    Uses partial decoding of the H.264 stream to reach real-time performance.
    A fusion method to combine spatial and temporal saliency maps was proposed there.
    Here we propose a new fusion method.
  • 7. Saliency map fusion (1/2)
    We use the multiplication fusion method and the logarithm fusion method, both weighted with a 5-visual-degree 2D Gaussian 2DGauss(s), for comparison with our proposed fusion method.
    [Equations: multiplication and logarithm fusions producing the spatio-temporal saliency map]
  • 8. Saliency map fusion (2/2)
    To produce the spatio-temporal saliency map, we also propose a new fusion method (sketched below).
    Similar fusion properties to the previous methods.
    Gives more weight to regions which have both:
    High spatial saliency
    High temporal saliency
    Does not produce a null spatio-temporal saliency when the temporal saliency is very low.
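    The fusion formulas themselves were images on the slides and did not survive the transcript. Below is a minimal Python sketch of plausible forms, under stated assumptions: the multiplicative and logarithmic fusions and the Gaussian foveal weight follow common formulations from the saliency literature, and the square-sum form is chosen only to match the properties listed above. None of these are the authors' exact equations, and `sigma_px` (the mapping of 5 visual degrees to pixels) is illustrative.

```python
import numpy as np

def gauss2d(shape, sigma_px):
    """Centred 2D Gaussian weight, standing in for the 5-visual-degree
    2DGauss(s) foveal weighting (sigma_px is an assumed pixel scale)."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w]
    return np.exp(-((x - w / 2.0) ** 2 + (y - h / 2.0) ** 2) / (2.0 * sigma_px ** 2))

def fuse_mul(s_sp, s_t, g):
    # Multiplication fusion: null wherever either map is null.
    return g * s_sp * s_t

def fuse_log(s_sp, s_t, g):
    # Logarithm fusion (assumed form): compresses the dynamic range.
    return g * np.log1p(s_sp + s_t)

def fuse_square(s_sp, s_t, g):
    # Square-sum fusion (assumed form): the cross term emphasises regions
    # that are salient in BOTH maps, while the squared terms keep the
    # result non-null even when the temporal saliency is very low.
    return g * (s_sp ** 2 + s_t ** 2 + s_sp * s_t)
```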
  • 9. WMBER Vq metric based on saliency maps (1/3)
    Weighted Macro Block Error Rate (WMBER) is a No Reference metric
    Visual attention concentrates on the regions highlighted by the saliency map.
    Video transmission artifacts may change the saliency map.
    We therefore propose to extract the saliency maps directly from the received, disturbed video stream.
    WMBER also relies on MB error detection in the bit stream
    DC/AC and MV error detection
    Error propagation according to the H.264 decoding process
    WMBER is based on:
    MB error detection
    Weighted by Saliency maps
    [Figure: an original transmission error and its propagation through the H.264 decoding process]
  • 10. WMBER Vq metric based on saliency maps (2/3)
    [Block diagram: the decoded frame's gradient energy is multiplied by the MB error map and the saliency map, then summed; dividing by the saliency-weighted gradient energy of the whole frame (GME) yields the WMBER.]
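    Reading the block diagram above, a per-frame WMBER can be sketched as the saliency-weighted gradient energy of the erroneous macroblocks, normalised by the saliency-weighted gradient energy of the whole frame (the GME). This is an interpretation of the diagram, not the paper's verbatim definition; the 16x16 macroblock size and the upsampling of the error map are assumptions.

```python
import numpy as np

def frame_wmber(frame_gray, mb_error_map, saliency, mb=16):
    """Illustrative per-frame WMBER.
    frame_gray   : decoded luma plane, 2D float array
    mb_error_map : boolean array with one flag per macroblock
                   (after error propagation), True = erroneous
    saliency     : spatio-temporal saliency map, same size as the frame
    """
    gy, gx = np.gradient(frame_gray.astype(np.float64))
    weighted = saliency * (gx ** 2 + gy ** 2)     # saliency-weighted gradient energy
    # Upsample the per-macroblock error flags to pixel resolution.
    err = np.kron(mb_error_map.astype(np.float64), np.ones((mb, mb)))
    err = err[: frame_gray.shape[0], : frame_gray.shape[1]]
    gme = weighted.sum() + 1e-12                  # whole-frame weighted energy (GME)
    return float((weighted * err).sum() / gme)    # in [0, 1]
```

    The sequence-level score is then simply the mean of the per-frame values, as stated on the next slide.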
  • 11. WMBER Vq metric based on saliency maps (3/3)
    When MB errors cover the whole frame and the energy of the gradient is high:
    WMBER is high (near 1.0)
    When there are no MB errors or the energy of the gradient is low:
    WMBER is low (near 0.0)
    The WMBER of a video sequence is the average WMBER of the frames.
  • 12. Subjective Experiment
    Conducted according to:
    VQEG Report on Validation of the Video Quality Models for High Definition Video Content (June 2010).
    ITU-R Rec. BT.500-11
    20 HDTV (1920x1080 pixels) video sources (SRC) from:
    The Open Video Project: www.open-video.org
    NTIA/ITS
    TUM/Taurus Media Technik
    French HDTV
    Measure the influence of transmission loss on perceived quality
    2 loss models:
    IP model (ITU-T Rec. G.1050)
    RF (Radio Frequency) model
    8 loss profiles were compared
    160 Processed Video Streams (PVS)
    35 participants took part
    MOS values were computed for each SRC and PVS.
    [Photo: the experiment room]
  • 13. Subjective experiment results
  • 14. Prediction of subjective quality metrics from objective quality metrics
    We propose to use a supervised learning method to predict MOS values from WMBER or MSE.
    This prediction method is called the similarity-weighted average.
    Requires a training data set of n known pairs (x_i, y_i) to predict y from x.
    Here the (x_i, y_i) pairs are WMBER or MSE values associated with MOS values.
    y is the predicted MOS for a given WMBER/MSE value x.
    The prediction is performed using a weighted mean (reconstructed below):
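    The formula itself was an image on the slide and is missing from the transcript. A similarity-weighted average consistent with the description ("weighted mean classifier") is the classical weighted mean below; the Gaussian similarity kernel w is an assumed choice, not necessarily the one used by the authors.

```latex
\hat{y}(x) = \frac{\sum_{i=1}^{n} w(x, x_i)\, y_i}{\sum_{i=1}^{n} w(x, x_i)},
\qquad
w(x, x_i) = \exp\!\left(-\frac{(x - x_i)^2}{2\sigma^2}\right)
```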
  • 15. Evaluation and results
    We compare 6 objective video quality metrics:
    MSE
    WMBER using the 5 v/deg 2D Gaussian (WMBER2DGauss)
    WMBER using the multiplication fusion (WMBERmul)
    WMBER using the log sum fusion (WMBERlog)
    WMBER using the square sum fusion (WMBERsquare)
    WMBER using the spatial saliency map (WMBERsp)
    All metrics are computed for each of the 160 PVS + 20 SRC.
    6 data sets are built:
    180 pairs Objective Metric/MOS
    Each data set is split into 2 equal parts:
    Training set and Evaluation set
    The Pearson Correlation Coefficient (PCC) is used for the evaluation, with cross-validation (see the sketch below).
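    A minimal sketch of this evaluation protocol, reusing the similarity-weighted average from slide 14; the kernel width sigma and the random half split are illustrative choices.

```python
import numpy as np

def predict_mos(x, train_x, train_y, sigma=0.05):
    """Similarity-weighted average of the training MOS values (slide 14)."""
    w = np.exp(-((x - train_x) ** 2) / (2.0 * sigma ** 2))
    return float((w * train_y).sum() / (w.sum() + 1e-12))

def evaluate_pcc(metric, mos, seed=0):
    """metric, mos: 1D numpy arrays of the 180 objective-metric/MOS pairs.
    Split into equal training and evaluation halves, predict MOS on the
    evaluation half, and report the PCC."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(metric))
    train, test = idx[: len(idx) // 2], idx[len(idx) // 2:]
    preds = [predict_mos(metric[i], metric[train], mos[train]) for i in test]
    return float(np.corrcoef(preds, mos[test])[0, 1])
```

    Repeating this over several random splits and averaging the PCC gives the cross-validation mentioned on the slide.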
  • 16. Conclusion and Future Work
    We were interested in the problem of objective video quality assessment over lossy channels.
    We followed the recent trends in the definition of spatio-temporal saliency maps for FOA.
    A new no-reference metric: the WMBER based on saliency maps.
    We brought a new solution for saliency map fusion: the square sum fusion.
    We proposed a supervised learning method, the similarity-weighted average, to predict the subjective quality metric (MOS) from objective quality metrics.
    It gives better results than the conventional approach: polynomial fitting.
    We intend to improve the saliency model to better consider:
    Transmission artifacts
    Masking effect in the neighborhood of high saliency areas.
    We plan to evaluate the WMBER on the IRCCyN/IVC Eyetracker SD 2009_12 Database.
  • 17. Thank you for your attention. Any questions?