Abstract

The study explored whether commonly used image quality metrics could be improved by incorporating visual fixation data from eye-tracking experiments. Five quality metrics were augmented using two sets of eye-tracking data: one collected with no task, and one collected while participants were asked to judge quality. Most metrics showed improved correlation with subjective ratings when weighted by the no-task fixation data. While the improvements were generally small and not statistically significant, the results suggest that how people naturally view images provides information that could supplement current automated quality metrics. Fixation data may, however, be most useful for very low quality images.
Can Visual Fixation Patterns Improve Image Quality?
Eric C. Larson, Cuong Vu, and Damon M. Chandler, Members, IEEE
Image Coding and Analysis Lab, Department of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078
Introduction

A computer cannot judge image quality. Although current algorithms have made great strides in predicting human ratings of fidelity, we still do not have a foolproof method of judging the quality of distorted images. This experiment explores whether the missing link in image quality assessment is the need to know where humans tend to look in an image.

Five common metrics of image fidelity were augmented using two sets of eye fixation data. The first set was obtained under task-free viewing conditions, and the second was obtained when viewers were asked specifically to “judge image quality.” We then compared the augmented metrics to subjective ratings of the images.

We then asked:

1. Can existing fidelity metrics be improved using eye fixation data?
2. If so, is it more appropriate to use eye fixations obtained under no-task viewing conditions or when viewers were asked to assess quality?
3. Can PSNR be augmented using eye fixation data to perform as well as SSIM, VIF, VSNR, or WSNR?
4. Using a fixation-based segmentation, can we quantify how important each segmented region is for predicting human subjective ratings?

Results

Improvement in correlation with subjective ratings when each metric is augmented with fixation data:

Metric   No Task Improve   Tasked Improve
PSNR     0.0137            0.0045
SSIM     0.0344            0.0032
VIF      0.0794            0.0292
VSNR     0.0022            0.0100
WSNR     0.0096            0.0038

From the bar graph, it can be seen that VIF shows the most improvement in correlation and that WSNR ends with the highest correlation. Also notice that the no-task condition regions were the most useful for augmenting the metrics.

The graphs of the “correlation space” show that the highest correlations generally appear when weighting the region with the highest fixations most, together with some weight on the region with a mild number of fixations. This is true in all cases except VSNR, where most of the weight should be placed in the regions where people do not look (although VSNR has the least to gain from fixations).

None of the improvements is statistically significant over the un-weighted metric, except when weighting VIF by the no-task condition fixations.

Residual skewness, residual kurtosis, and F-statistic for each metric under the two fixation conditions:

Tasked Condition
Metric   Residual Skewness   Residual Kurtosis   F-Statistic
PSNR     0.9943              0.6628              -0.1136
WSNR     0.9920              0.8414               0.5133
VSNR     1.1030              1.3209               2.4445
SSIM     0.9724              0.9383               0.7314
VIF      0.8384              1.5202               2.9391

No Task Condition
Metric   Residual Skewness   Residual Kurtosis   F-Statistic
PSNR     0.9500              0.7149              -0.0810
WSNR     0.9433              0.9463               0.9048
VSNR     0.9730              1.2734               2.1786
SSIM     0.8274              0.8110               0.4574
VIF      0.6285              1.4874               2.7594
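The correlations and improvement values discussed above compare metric predictions against subjective ratings. As an illustrative sketch (not the authors' code; the score arrays are hypothetical, and Pearson correlation is assumed), the evaluation step can be written as:

```python
import numpy as np

def pearson_corr(predictions, ratings):
    """Pearson linear correlation between metric outputs and subjective ratings."""
    predictions = np.asarray(predictions, dtype=float)
    ratings = np.asarray(ratings, dtype=float)
    return float(np.corrcoef(predictions, ratings)[0, 1])

# Hypothetical scores: baseline metric vs. fixation-weighted metric on the same images.
ratings = [3.1, 4.0, 2.2, 4.8, 3.5]         # subjective quality ratings
baseline = [30.2, 33.9, 27.1, 36.0, 31.0]   # un-weighted metric predictions
weighted = [30.8, 34.5, 26.0, 37.1, 31.9]   # fixation-weighted predictions

# An "improvement" value is the difference between the two correlations.
improvement = pearson_corr(weighted, ratings) - pearson_corr(baseline, ratings)
```

The improvement values in the table above are differences of this kind, computed over the full image set rather than the toy arrays shown here.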
Conclusions

A computational experiment was presented that segmented images based upon eye fixation data and augmented existing image fidelity metrics with the segmentation regions. It was shown that:

1. Existing fidelity metrics can be positively augmented using fixation data, with SSIM and VIF showing the greatest improvements (for common-sense weighting).
2. The no-task fixation condition showed the greatest improvements for all metrics except VSNR.
3. Under no-task conditions, the primary region of eye fixation corresponds to the most important region for PSNR, SSIM, and VIF. For VSNR, the non-ROI is the most important region.
4. PSNR can be augmented to perform better than original VIF, but not SSIM, VSNR, or WSNR (on this image set). When all metrics are augmented, PSNR has the worst performance.

Ultimately, the best way to augment metrics using ROI information, and how to cluster eye-tracking data in the most meaningful manner for image fidelity assessment, remain open questions. However, it is clear from this experiment and others (for example, see [5][6]) that fixation and ROI data are less important for fidelity assessment than expected.

Future Work

Although fixation data proved ineffective when working with images across the full range of quality, it was observed over the course of the experiment that region-of-interest information might be useful for very low quality images.
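The conclusions above discuss augmenting PSNR with fixation regions. A minimal sketch of what a region-weighted PSNR could look like, assuming binary masks for the three fixation regions and weights that sum to one (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def weighted_psnr(ref, dist, masks, weights, peak=255.0):
    """PSNR whose MSE pools squared error per fixation region, scaled by weights.

    masks   -- three boolean arrays (1st ROI, 2nd ROI, non-ROI) partitioning the image
    weights -- three non-negative weights, assumed to sum to one (each region non-empty)
    """
    se = (np.asarray(ref, dtype=float) - np.asarray(dist, dtype=float)) ** 2
    # Weighted MSE: each region's mean squared error scaled by its weight.
    mse = sum(w * se[m].mean() for w, m in zip(weights, masks))
    return 10.0 * np.log10(peak ** 2 / mse)
```

With uniform weights (1/3 each) and equal error in every region, this reduces to ordinary PSNR; skewing the weights toward the primary fixation region mimics the "common sense" weighting discussed above.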
Methods

Two types of visual fixation data were used. The first set of fixations was collected when the viewers were given no task (i.e., they simply looked at the images). The second set was collected when the viewers were asked to assess image fidelity.

The resulting eye-tracking data was used to cluster images from the LIVE database [1] into three regions: the regions where viewers gazed (1) with high frequency, (2) with low frequency, and (3) not at all.

Once the images were segmented using fixation data, we wanted to investigate how much each region contributed to the subjective quality of the image, and to use this to augment five image quality metrics (PSNR, WSNR, SSIM [2], VIF [3], and VSNR [4]).

Specifically, we (1) weighted the three segmented regions in the images, (2) used the metrics to calculate a new weighted quality of the image, and (3) calculated the correlation between the new weighted quality predictions and subjective ratings of quality. The weighted metric for an image was computed as

E_tot = α_1st-ROI · E_1st-ROI + α_2nd-ROI · E_2nd-ROI + α_non-ROI · E_non-ROI

By adjusting the weights (and constraining them to sum to one), we were able to examine the “correlation space” for all possible weighting combinations. This was done for both sets of fixation data (i.e., “Tasked” and “No Task”).
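The weight sweep described above can be sketched as a grid search over the simplex α1 + α2 + α3 = 1. This is illustrative only; the per-region error values and step size are hypothetical:

```python
import numpy as np

def correlation_space(region_errors, ratings, step=0.1):
    """Correlation with subjective ratings for every weighting on the simplex.

    region_errors -- (n_images, 3) array: per-image error in the 1st-ROI,
                     2nd-ROI, and non-ROI regions
    ratings       -- length-n_images subjective quality ratings
    Returns a list of ((a1, a2, a3), correlation) pairs."""
    region_errors = np.asarray(region_errors, dtype=float)
    ratings = np.asarray(ratings, dtype=float)
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    space = []
    for a1 in grid:
        for a2 in grid:
            a3 = 1.0 - a1 - a2
            if a3 < -1e-9:          # outside the simplex
                continue
            # Pool the three regional errors with the current weights.
            pooled = region_errors @ np.array([a1, a2, max(a3, 0.0)])
            r = float(np.corrcoef(pooled, ratings)[0, 1])
            space.append(((a1, a2, max(a3, 0.0)), r))
    return space

# Hypothetical per-region errors for four images; the first region tracks ratings.
errors = [[1, 1, 3], [2, 4, 1], [3, 2, 4], [4, 3, 2]]
best = max(correlation_space(errors, [1, 2, 3, 4]), key=lambda t: t[1])
```

The resulting list can be plotted over the weight simplex to give the kind of "correlation space" visualization examined in the poster.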
References

[1] H. R. Sheikh, Z. Wang, A. C. Bovik, and L. K. Cormack, “Image and video quality assessment research at LIVE.” Online: http://live.ece.utexas.edu/research/quality/.
[2] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, 2004.
[3] H. R. Sheikh and A. C. Bovik, “Image information and visual quality,” IEEE Trans. Image Process., vol. 15, no. 2, pp. 430–444, 2006.
[4] D. M. Chandler and S. S. Hemami, “VSNR: a wavelet-based visual signal-to-noise ratio for natural images,” IEEE Trans. Image Process., vol. 16, no. 9, 2007.
[5] A. Ninassi, O. Le Meur, P. Le Callet, and D. Barba, “Does where you gaze on an image affect your perception of quality? Applying visual attention to image quality,” in IEEE ICIP, 2007.
[6] E. C. Larson and D. M. Chandler, “Unveiling relationships between regions of interest and image fidelity metrics,” in Visual Communications and Image Processing, 2007.