
SaltiNet: The Temporal Dimension of Visual Attention Models


https://imatge.upc.edu/web/publications/saltinet-scan-path-prediction-360-degree-images-using-saliency-volumes

We introduce SaltiNet, a deep neural network for scanpath prediction trained on 360-degree images. The first part of the network is a model trained to generate saliency volumes; its parameters are learned by back-propagating a binary cross-entropy (BCE) loss computed over downsampled versions of the saliency volumes. Sampling strategies over these volumes are then used to generate scanpaths over the 360-degree images. Our experiments show the advantages of using saliency volumes, and how they can be used for related tasks.

Awarded at the Salient360! IEEE ICME Grand Challenge 2017 for Best Scan-Path Prediction and Best Scan-Path Prediction Student Prize

Published in: Data & Analytics


  1. SaltiNet: The Temporal Dimension of Visual Attention Models. Marc Assens, Xavi Giro, Kevin McGuinness, Noel O’Connor
  2. Why use deep learning? Low-level features (contours, shapes, colors, textures) vs. high-level features (keywords, descriptions, classification, ontologies): deep models learn to bridge the semantic gap between data and human intelligence. Matthias Kümmerer et al., DeepGaze I: Boosting saliency prediction with feature maps trained on ImageNet (https://arxiv.org/abs/1411.1045)
  3. Scanpaths of different users can be very different (Worker 1 vs. Worker 2 vs. Worker 3). Pitfall: scanpaths are very random.
  4. Why this problem? Using an MSE loss results in predicting an averaged fixation ➡ the center, rather than any of the possible future fixations. Mathieu et al., Deep multi-scale video prediction beyond mean square error, ICLR 2016 (https://arxiv.org/abs/1511.05440)
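The MSE-averaging pitfall on the slide above can be checked numerically. A minimal NumPy sketch (the coordinates and the helper name `expected_mse` are illustrative, not from the paper):

```python
import numpy as np

# Two equally likely future fixations on opposite sides of the image
# (illustrative coordinates, not taken from the paper).
fixations = np.array([[0.2, 0.5], [0.8, 0.5]])

def expected_mse(pred):
    """Expected MSE of one predicted fixation against the candidate futures."""
    return float(np.mean(np.sum((fixations - pred) ** 2, axis=1)))

# The prediction minimizing expected MSE is the mean of the candidates,
# i.e. the image center -- a location where no fixation actually occurs.
center = fixations.mean(axis=0)
print(center, expected_mse(center), expected_mse(fixations[0]))
```

The center beats either true fixation under expected MSE, which is exactly why an MSE-trained model drifts toward predicting the center.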
  5. Options: a less stochastic representation (which could also help in other areas), or adversarial training.
  6. Scanpaths: individual scanpaths (Scanpath 1, Scanpath 2, Scanpath 3) are not very consistent across users; the saliency map and the saliency volume are consistent!
  7. Saliency Volumes: a representation with width, height, and time dimensions.
  8. Saliency Volumes: start from an empty width × height × time grid.
  9. Saliency Volumes: place the fixation points (x, y, t) in the grid.
  10. Saliency Volumes: convolve the fixation points with a multivariate Gaussian.
  11. Idea 1: saliency volumes. Quantize fixations in time, then convolve with a multivariate Gaussian.
  12. Idea 1: saliency volumes. Quantize fixations in time, then convolve with a multivariate Gaussian. Nature dataset (40 workers).
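The construction described on these slides (quantize fixations in time, then convolve with a multivariate Gaussian) can be sketched as follows. This is an illustrative implementation, not the authors' code; the function name `build_saliency_volume`, the bin counts, and the sigma values are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_saliency_volume(fixations, width=48, height=24, t_bins=20,
                          sigma=(1.0, 2.0, 2.0)):
    """Quantize (x, y, t) fixation points into a 3D histogram and smooth it
    with a multivariate Gaussian to obtain a saliency volume.

    `fixations` holds (x, y, t) tuples with x, y, t normalized to [0, 1).
    `sigma` gives the Gaussian std-dev over (time, height, width) bins.
    """
    volume = np.zeros((t_bins, height, width), dtype=np.float32)
    for x, y, t in fixations:
        ti = min(int(t * t_bins), t_bins - 1)
        yi = min(int(y * height), height - 1)
        xi = min(int(x * width), width - 1)
        volume[ti, yi, xi] += 1.0
    # Smooth the point mass with a separable Gaussian over (time, y, x).
    volume = gaussian_filter(volume, sigma=sigma)
    # Normalize each temporal slice so it can be sampled as a distribution.
    volume /= volume.sum(axis=(1, 2), keepdims=True) + 1e-8
    return volume
```

Each temporal slice then acts as a probability map over image locations for that time step, which is what the sampling strategies later draw from.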
  13. Saliency Volumes
  14. Our model for scanpath prediction [600 × 300]
  15. Saliency volume prediction: SalNet (ported to Keras: https://github.com/massens/salnet-keras). Pan et al., Shallow and Deep Convolutional Networks for Saliency Prediction, CVPR 2016 (https://arxiv.org/abs/1603.00845)
  16. Saliency volume prediction: the network ends with a volume of dimension [600 × 300 × 20].
  17. Saliency volume prediction: sampling strategies turn the predicted volume into scanpaths.
  18. Training: SalNet is first trained to predict saliency volumes, then transfer-learned to predict 360° saliency volumes. Datasets: 10,000 train / 5,000 val / 5,000 test for saliency volumes; Salient360! dataset: 40 train / 25 test.
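Per the abstract, the volume-prediction branch is trained with a binary cross-entropy loss over downsampled saliency volumes. A minimal NumPy sketch of that loss (the name `bce_loss` and the array shapes are illustrative):

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy averaged over all voxels of a (downsampled)
    saliency volume. `pred` and `target` are same-shape arrays in [0, 1]."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))
```

In the actual Keras training loop this would be the framework's built-in binary cross-entropy applied per voxel; the sketch only makes the quantity being minimized explicit.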
  19. Sampling strategies: limiting the distance between fixations.
  20. Sampling strategies: limiting the distance between fixations.
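The distance-limiting strategy on these slides can be sketched as: sample one fixation per temporal slice of the volume, zeroing out candidate locations farther than a threshold from the previous fixation. An illustrative sketch, not the paper's exact weighting; `sample_scanpath` and `max_dist` are assumed names:

```python
import numpy as np

def sample_scanpath(volume, max_dist=0.2, rng=None):
    """Sample a scanpath from a (t_bins, height, width) saliency volume,
    suppressing candidates farther than `max_dist` (in normalized image
    coordinates) from the previous fixation. Returns (x, y) pairs in [0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    t_bins, height, width = volume.shape
    ys, xs = np.mgrid[0:height, 0:width]
    ys, xs = ys / height, xs / width  # normalized coordinate grids
    prev, path = None, []
    for t in range(t_bins):
        probs = volume[t].astype(np.float64)
        if prev is not None:
            dist = np.sqrt((xs - prev[0]) ** 2 + (ys - prev[1]) ** 2)
            probs = np.where(dist > max_dist, 0.0, probs)
        if probs.sum() <= 0:
            # Fall back to the unconstrained slice if everything was masked.
            probs = volume[t].astype(np.float64) + 1e-8
        probs /= probs.sum()
        idx = rng.choice(height * width, p=probs.ravel())
        y, x = divmod(idx, width)
        prev = (x / width, y / height)
        path.append(prev)
    return path
```

Tightening or loosening `max_dist` trades off plausibility of saccade lengths against how faithfully the path follows the volume's salient regions.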
  21. Results
  22. Qualitative Results: Scanpath prediction
  23. Qualitative Results: Scanpath prediction
  24. Qualitative Results: Saliency volume prediction
  25. Qualitative Results: Saliency volume prediction
  26. Dissemination: paper submission at the ICCV 2017 Workshop; open-source code and models; arXiv preprint: http://bit.ly/2rYavXf
  27. Thank you! Kevin McGuinness, Xavi Giro-i-Nieto, Noel E. O’Connor, Amaia Salvador, Manel Baradad, Albert Gil
