Can you see it?
Annotating Image Regions
based on Users' Gaze
Information
Ansgar Scherp, Tina Walber, Steffen Staab

Technical University of Vienna
October 2012
Idea




Benefiting from Eye-Tracking Information for Image Region Annotation

 A. Scherp, T. Walber, S. Staab – Identifying Objects in Images   Slide 2 of 40
Eye-tracking Hardware
[Photo: Tobii X60 eye tracker]
Recorded Data
[Diagram: gaze path with saccades and fixations]
Scenario: Image Tagging
[Example street scene with region tags: tree, girl, car, store, people, sidewalk]
 Find specific objects in images
 Analyzing the user's gaze path
Investigation in 3 Steps



                 3 Interactive Tagging Application

                 2 Gaze + Automatic Segments

                 1 Gaze + Manual Regions


1st Step


1. Best fixation measure to find the correct
   image region given a specific tag?



2. Can we differentiate two regions in the
   same image?


3 Steps Conducted by Users




 Look at red blinking dot
 Decide whether tag can be seen (“y” or “n”)
Dataset
 LabelMe community images
    Manually drawn polygons
    Regions annotated with tags
 182,657 images (August 2010)
http://labelme.csail.mit.edu/Release3.0/


 High-quality segmentation and annotation
 Used as ground truth

Dataset (continued)
[Example LabelMe image with annotated polygon regions]
Experiment Images and Tags
 Randomly selected images from LabelMe
 Each image: at least two regions, 1000 × 700 px

 Created three sets of 51 images each
 Assigned a tag to each image

 Tags are either “true” or “false”
   “true” → object described by tag can be seen
   “false” → object cannot be seen on the image
 “False” tags keep subjects concentrated during the experiment
Subjects & Experiment System
 30 subjects
   21 male, 9 female (age: 22-45, Ø=28.7)
   Undergrads (10), PhD (17), office clerks (3)


 Experiment system
    Simple web page in Internet Explorer
    Standard notebook, resolution 1680x1050
    Tobii X60 eye-tracker (60 Hz, 0.5° accuracy)

Conducting the Experiment
 Each user looked at 51 tag-image-pairs
 First tag-image-pair dismissed

 94.6% correct answers
 Roughly equal for true/false tags
 ~2.8s avg. until decision (true), ~3.8s avg. (false)

 Users felt comfortable during the experiment
  (avg.: 4.4, SD: 0.75)
   Eye tracker did not much influence comfort
Pre-processing of Eye-tracking Data
 Obtained 799 gaze paths from 30 users where
   Image has “true” tag assigned
   Users gave correct answers

 Fixation extraction
   Tobii Studio's velocity & distance thresholds
   Fixation: focus on particular point on screen

 Requirement: at least one fixation inside or near the correct region
 656 gaze paths fulfill this requirement (82%)
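The velocity-threshold style of fixation extraction can be sketched as follows (the threshold values and the simple centroid logic are our assumptions; Tobii Studio's actual I-VT filter differs in detail):

```python
# Sketch of velocity-threshold fixation extraction (I-VT style).
# Hypothetical parameters; Tobii Studio's actual filter differs.
VELOCITY_THRESHOLD = 30.0   # px per sample; below this, gaze is "fixating"
MIN_FIXATION_SAMPLES = 5    # ~83 ms at 60 Hz

def extract_fixations(samples):
    """samples: list of (x, y) gaze points recorded at 60 Hz.
    Returns fixations as (centroid_x, centroid_y, n_samples)."""
    fixations, current = [], []
    for i, (x, y) in enumerate(samples):
        if current:
            px, py = samples[i - 1]
            speed = ((x - px) ** 2 + (y - py) ** 2) ** 0.5
            if speed > VELOCITY_THRESHOLD:       # saccade: close the fixation
                if len(current) >= MIN_FIXATION_SAMPLES:
                    cx = sum(p[0] for p in current) / len(current)
                    cy = sum(p[1] for p in current) / len(current)
                    fixations.append((cx, cy, len(current)))
                current = []
        current.append((x, y))
    if len(current) >= MIN_FIXATION_SAMPLES:     # flush the trailing fixation
        cx = sum(p[0] for p in current) / len(current)
        cy = sum(p[1] for p in current) / len(current)
        fixations.append((cx, cy, len(current)))
    return fixations
```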
Analysis of Gaze Fixations (1)
 Applied 13 fixation measures on the 656 paths
  (2 new, 7 standard Tobii measures, 4 from the literature)

 Fixation measure: function on users' gaze paths
 Calculated for each image region, over all users
  viewing the same tag-image-pair




Considered Fixation Measures
Nr  Name                     Favorite region r                              Origin
1   firstFixation            No. of fixations before 1st on r               Tobii
2   secondFixation           No. of fixations before 2nd on r               [13]
3   fixationsAfter           No. of fixations after last on r               [4]
4   fixationsBeforeDecision  fixationsAfter, but before decision            New
5   fixationsAfterDecision   fixationsBeforeDecision and after              New
6   fixationDuration         Total duration of all fixations on r           Tobii
7   firstFixationDuration    Duration of first fixation on r                Tobii
8   lastFixationDuration     Duration of last fixation on r                 [11]
9   fixationCount            Number of fixations on r                       Tobii
10  maxVisitDuration         Max time from first fixation until outside r   Tobii
11  meanVisitDuration        Mean time from first fixation until outside r  Tobii
12  visitCount               No. of fixations until outside r               Tobii
13  saccLength               Saccade length before fixation on r            [6]
Analysis of Gaze Fixations (2)




 For every image region (b) the fixation
  measure is calculated over all gaze paths (c)
 Results are summed up per region
 Regions ordered according to fixation measure
 If favorite region (d) and tag (a) match, result is
  true positive (tp), otherwise false positive (fp)
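The evaluation step above can be sketched as follows (the measure interface and the region representation are hypothetical simplifications):

```python
# Sketch of the tp/fp evaluation per tag-image pair.
# `measure` stands for one of the 13 fixation measures; the region ids
# and the ground-truth mapping are hypothetical.

def favorite_region(gaze_paths, regions, measure):
    """Sum the measure per region over all users' gaze paths and
    return the region with the highest score (the favorite region)."""
    scores = {r: 0.0 for r in regions}
    for path in gaze_paths:
        for r in regions:
            scores[r] += measure(path, r)
    return max(scores, key=scores.get)

def evaluate(pairs, measure):
    """pairs: list of (gaze_paths, regions, ground_truth_region).
    A match counts as tp, a mismatch as fp; returns precision."""
    tp = sum(1 for paths, regions, truth in pairs
             if favorite_region(paths, regions, measure) == truth)
    return tp / len(pairs)
```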
Precision per Fixation Measure
[Chart: precision P (sum of tp and fp assignments) per fixation measure; labeled measures include lastFixationDuration, fixationsBeforeDecision, meanVisitDuration, and fixationDuration]
Adding Boundaries and Weights
 Take eye-tracker inaccuracies into account
 Extension of region boundaries by 13 pixels




 Larger regions more likely to be fixated
 Give weight to regions < 5% of image size
 lastFixationDuration increases to P = 0.65
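The boundary extension can be illustrated with a simple containment test; approximating the polygon by its bounding box is our simplification, not the paper's method:

```python
# Sketch: test whether a fixation falls inside a region extended by a
# pixel margin, approximated with an axis-aligned bounding box.
# The 13 px value is from the slides; the box approximation is ours.
MARGIN = 13

def in_extended_region(fix_x, fix_y, region_bbox, margin=MARGIN):
    """region_bbox: (x_min, y_min, x_max, y_max) of the region polygon."""
    x0, y0, x1, y1 = region_bbox
    return (x0 - margin <= fix_x <= x1 + margin and
            y0 - margin <= fix_y <= y1 + margin)
```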
Weighted Measure Function




 Measure function fm(r) on region r with m=1…13
 Relative region size: sr
 Threshold when weighting is applied: T
 Maximum weighting value: M
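The slide lists only the ingredients of the weighting (fm(r), sr, T, M); the linear ramp below is an assumed functional form, not the paper's exact formula:

```python
# Sketch of a size-based weighting for the measure scores. T and M are
# from the slides; the linear ramp is our assumption, not the exact
# formula from the paper.
T = 0.05   # regions smaller than 5% of the image size get boosted
M = 2.0    # hypothetical maximum weighting value

def weighted_measure(f_m_r, s_r, threshold=T, max_weight=M):
    """f_m_r: raw measure value for region r; s_r: relative region size."""
    if s_r >= threshold:
        return f_m_r                       # large regions: unweighted
    # linearly increase the weight from 1 (at s_r = T) up to M (at s_r = 0)
    weight = 1.0 + (max_weight - 1.0) * (threshold - s_r) / threshold
    return f_m_r * weight
```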
Examples: Tag-Region-Assignments
[Example images with computed tag-region assignments]
Comparison with Baselines
[Chart: precision P of the gaze-based approaches vs. baselines]
 Naïve baseline: largest region r is favorite
 Salience baseline: Itti et al., TPAMI, 20(11), Nov 1998
 Random baseline: randomly select favorite r
 Gaze / Gaze* significantly better (all tests: p < 0.0015)
 Least significant result: χ²(1, N = 124) = 10.723
Effect of Gaze Path Aggregation
[Chart: precision P over the number of gaze paths used]
 Aggregation of precision P for Gaze*
Research Questions


1. Best fixation measure to find the correct
   image region given a specific tag?
   → lastFixationDuration with precision of 65%


2. Can we differentiate two regions in the
   same image?


Experiment Images and Tags
 Randomly selected images from LabelMe
 Images contained at least two tagged regions
 Organized in three sets of 51 images each

 Assigned a tag to each image

 Tags are either “true” or “false”

 Two of the image sets share the same images
 Thus, these images have two tags each

Differentiate Two Objects
 Use first and second tag set to identify different
  objects in the same images
 16 images (of our 51) have two “true” tags
 6 images had two correct regions identified
   → Proportion of 38%

 Average precision for a single object is 63%
   → Correct tag assignment for two regions: 40% (≈ 0.63²)
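The 40% figure is consistent with treating the two assignments as independent events, each at the single-object precision:

```python
# Sanity check for the reported numbers: with single-object precision
# p ≈ 0.63, two independent correct assignments happen with
# p**2 ≈ 0.40, close to the observed 6/16 = 37.5% ≈ 38%.
p_single = 0.63
p_both = p_single ** 2          # ≈ 0.397
observed = 6 / 16               # 0.375
```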


Correctly Differentiated Objects
[Example images where both regions were correctly identified]
Research Questions


1. Best fixation measure to find the correct
   image region given a specific tag?
   → lastFixationDuration with precision of 65%


2. Can we differentiate two regions in the
   same image?
   → Accuracy of 38%

Investigation in 3 Steps



                 3 Interactive Tagging Application

                 2 Gaze + Automatic Segments

                 1 Gaze + Manual Regions


So far …
[Tag “car” + image + gaze path = assigned “car” region]
For 63% of the images, we can identify the correct region.
T. Walber, A. Scherp, and S. Staab: Identifying Objects in Images from Analyzing the Users' Gaze Movements for Provided Tags, MMM, Klagenfurt, Austria, 2012.
Now:
[Tag “car” + image + gaze path = assigned “car” region]
 Automatic segmentation
 LabelMe segments only used as ground truth
T. Walber, A. Scherp, and S. Staab: Can you see it? Two Novel Eye-Tracking-Based Measures for Assigning Tags to Image Regions, MMM, Huangshan, China, 2013.
2nd Step: New Measure
 Automatic segmentation measure
 Berkeley Segmentation Data Set and
  Benchmarks 500 (BSDS500)
 Berkeley's gPb-owt-ucm algorithm
   Segmentation on different hierarchy levels
   Combination of contour detection and
   segmentation
   Oriented Watershed Transform and
    Ultrametric Contour Map
 P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and
 hierarchical image segmentation. IEEE TPAMI, 33(5):898–916, May 2011.
Segmentation Example
 Segmentations with different k = 0 … 0.4




Automatic Segments + Gaze
 Conducted same computations as before
 But on the automatically extracted segments




Results for Different k: P/R/F
[Charts: precision/recall/F-measure over k for the eye-tracking-based automatic segmentation measure (left) and the golden-sections-rule baseline (right)]
Baseline: Golden Sections Rule




                                    (a + b)/a = a/b
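The golden-section lines follow from solving (a + b)/a = a/b, which yields the golden ratio φ ≈ 1.618; how the baseline maps the lines onto favorite points is our assumption:

```python
# The golden ratio from (a + b)/a = a/b: phi = (1 + sqrt(5)) / 2.
# Sketch of golden-section points on an image; using the four line
# intersections as candidate favorite points is our assumption.
import math

PHI = (1 + math.sqrt(5)) / 2          # ≈ 1.618

def golden_section_points(width, height):
    """Return the four intersection points of the golden-section lines."""
    x1, x2 = width / PHI, width - width / PHI
    y1, y2 = height / PHI, height - height / PHI
    return [(x, y) for x in (x2, x1) for y in (y2, y1)]
```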
Best Precision & Best F-measure




 Eye-tracking-based automatic segmentation measure
  significantly outperforms golden sections baseline
 Also shown: eye-tracking-based heatmap measure
  (no automatic segmentation)
Investigation in 3 Steps



                 3 Interactive Tagging Application

                 2 Gaze + Automatic Segments

                 1 Gaze + Manual Regions


3rd Step: Interactive Application
[Screenshot: tagging UI; entered tags “car; house; girl”, current input “tree”]
APPENDIX




Influence of Red Dot




 First 5 fixations, over all subjects and all images
Experiment Data Cleaning
 Manually replaced images where
a) the tag is incomprehensible, requires
   expert knowledge, or is nonsense
b) the tag refers to multiple regions, but not all
   are drawn into the image (e.g., bicycles)
c) the object is obstructed (e.g., a bicycle behind a car)
d) the “false” tag actually refers to a visible part of
   the image and thus was a “true” tag


How to Compute P/R?
 Rfav is calculated from
    Automatic segmentation measure
    Baseline measure
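Pixel-level precision and recall between the favorite region Rfav and the ground truth can be sketched as follows (representing regions as pixel sets is our simplification):

```python
# Sketch of pixel-level precision/recall between the favorite segment
# R_fav and the LabelMe ground-truth region; representing regions as
# sets of (x, y) pixels is our simplification.

def precision_recall(r_fav, ground_truth):
    """Both arguments are sets of (x, y) pixel coordinates."""
    overlap = len(r_fav & ground_truth)
    precision = overlap / len(r_fav) if r_fav else 0.0
    recall = overlap / len(ground_truth) if ground_truth else 0.0
    return precision, recall

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```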





New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

Can you see it? Annotating Image Regions based on Users' Gaze Information

  • 1. Can you see it? Annotating Image Regions based on Users' Gaze Information Ansgar Scherp, Tina Walber, Steffen Staab Technical University of Vienna October 2012
  • 2. Idea Benefiting from Eye-Tracking Information for Image Region Annotation A. Scherp, T. Walber, S. Staab – Identifying Objects in Images Slide 2 of 40
  • 3. Eye-tracking Hardware X60
  • 4. Recorded Data Saccade Fixation
  • 5. Scenario: Image Tagging tree girl car store people sidewalk  Find specific objects in images  Analyze the user's gaze path
  • 6. Investigation in 3 Steps 3 Interactive Tagging Application 2 Gaze + Automatic Segments 1 Gaze + Manual Regions
  • 7. 1st Step 1. Best fixation measure to find the correct image region given a specific tag? 2. Can we differentiate two regions in the same image?
  • 8. 3 Steps Conducted by Users  Look at red blinking dot  Decide whether tag can be seen (“y” or “n”)
  • 9. Dataset  LabelMe community images  Manually drawn polygons  Regions annotated with tags  182,657 images (August 2010) http://labelme.csail.mit.edu/Release3.0/  High-quality segmentation and annotation  Used as ground truth
  • 10. Dataset (continued)
  • 11. Experiment Images and Tags  Randomly selected images from LabelMe  Each image: at least two regions, 1000 × 700 px  Created three sets of 51 images each  Assigned a tag to each image  Tags are either “true” or “false”  “true”  object described by tag can be seen  “false”  object cannot be seen on the image  Keep subjects concentrated during experiment
  • 12. Subjects & Experiment System  30 subjects  21 male, 9 female (age: 22-45, Ø=28.7)  Undergrads (10), PhD (17), office clerks (3)  Experiment system  Simple web page in Internet Explorer  Standard notebook, resolution 1680x1050  Tobii X60 eye-tracker (60 Hz, 0.5° accuracy)
  • 13. Conducting the Experiment  Each user looked at 51 tag-image-pairs  First tag-image-pair dismissed  94.6% correct answers  Roughly equal for true/false tags  ~2.8s avg. until decision (true), ~3.8s avg. (false)  Users felt comfortable during the experiment (avg.: 4.4, SD: 0.75)  Eye tracker had little influence on comfort
  • 14. Pre-processing of Eye-tracking Data  Obtained 799 gaze paths from 30 users where  Image has “true” tag assigned  Users gave correct answers  Fixation extraction  Tobii Studio's velocity & distance thresholds  Fixation: focus on particular point on screen  One fixation inside or near the correct region  656 gaze paths fulfill this requirement (82%)
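The fixation extraction step above can be sketched as a minimal velocity-threshold (I-VT) filter: consecutive gaze samples with low point-to-point velocity are grouped into fixations, fast movements are treated as saccades. The thresholds and the `(t, x, y)` sample layout are illustrative assumptions, not Tobii Studio's actual filter.

```python
# Minimal velocity-threshold (I-VT) fixation extraction sketch.
# Threshold values are illustrative, not Tobii Studio's parameters.

def extract_fixations(samples, max_velocity=50.0, min_duration=0.1):
    """samples: list of (t, x, y) gaze points (seconds, pixels).
    Consecutive samples whose velocity stays below max_velocity (px/s)
    form one fixation candidate; candidates shorter than min_duration
    seconds are discarded. Returns (t_start, t_end, cx, cy) tuples."""
    def close(group):
        if group and group[-1][0] - group[0][0] >= min_duration:
            xs = [p[1] for p in group]
            ys = [p[2] for p in group]
            return (group[0][0], group[-1][0],
                    sum(xs) / len(xs), sum(ys) / len(ys))
        return None

    fixations, group = [], []
    for t, x, y in samples:
        if group:
            t0, x0, y0 = group[-1]
            dt = t - t0
            v = ((x - x0) ** 2 + (y - y0) ** 2) ** 0.5 / dt if dt > 0 else 0.0
            if v > max_velocity:          # saccade: close the current group
                fix = close(group)
                if fix:
                    fixations.append(fix)
                group = []
        group.append((t, x, y))
    fix = close(group)                    # flush the final group
    if fix:
        fixations.append(fix)
    return fixations
```

With 60 Hz samples, a steady gaze followed by a jump to another point yields two fixations with the respective centroids.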
  • 15. Analysis of Gaze Fixations (1)  Applied 13 fixation measures on the 656 paths (2 new, 7 standard Tobii, 4 from the literature)  Fixation measure: function on users' gaze paths  Calculated for each image region, over all users viewing the same tag-image-pair
  • 16. Considered Fixation Measures Nr Name Favorite region r Origin 1 firstFixation No. of fixations before 1st on r Tobii 2 secondFixation No. of fixations before 2nd on r [13] 3 fixationsAfter No. of fixations after last on r [4] 4 fixationsBeforeDecision fixationsAfter, but before decision New 5 fixationsAfterDecision fixationsBeforeDecision and after New 6 fixationDuration Total duration of all fixations on r Tobii 7 firstFixationDuration Duration of first fixation on r Tobii 8 lastFixationDuration Duration of last fixation on r [11] 9 fixationCount Number of fixations on r Tobii 10 maxVisitDuration Max time first fixation until outside r Tobii 11 meanVisitDuration Mean time first fixation until outside r Tobii 12 visitCount No. of fixations until outside r Tobii 13 saccLength Saccade length, before fixation on r [6]
  • 17. Analysis of Gaze Fixations (2)  For every image region (b) the fixation measure is calculated over all gaze paths (c)  Results are summed up per region  Regions ordered according to fixation measure  If favorite region (d) and tag (a) match, result is true positive (tp), otherwise false positive (fp)
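The per-region scoring just described can be sketched as follows, using fixationDuration as the example measure. The rectangular regions and `(duration, x, y)` fixation tuples are simplified stand-ins for LabelMe polygons and real gaze data.

```python
# Sketch of the tag-to-region assignment: sum a fixation measure
# (here: total fixation duration) per region over all users' gaze paths,
# take the region with the highest value as the favorite region, and
# count a true positive when it matches the tagged region.

def point_in_rect(x, y, rect):
    """rect = (x0, y0, x1, y1); stand-in for the polygon test
    on LabelMe regions."""
    x0, y0, x1, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1

def favourite_region(regions, gaze_paths):
    """regions: {name: rect}; gaze_paths: list of fixation lists,
    each fixation = (duration, x, y). Returns the region with the
    highest summed fixation duration."""
    score = {name: 0.0 for name in regions}
    for path in gaze_paths:
        for duration, x, y in path:
            for name, rect in regions.items():
                if point_in_rect(x, y, rect):
                    score[name] += duration
    return max(score, key=score.get)

def precision(assignments):
    """assignments: list of (favourite_region, tagged_region) pairs;
    tp where they match, fp otherwise."""
    tp = sum(1 for fav, tag in assignments if fav == tag)
    return tp / len(assignments)
```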
  • 18. Precision per Fixation Measure [chart: precision P per fixation measure, computed from the sum of tp and fp assignments; among the best: lastFixationDuration, fixationsBeforeDecision, meanVisitDuration, fixationDuration]
  • 19. Adding Boundaries and Weights  Take eye-tracker inaccuracies into account  Extension of region boundaries by 13 pixels  Larger regions more likely to be fixated  Give weight to regions < 5% of image size  lastFixationDuration increases to P = 0.65
  • 20. Weighted Measure Function  Measure function f_m(r) on region r with m = 1…13  Relative region size: s_r  Threshold when weighting is applied: T  Maximum weighting value: M
  • 21. Weighted Measure Function
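The slides give the parameters of the weighted measure function (threshold T, maximum weight M, relative region size s_r) but not its closed form, so the linear interpolation below is only an assumed shape for illustration: regions smaller than the threshold have their measure value scaled up towards M.

```python
# Hedged sketch of a weighted measure function: regions smaller than
# threshold T (5% of the image) get their raw measure value f_m(r)
# scaled up, linearly interpolating between maximum weight M at
# vanishing size and 1 at the threshold. The linear shape is an
# assumption, not the paper's exact formula.

def weighted_measure(f_value, region_size, T=0.05, M=2.0):
    """f_value: raw fixation-measure value f_m(r);
    region_size: s_r, the region's area relative to the image (0..1)."""
    if region_size >= T:
        return f_value                             # large regions unweighted
    weight = M - (M - 1.0) * (region_size / T)     # M at s_r=0, 1 at s_r=T
    return f_value * weight
```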
  • 22. Examples: Tag-Region-Assignments
  • 23. Comparison with Baselines P  Naïve baseline: largest region r is favorite  Salience baseline: Itti et al., TPAMI, 20(11), Nov 1998  Random baseline: randomly select favorite r  Gaze / Gaze* significantly better (all tests: p < 0.0015)  Least significant result: χ²(1, N=124) = 10.723
  • 24. Effect of Gaze Path Aggregation [chart: precision P over the number of gaze paths used]  Aggregation of precision P for Gaze*
  • 25. Research Questions 1. Best fixation measure to find the correct image region given a specific tag?  lastFixationDuration with precision of 65% 2. Can we differentiate two regions in the same image?
  • 26. Experiment Images and Tags  Randomly selected images from LabelMe  Images contained at least two tagged regions  Organized in three sets of 51 images each  Assigned a tag to each image  Tags are either “true” or “false”  Two of the image sets share the same images  Thus, these images have two tags each
  • 27. Differentiate Two Objects  Use first and second tag set to identify different objects in the same images  16 images (of our 51) have two “true” tags  6 images had two correct regions identified  Proportion of 38%  Average precision for single object is 63%  Correct tag assignment for two images: 40%
  • 28. Correctly Differentiated Objects
  • 29. Research Questions 1. Best fixation measure to find the correct image region given a specific tag?  lastFixationDuration with precision of 65% 2. Can we differentiate two regions in the same image?  Accuracy of 38%
  • 30. Investigation in 3 Steps 3 Interactive Tagging Application 2 Gaze + Automatic Segments 1 Gaze + Manual Regions
  • 31. So far: tag (“car”) + image + gaze path = annotated region  For 63% of the images, we can identify the correct region. T. Walber, A. Scherp, and S. Staab: Identifying Objects in Images from Analyzing the Users' Gaze Movements for Provided Tags, MMM, Klagenfurt, Austria, 2012.
  • 32. Now: tag (“car”) + image + gaze path = automatically segmented region  Automatic segmentation  LabelMe segments used as ground truth only T. Walber, A. Scherp, and S. Staab: Can you see it? Two Novel Eye-Tracking-Based Measures for Assigning Tags to Image Regions, MMM, Huangshan, China, 2013.
  • 33. 2nd Step: New Measure  Automatic segmentation measure  Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500)  Berkeley's gPb-owt-ucm algorithm  Segmentation on different hierarchy levels  Combination of contour detection and segmentation  Oriented Watershed Transform and Ultrametric Contour Map P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. IEEE TPAMI, 33(5):898–916, May 2011.
  • 34. Segmentation Example  Segmentations with different k = 0 … 0.4
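How a hierarchy level k yields a segmentation can be illustrated on a toy Ultrametric Contour Map: thresholding boundary strengths at k and labelling the 4-connected sub-threshold components gives the segments, with larger k producing coarser segmentations. This is a didactic sketch, not the gPb-owt-ucm implementation.

```python
# Toy sketch: segmentation at hierarchy level k from an Ultrametric
# Contour Map. Pixels whose boundary strength is below k are grouped
# into 4-connected components; boundary pixels keep the label -1.

def segments_at_level(ucm, k):
    """ucm: 2-D list of boundary strengths in [0, 1]. Returns a label
    grid of the same shape."""
    h, w = len(ucm), len(ucm[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if ucm[sy][sx] < k and labels[sy][sx] == -1:
                stack = [(sy, sx)]            # flood-fill one component
                labels[sy][sx] = next_label
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x),
                                   (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                           and ucm[ny][nx] < k and labels[ny][nx] == -1:
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
                next_label += 1
    return labels
```

A vertical contour of strength 0.3 separates two segments at k = 0.2 but is merged away at k = 0.4, mirroring the coarse-to-fine behaviour on the slide.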
  • 35. Automatic Segments + Gaze  Conducted same computations as before  But on the automatically extracted segments
  • 36. Results for Different k: P/R/F [chart: precision/recall/F-measure over k, comparing the eye-tracking-based automatic segmentation measure with the golden sections rule baseline]
  • 37. Baseline: Golden Sections Rule (a+b)/a = a/b
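The golden sections rule (a+b)/a = a/b defines φ ≈ 1.618. The baseline below is a hedged guess at how such a rule can be operationalised (the paper's exact formulation may differ): favour the region whose centre lies closest to one of the image's four golden-section points.

```python
# Golden-sections baseline sketch. The choice of "closest region centre
# to a golden-section point" is an illustrative assumption.

PHI = (1 + 5 ** 0.5) / 2   # satisfies (a+b)/a = a/b

def golden_points(width, height):
    """The four intersections of the horizontal and vertical
    golden-section lines of the image."""
    xs = (width / PHI, width - width / PHI)
    ys = (height / PHI, height - height / PHI)
    return [(x, y) for x in xs for y in ys]

def golden_baseline(regions, width, height):
    """regions: {name: (cx, cy)} region centres. Returns the region
    closest to any golden-section point."""
    points = golden_points(width, height)
    def dist(c):
        return min(((c[0] - px) ** 2 + (c[1] - py) ** 2) ** 0.5
                   for px, py in points)
    return min(regions, key=lambda name: dist(regions[name]))
```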
  • 38. Best Precision & Best F-measure  Eye-tracking-based automatic segmentation measure significantly outperforms golden sections baseline  Also shown: eye-tracking-based heatmap measure (no automatic segmentation)
  • 39. Investigation in 3 Steps 3 Interactive Tagging Application 2 Gaze + Automatic Segments 1 Gaze + Manual Regions
  • 40. 3rd Step: Interactive Application car ; house ; girl ► tree_
  • 41. APPENDIX
  • 42. Influence of Red Dot  First 5 fixations, over all subjects and all images
  • 43. Experiment Data Cleaning  Manually replaced images with a) Tags that are incomprehensible, require expert knowledge, or are nonsense b) Tags that refer to multiple regions of which not all are drawn into the image (e.g., bicycle) c) Obstructed objects (bicycle behind a car) d) “False” tags that actually refer to a visible part of the image and thus were “true” tags
  • 44. How to Compute P/R?  R_fav is calculated from  Automatic segmentation measure  Baseline measure
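A minimal precision/recall/F-measure computation matching the appendix slide; the tp/fp/fn bookkeeping (fn counted as tag-image pairs for which no segment was assigned at all) is an assumption about how the counts are collected.

```python
# Precision, recall, and F-measure from tp/fp/fn counts over
# tag-to-segment assignments. The fn definition is an assumption.

def precision_recall_f(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```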