Time sensitive egocentric image retrieval for finding objects in lifelogs

https://imatge.upc.edu/web/publications/time-sensitive-egocentric-image-retrieval-fidings-objects-lifelogs

This work explores diverse practices for conducting an object search over large collections of egocentric images, taking their temporal information into account. The target application is identifying where personal belongings were lost or forgotten. We develop a pipeline-structured system. First, the images of the day being scanned are sorted by their probability of depicting the forgotten object; this stage is solved with an existing visual search engine based on deep learning features. Second, a learned threshold selects the top-ranked images as candidates to contain the object. Finally, the candidates are reranked based on temporal and diversity criteria. We also build a validation environment for assessing the system's performance and finding the optimal configuration of its parameters. Due to the lack of directly comparable prior work, this thesis proposes a novel evaluation framework and metric for the problem.
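For orientation, the three stages described above can be sketched in a few lines of Python. Everything below (the Image record, retrieve, score_fn, threshold) is an illustrative placeholder rather than the thesis code; in the actual system the scoring stage is the deep-learning-based visual search engine, and the reranking stage applies the temporal and diversity criteria described in the slides (sketched further after the slide transcript).

```python
from dataclasses import dataclass

# Illustrative sketch of the three-stage pipeline; all names are placeholders.

@dataclass
class Image:
    path: str
    timestamp: float        # seconds since the start of the scanned day
    score: float = 0.0      # visual similarity to the query object

def retrieve(day_images, score_fn, threshold):
    # 1) Visual ranking: sort the day's images by similarity to the query.
    for img in day_images:
        img.score = score_fn(img)
    ranked = sorted(day_images, key=lambda i: i.score, reverse=True)

    # 2) Candidate selection: keep the top-ranked images whose score passes
    #    a learnt threshold.
    candidates = [i for i in ranked if i.score >= threshold]

    # 3) Temporal-aware reranking: order candidates by time stamp (latest
    #    first, since the last sighting of the object is the most useful);
    #    the diversity criterion is sketched after the slide transcript.
    candidates.sort(key=lambda i: i.timestamp, reverse=True)
    return candidates
```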


Time sensitive egocentric image retrieval for finding objects in lifelogs

  1. 1. Time-Sensitive Egocentric Image Retrieval for Finding Objects in Lifelogs. Author: Cristian Reyes. Advisors: Xavier Giró, Eva Mohedano, Kevin McGuinness. 14th July 2016
  2. 2. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results & Conclusions 5. Conclusions 2
  3. 3. Lifelogging Extract value from this new data 3
  4. 4. 4
  5. 5. Can’t find my phone Motivation 5
  6. 6. Egocentric cameras may help. Last time seen: at the CAFE. Hundreds of images to review! 6
  7. 7. Task. Goal: retrieve a useful image to find the object. How: exploit visual and temporal information. 7
  8. 8. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results & Conclusions 5. Conclusions 8
  9. 9. System Overview 9
  10. 10. Visual Ranking - Descriptors: Convolutional Neural Networks. A CNN (VGG16) is used for feature extraction, taking the conv5_1 activations. [Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O’Connor, and Xavier Giro-i Nieto. Bags of local convolutional features for scalable instance search. In Proceedings of the ACM International Conference on Multimedia. ACM, 2016.]
  11. 11. Visual Ranking - Descriptors: Bag of Words over the local convolutional features (see the encoding sketch after the transcript). [Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O’Connor, and Xavier Giro-i Nieto. Bags of local convolutional features for scalable instance search. In Proceedings of the ACM International Conference on Multimedia. ACM, 2016.]
  12. 12. Visual Ranking - Queries 5 visual examples 12
  13. 13. Visual Ranking - Queries 3 masking strategies Full Image (FI) Hard Bounding Box (HBB) Soft Bounding Box (SBB) 13
  14. 14. Visual Ranking - Target: 3 masking strategies: Full Image (FI), Center Bias (CB), and Saliency Mask (SM) derived from a predicted saliency map (a weighting sketch follows the transcript). [Junting Pan, Kevin McGuinness, Elisa Sayrol, Noel O’Connor, and Xavier Giro-i Nieto. Shallow and deep convolutional networks for saliency prediction. CVPR 2016.]
  15. 15. System Overview 15
  16. 16. Candidate Selection: 2 thresholding strategies with learnt parameters. Absolute: Threshold on Visual Similarity Scores (TVSS). Adaptive: Nearest Neighbor Distance Ratio (NNDR); see the selection sketch after the transcript. [D. G. Lowe. Distinctive image features from scale-invariant keypoints, 2004.]
  17. 17. System Overview 17
  18. 18. Temporal-aware reranking: time-stamp sorting (a reranking sketch, including the diversity step, follows the transcript). 18
  19. 19. Redundancy 19
  20. 20. Temporal Diversity 20
  21. 21. Effect of Diversity New locations introduced 21
  22. 22. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results & Conclusions 5. Conclusions 22
  23. 23. Evaluate quantitatively and compare different configurations, which requires: 1. a dataset of images, 2. queries, 3. annotations, 4. a metric. 23
  24. 24. Dataset. EDUB [1]: 4912 images, 4 users, 2 days/user, Narrative Clip 1 camera. NTCIR-Lifelog [2]: 88185 images, 3 users, 30 days/user, Autographer camera (wide-angle lens). Both capture general behavior. [1] Marc Bolaños and Petia Radeva. Ego-object discovery. [2] C. Gurrin, H. Joho, F. Hopfgartner, L. Zhou, and R. Albatal. NTCIR Lifelog: The first test collection for lifelog research.
  25. 25. 25 EDUB NTCIR-Lifelog
  26. 26. Dataset - Query Definition 26
  27. 27. Annotation Strategy. 1. Annotate the 3 last occurrences → the system is expected to find all 3. 2. Annotate only the last occurrence. 3. Annotate the whole scene of the last occurrence → neighbor images may also be helpful. 27
  28. 28. Metric: Mean Average Precision (considers all relevant images) and Mean Reciprocal Rank (considers only the best-ranked relevant image); per-query sketches follow the transcript. 28
  29. 29. Training: the dataset days are split into training and test partitions (9 days and 15 days). 29
  30. 30. Training - Codebook: ~13,500 images → ~18,000,000 local feature vectors → 25,000 clusters (a fitting sketch follows the transcript). 30
  31. 31. Training - Thresholds 31
  32. 32. Training - Thresholds: the same optimal thresholds are obtained for time-stamp sorting and interleaving, and for the Full Image, Center Bias, and Saliency Mask target approaches. 32
  33. 33. Parameters summary 33
  34. 34. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results & Conclusions 5. Conclusions 34
  35. 35. Discussion & Conclusions: comparison of query approaches (FI, SBB, HBB). 35
  36. 36. Discussion & Conclusions: comparison of target approaches (Full Image, Saliency Mask, Center Bias). 36
  37. 37. Discussion & Conclusions: thresholding (Adaptive: NNDR vs. Absolute: TVSS) combined with time-stamp reordering. 37
  38. 38. Discussion & Conclusions: thresholding (Adaptive: NNDR vs. Absolute: TVSS) combined with interleaving. 38
  39. 39. Discussion & Conclusions 39
  40. 40. Discussion & Conclusions The system helps the user 40
  41. 41. Diversity helps Discussion & Conclusions 41
  42. 42. Discussion & Conclusions Objects are not always in the center 42
  43. 43. Discussion & Conclusions: Saliency Maps do not help in some examples, but they do in others (qualitative examples shown on the slide). 43
  44. 44. Discussion & Conclusions Parameters are not independent. 44
  45. 45. 45
  46. 46. 46
  47. 47. 47 Lifelogging Tools and Applications
  48. 48. Acknowledgements Financial Support Noel E. O’Connor Cathal Gurrin Albert Gil 48
  49. 49. 49
  50. 50. Results for f(Q) = Saliency Mask. Using Saliency Mask for target: NNDR 0.201, TVSS 0.217, NNDR + I 0.210, TVSS + I 0.226. Using Full Image for target: NNDR 0.176, TVSS 0.271, NNDR + I 0.190, TVSS + I 0.279. Using Center Bias for target: NNDR 0.162, TVSS 0.198, NNDR + I 0.173, TVSS + I 0.213. 50
  51. 51. Visual Ranking - Descriptors: Bag of Words, an example using text. A) There is a red ball in a white box. B) The red box contains a white ball. Term counts (A / B): there 1/0, is 1/0, a 2/1, red 1/1, ball 1/1, in 1/0, white 1/1, box 1/1, the 0/1, contains 0/1. Cosine Similarity Score = 0.68 (reproduced in a sketch after the transcript). 51
  52. 52. 52 +
  53. 53. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results 5. Conclusions 53
  54. 54. Conclusions ● The system accomplishes its task. ● Thresholding and temporal reranking have improved performance. ● Center Bias does not necessarily improve performance. ● Saliency Maps have improved performance. ● Parameters are not independent when measuring with A-MRR. ● A good baseline for further research. 54
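The sketches below expand on individual slides; all of them are illustrative Python under stated assumptions, not code from the thesis. Slides 10 and 11 describe the visual descriptors: conv5_1 activations of a VGG16 network are encoded as a bag of visual words, following the cited Mohedano et al. paper. A minimal sketch of that encoding, assuming a precomputed k-means codebook and simple hard assignment:

```python
import numpy as np

def bow_descriptor(conv5_1: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Encode a conv5_1 activation map of shape (H, W, C) as an L2-normalised
    bag-of-visual-words histogram over a k-means codebook of shape (K, C).

    Each spatial location is treated as a local feature and hard-assigned to
    its nearest centroid; the assignments are then counted. Sketch only.
    """
    h, w, c = conv5_1.shape
    feats = conv5_1.reshape(-1, c)                               # (H*W, C)
    # Squared Euclidean distances without building a (H*W, K, C) tensor:
    # ||f - c||^2 = ||f||^2 + ||c||^2 - 2 f.c^T
    d = ((feats ** 2).sum(1, keepdims=True)
         + (codebook ** 2).sum(1)
         - 2.0 * feats @ codebook.T)
    words = d.argmin(axis=1)                                     # (H*W,)
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    return hist / (np.linalg.norm(hist) + 1e-12)

# Query and database images are then compared by the dot product (cosine
# similarity) of their normalised histograms.
```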
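Slides 13 and 14 list the masking strategies for query and target images. One way to read them is that each spatial location votes into the histogram with a weight taken from a mask: all ones (Full Image), a binary box (Hard Bounding Box), a blurred box (Soft Bounding Box), a centred Gaussian (Center Bias), or a predicted saliency map (Saliency Mask). The weighted-histogram and Gaussian helpers below are a hedged illustration of that idea, not the exact formulation used in the thesis:

```python
import numpy as np

def weighted_bow(words: np.ndarray, weights: np.ndarray, vocab_size: int) -> np.ndarray:
    """Accumulate visual-word assignments `words` (H*W,) into a histogram,
    weighting each spatial location by `weights` (H*W,). The weight map
    stands in for the masking strategies on the slides. Illustration only."""
    hist = np.zeros(vocab_size)
    np.add.at(hist, words, weights)
    return hist / (np.linalg.norm(hist) + 1e-12)

def center_bias(h: int, w: int, sigma: float = 0.5) -> np.ndarray:
    """Centred Gaussian weight map over an (h, w) activation grid, flattened
    in row-major order to match the word assignments."""
    ys = (np.arange(h) - (h - 1) / 2.0) / h
    xs = (np.arange(w) - (w - 1) / 2.0) / w
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2)).reshape(-1)
```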
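Slide 16 gives two candidate-selection strategies: an absolute Threshold on Visual Similarity Scores (TVSS) and an adaptive rule inspired by Lowe's Nearest Neighbor Distance Ratio (NNDR). The absolute rule is straightforward; the adaptive sketch below keeps images whose score stays within a ratio of the best score, which is one plausible reading of NNDR in this retrieval setting rather than the thesis' exact formulation:

```python
import numpy as np

def select_tvss(scores, threshold: float) -> np.ndarray:
    """TVSS: keep every image whose (descending) similarity score passes a
    fixed, learnt threshold. Returns the kept rank positions."""
    return np.flatnonzero(np.asarray(scores) >= threshold)

def select_nndr(scores, ratio: float = 0.8) -> np.ndarray:
    """Adaptive selection in the spirit of Lowe's distance-ratio test: keep
    top-ranked images while their score stays within `ratio` of the best one.
    An assumption used for illustration, not the exact NNDR rule."""
    scores = np.asarray(scores, dtype=float)      # assumed descending
    ok = scores >= ratio * scores[0]
    cut = len(scores) if ok.all() else int(np.argmin(ok))
    return np.arange(cut)                         # selection is a prefix
```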
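Slides 18 to 21 cover the temporal-aware reranking: candidates are ordered by time stamp, redundant shots of the same moment are demoted, and temporal diversity brings new locations forward. The greedy gap-based rule below is an assumption used for illustration; the thesis' exact redundancy and diversity criteria live in the slides' figures rather than their text:

```python
def temporal_diverse_rerank(candidates, min_gap=600.0):
    """Rerank `candidates`, a list of (image_id, timestamp, score) tuples
    that survived candidate selection.

    Sort by time stamp (latest first, since the last sighting of the object
    is the most useful), then greedily keep images at least `min_gap` seconds
    away from everything already kept, so near-duplicate shots do not crowd
    out new locations. Demoted images go to the tail rather than being lost.
    """
    by_time = sorted(candidates, key=lambda c: c[1], reverse=True)
    diverse, redundant = [], []
    for cand in by_time:
        if all(abs(cand[1] - kept[1]) >= min_gap for kept in diverse):
            diverse.append(cand)
        else:
            redundant.append(cand)
    return diverse + redundant
```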
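Slide 28 names the two metrics: Mean Average Precision, which looks at all relevant images, and Mean Reciprocal Rank, which looks only at the best-ranked relevant image. Per-query sketches follow; the means are taken over queries, and the AP version assumes the ranking contains every relevant image:

```python
def average_precision(relevance):
    """AP of one ranked list, given binary relevance flags in rank order.
    Assumes every relevant image appears somewhere in the ranking."""
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def reciprocal_rank(relevance):
    """1 / rank of the first relevant image; rewards returning one useful
    image near the top, which matches the task of finding a lost object."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

# Example: relevance flags [0, 1, 0, 1] give AP = (1/2 + 2/4) / 2 = 0.5
# and reciprocal rank = 1/2.
```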
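Slide 30 reports the codebook training figures: roughly 13,500 training images, about 18 million local feature vectors, and 25,000 clusters. A sketch of how such a codebook could be fitted with mini-batch k-means; scikit-learn is an illustrative choice here, as the slide does not say which implementation was used:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def fit_codebook(local_features: np.ndarray, vocab_size: int = 25_000) -> np.ndarray:
    """Fit a visual codebook on local conv5_1 descriptors stacked as an
    (N, C) array. With ~18M vectors, a mini-batch variant (or a subsample)
    keeps the clustering tractable. Illustrative tooling choice."""
    kmeans = MiniBatchKMeans(n_clusters=vocab_size, batch_size=10_000,
                             n_init=3, random_state=0)
    kmeans.fit(local_features)
    return kmeans.cluster_centers_   # (vocab_size, C), usable by bow_descriptor
```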
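Slide 51 illustrates the bag-of-words idea with two sentences and reports a cosine similarity of 0.68. A few lines reproduce the slide's number from its term counts:

```python
import numpy as np

# Term-count vectors from slide 51, over the shared vocabulary
# [there, is, a, red, ball, in, white, box, the, contains].
a = np.array([1, 1, 2, 1, 1, 1, 1, 1, 0, 0], dtype=float)  # sentence A
b = np.array([0, 0, 1, 1, 1, 0, 1, 1, 1, 1], dtype=float)  # sentence B

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(round(cosine, 2))   # -> 0.68, matching the slide
```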
