Time sensitive egocentric image retrieval for finding objects in lifelogs

  1. Time-Sensitive Egocentric Image Retrieval for Finding Objects in Lifelogs. Author: Cristian Reyes. Advisors: Xavier Giró, Eva Mohedano, Kevin McGuinness. 14th July 2016
  2. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results 5. Conclusions 2
  3. Lifelogging Extract value from this new data 3
  4. 4
  5. Motivation: Can’t find my phone 5
  6. Egocentric cameras may help. Review hundreds of images! Last time seen: at the CAFE 6
  7. Task. Goal: Retrieve a useful image to find the object. How: Exploiting visual and temporal information. 7
  8. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results 5. Conclusions 8
  9. System Overview 9
  10. Visual Ranking - Descriptors: Convolutional Neural Networks. A CNN (VGG16) is used for feature extraction from layer conv5_1. 10 Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O’Connor, and Xavier Giro-i Nieto. Bags of local convolutional features for scalable instance search. In Proceedings of the ACM International Conference on Multimedia. ACM, 2016.
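A minimal sketch of this feature-extraction step in PyTorch/torchvision, assuming a standard pre-trained VGG16 with ImageNet preprocessing; the layer index and the input resolution below are illustrative assumptions, not values taken from the slides.

```python
# Sketch: extract conv5_1 local features from VGG16 (torchvision) for BoW encoding.
# Assumptions: standard ImageNet preprocessing; in torchvision's VGG16, the layer at
# index 24 of .features is conv5_1; the input resolution below is only an example.
import torch
from torchvision import models, transforms
from PIL import Image

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
conv5_1 = torch.nn.Sequential(*list(vgg.features.children())[:25])  # up to and incl. conv5_1

preprocess = transforms.Compose([
    transforms.Resize((336, 448)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def local_features(path):
    """Return an (H*W, 512) array of local descriptors for one lifelog image."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        fmap = conv5_1(x)                              # (1, 512, H, W)
    return fmap.squeeze(0).permute(1, 2, 0).reshape(-1, 512).numpy()
```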
  11. Visual Ranking - Descriptors: Bag of Words 11 Eva Mohedano, Amaia Salvador, Kevin McGuinness, Ferran Marques, Noel E. O’Connor, and Xavier Giro-i Nieto. Bags of local convolutional features for scalable instance search. In Proceedings of the ACM International Conference on Multimedia. ACM, 2016.
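A minimal sketch of the Bag-of-Words encoding over those local features, assuming a pre-trained k-means codebook (see the codebook-training sketch further down); images are ranked by the cosine similarity of their L2-normalised word histograms.

```python
# Sketch: encode an image as a bag of visual words and rank targets by cosine similarity.
# Assumption: `codebook` is a fitted k-means model with 25,000 centroids (visual words).
import numpy as np

def bow_histogram(descriptors, codebook, n_words=25000):
    """Assign each local descriptor to its nearest visual word and count occurrences."""
    words = codebook.predict(descriptors.astype(np.float32))
    hist = np.bincount(words, minlength=n_words).astype(np.float32)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist          # L2-normalised histogram

def rank_by_cosine(query_hist, target_hists):
    """Rows of `target_hists` are L2-normalised histograms; returns indices by score."""
    scores = target_hists @ query_hist
    order = np.argsort(-scores)
    return order, scores[order]
```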
  12. Visual Ranking - Queries 5 visual examples 12
  13. Visual Ranking - Queries 3 masking strategies Full Image (FI) Hard Bounding Box (HBB) Soft Bounding Box (SBB) 13
  14. Visual Ranking - Target 3 masking strategies Full Image (FI) Center Bias (CB) Saliency Mask (SM) Saliency Map 14 Junting Pan, Kevin McGuinness, Elisa Sayrol, Noel O’Connor, and Xavier Giro-i Nieto. Shallow and deep convolutional networks for saliency prediction. CVPR 2016.
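One way the target masking strategies could be realised is by weighting each spatial position of the feature map before counting its assigned visual word: uniform weights for Full Image, a Gaussian prior for Center Bias, and a saliency map (e.g. from the model cited above) resized to the feature-map grid for Saliency Mask. The Gaussian width and the weighted-count formulation below are illustrative assumptions, not the slides' exact formulation.

```python
# Sketch: spatial weighting of BoW assignments for the target images.
# FI: uniform weights; CB: Gaussian centre prior; SM: saliency map resized to the
# feature-map grid. The sigma value below is an illustrative assumption.
import numpy as np

def center_bias(h, w, sigma=0.4):
    """Gaussian centre prior over an h x w feature-map grid (normalised coordinates)."""
    ys, xs = np.mgrid[0:h, 0:w]
    ys = (ys - (h - 1) / 2) / h
    xs = (xs - (w - 1) / 2) / w
    return np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))

def weighted_bow(word_map, weights, n_words=25000):
    """word_map: (H, W) visual-word indices; weights: (H, W) spatial weights."""
    hist = np.zeros(n_words, dtype=np.float32)
    np.add.at(hist, word_map.ravel(), weights.ravel())
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```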
  15. System Overview 15
  16. Candidate Selection 2 thresholding strategies: Absolute Threshold on Visual Similarity Scores (TVSS), Adaptive Nearest Neighbor Distance Ratio (NNDR). Parameters LEARNT 16 D. G. Lowe. Distinctive image features from scale-invariant keypoints, 2004.
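A minimal sketch of the two candidate-selection strategies applied to the visual ranking scores; the threshold and ratio values below are placeholders, since the actual parameters are learnt on the training split, and the adaptive rule shown is only one criterion in the spirit of Lowe's ratio test.

```python
# Sketch: candidate selection from the visual ranking.
# TVSS keeps everything above an absolute similarity threshold; the adaptive NNDR-style
# rule keeps candidates scoring close to the best one. Values below are placeholders.
import numpy as np

def select_tvss(scores, threshold=0.2):
    """Absolute Threshold on Visual Similarity Scores."""
    return np.flatnonzero(np.asarray(scores) >= threshold)

def select_nndr(scores, ratio=0.8):
    """Adaptive selection: keep candidates with score >= ratio * best score."""
    scores = np.asarray(scores)
    if scores.size == 0:
        return np.array([], dtype=int)
    return np.flatnonzero(scores >= ratio * scores.max())
```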
  17. System Overview 17
  18. Temporal-aware reranking: Time-stamp sorting 18
  19. Redundancy 19
  20. Temporal Diversity 20
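A minimal sketch of how time-stamp reranking and a simple temporal-diversity filter could be combined: the selected candidates are reordered from most recent to oldest, and frames taken too close in time to an already-kept one are dropped as redundant, so that new locations can enter the top of the ranking. The 10-minute gap is an illustrative assumption, not a value from the slides.

```python
# Sketch: time-stamp reranking with a simple temporal-diversity filter.
# Each candidate is (image_id, timestamp, score); the min_gap used to suppress
# redundant neighbouring frames is an illustrative assumption.
from datetime import timedelta

def temporal_rerank(candidates, min_gap=timedelta(minutes=10)):
    """Sort candidates from most recent to oldest and drop near-duplicate moments."""
    newest_first = sorted(candidates, key=lambda c: c[1], reverse=True)
    diverse, last_kept = [], None
    for image_id, ts, score in newest_first:
        if last_kept is None or (last_kept - ts) >= min_gap:
            diverse.append((image_id, ts, score))
            last_kept = ts
    return diverse
```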
  21. Effect of Diversity New locations introduced 21
  22. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results 5. Conclusions 22
  23. Evaluate quantitatively: compare different configurations. 1. Dataset of images 2. Queries 3. Annotations 4. Metric 23
  24. Dataset: EDUB [1] ● 4912 images ● 4 users ● 2 days/user ● Narrative Clip (general behavior); NTCIR-Lifelog [2] ● 88185 images ● 3 users ● 30 days/user ● Autographer (wide angle lens) 24 [1] Marc Bolaños and Petia Radeva. Ego-object discovery. [2] C. Gurrin, H. Joho, F. Hopfgartner, L. Zhou, and R. Albatal. NTCIR Lifelog: The first test collection for lifelog research.
  25. 25 EDUB NTCIR-Lifelog
  26. Dataset - Query Definition 26
  27. Annotation Strategy 1. Annotate the 3 last occurrences 2. Annotate only the last occurrence 3. Annotate the whole scene of the last occurrence → Neighbor images may also be helpful → The system is expected to find all 3 27
  28. Metrics: Mean Average Precision (all relevant images), Mean Reciprocal Rank (the best ranked relevant image) 28
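A minimal sketch of how both measures are usually computed from a ranked list and the set of images annotated as relevant; averaging over all queries gives mAP and MRR.

```python
# Sketch: Average Precision (uses all relevant images) and Reciprocal Rank
# (uses only the best ranked relevant image) for a single query.
def average_precision(ranked_ids, relevant_ids):
    relevant, hits, precision_sum = set(relevant_ids), 0, 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked_ids, relevant_ids):
    relevant = set(relevant_ids)
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            return 1.0 / rank
    return 0.0
```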
  29. Training: data split into TRAINING and TEST partitions (9 days / 15 days) 29
  30. Training - Codebook ~ 13,500 images ~18,000,000 feature vectors 25,000 clusters 30
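A minimal sketch of how a 25,000-word codebook could be fitted on the roughly 18 million local descriptors with mini-batch k-means; the batch size and random seed are illustrative choices, not values from the slides.

```python
# Sketch: fit the visual-word codebook on the stacked training descriptors.
# `descriptors` is an (N, 512) array of conv5_1 local features from ~13,500 images.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def train_codebook(descriptors, n_words=25000):
    codebook = MiniBatchKMeans(n_clusters=n_words, batch_size=10000, random_state=0)
    codebook.fit(descriptors.astype(np.float32))
    return codebook
```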
  31. Training - Thresholds 31
  32. Training - Thresholds 32 The same optimal thresholds are found for time-stamp sorting and interleaving, and for the Full Image, Center Bias and Saliency Mask targets.
  33. Parameters summary 33
  34. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results 5. Conclusions 34
  35. Discussion & Conclusions 35 Query approach comparison: FI, SBB, HBB
  36. Discussion & Conclusions 36 Target approach comparison: Full Image, Saliency Mask, Center Bias
  37. Discussion & Conclusions 37 Thresholding (Adaptive: NNDR vs. Absolute: TVSS) + Time-stamp reordering
  38. Discussion & Conclusions 38 Thresholding (Adaptive: NNDR vs. Absolute: TVSS) + Interleaving
  39. Discussion & Conclusions 39
  40. Discussion & Conclusions The system helps the user 40
  41. Discussion & Conclusions Diversity helps 41
  42. Discussion & Conclusions Objects are not always in the center 42
  43. Discussion & Conclusions Saliency Maps do not help here But they do here 43
  44. Discussion & Conclusions Parameters are not independent. 44
  45. 45
  46. 46
  47. Lifelogging Tools and Applications 47
  48. Acknowledgements Financial Support Noel E. O’Connor Cathal Gurrin Albert Gil 48
  49. 49
  50. 50 f(Q) = Saliency Mask
Using Saliency Mask for target: NNDR 0.201 / TVSS 0.217 / NNDR + I 0.210 / TVSS + I 0.226
Using Full Image for target: NNDR 0.176 / TVSS 0.271 / NNDR + I 0.190 / TVSS + I 0.279
Using Center Bias for target: NNDR 0.162 / TVSS 0.198 / NNDR + I 0.173 / TVSS + I 0.213
  51. Visual Ranking - Descriptors: Bag of Words, an example using text. A) There is a red ball in a white box. B) The red box contains a white ball. Word counts (Sentence A, Sentence B): there (1, 0), is (1, 0), a (2, 1), red (1, 1), ball (1, 1), in (1, 0), white (1, 1), box (1, 1), the (0, 1), contains (0, 1). Cosine Similarity Score = 0.68 51
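The toy example can be reproduced directly: counting the words of both sentences and taking the cosine of the two count vectors gives the 0.68 shown on the slide.

```python
# Reproduce the toy text example: cosine similarity between two word-count vectors.
import numpy as np
from collections import Counter

a = Counter("there is a red ball in a white box".split())
b = Counter("the red box contains a white ball".split())
vocab = sorted(set(a) | set(b))
va = np.array([a[w] for w in vocab], dtype=float)
vb = np.array([b[w] for w in vocab], dtype=float)

cosine = va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))
print(round(cosine, 2))  # 0.68
```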
  52. 52 +
  53. Outline 1. Motivation 2. Methodology 3. Experiments 4. Results 5. Conclusions 53
  54. Conclusions ● The system accomplishes its task. ● Thresholding and temporal reranking have improved performance. ● Center Bias does not necessarily improve performance. ● Saliency Maps have improved performance. ● Parameters are not independent when measuring with A-MRR. ● Good baseline for further research. 54