More details: https://imatge.upc.edu/web/publications/visual-summary-egocentric-photostreams-representative-keyframes
Author: Ricard Mestre
Thesis advisor: Xavier Giró-i-Nieto
This Final Degree Work addresses the problem of visually summarizing sets of images captured by an egocentric camera for lifelogging purposes. First, we group the images (which represent a day of a person’s life) into distinguishable and meaningful events, using visual features extracted with the Caffe software. Second, we describe techniques for extracting representative images through similarity graphs. Finally, we analyze the assessment scores given by different users to whom we presented the visual summaries obtained in this project. We achieve 60% favorable opinions on the quality of the visual summaries produced with the techniques developed in this project.
5. Motivation and goals
● Lifelogging with the Narrative Clip
● Up to 2000 images/day
● A visual summary can help the memory of people affected by Alzheimer’s disease
6. Motivation and goals
● Extract a visual summary of a day
○ Clustering strategy for event detection
○ Automatic selection of representative frames
8. State of the art
Chandrasekar et al., “Efficient retrieval from large-scale egocentric visual data using a sparse graph representation” (CVPR Workshop 2014)
9. State of the art
Lu and Grauman, “Story-driven summarization for egocentric video” (CVPR 2013)
12. Feature extraction
● Convolutional Neural Networks (CNNs) trained on ImageNet
Jia et al, “Caffe: Convolutional Architecture for Fast Feature Embedding” (ACM MM 2014)
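The deck uses Caffe to extract deep features from each photo. As a purely illustrative sketch (not the Caffe pipeline, and with random rather than ImageNet-trained filters), the following numpy toy shows how a convolution layer, a ReLU, and global average pooling turn an image into a fixed-length feature vector of the kind the clustering step consumes:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid 2-D convolution of a single-channel image with one kernel."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def toy_cnn_features(image, n_filters=8, seed=0):
    """One conv layer with random 3x3 filters, ReLU, then global average
    pooling. Returns an n_filters-dimensional feature vector. A trained
    CNN would use learned filters and many stacked layers instead."""
    rng = np.random.default_rng(seed)
    filters = rng.standard_normal((n_filters, 3, 3))
    feats = []
    for k in filters:
        fmap = np.maximum(conv2d_valid(image, k), 0.0)  # ReLU
        feats.append(fmap.mean())                       # global avg pooling
    return np.array(feats)
```

In the actual system, the analogous vector would come from an intermediate layer of a Caffe network pre-trained on ImageNet.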
14. Clustering
● Obtain separate events
● Agglomerative clustering
● Cutoff parameter
Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., & Radeva, P. (2015). “R-Clustering for Egocentric Video Segmentation”. In 7th Iberian Conference on Pattern Recognition and Image Analysis (accepted).
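A hedged sketch of the idea on this slide: agglomerative clustering over per-frame descriptors, with a cutoff parameter that stops merging and so fixes the number of events. This version is temporally constrained (only adjacent segments merge, which is natural for event detection); it is an illustration of the technique, not the thesis's exact algorithm:

```python
import numpy as np

def segment_events(features, cutoff):
    """Temporally constrained agglomerative clustering.
    features: (n_frames, d) array of per-frame descriptors.
    Adjacent segments are merged while the distance between their mean
    descriptors stays below `cutoff`; a larger cutoff yields fewer,
    coarser events. Returns (start, end) index pairs, end exclusive."""
    segments = [(i, i + 1) for i in range(len(features))]
    means = [features[i].astype(float) for i in range(len(features))]
    while len(segments) > 1:
        # Distance between every pair of temporally adjacent segments.
        dists = [np.linalg.norm(means[i] - means[i + 1])
                 for i in range(len(segments) - 1)]
        k = int(np.argmin(dists))
        if dists[k] > cutoff:       # cutoff reached: stop merging
            break
        s0, s1 = segments[k], segments[k + 1]
        n0, n1 = s0[1] - s0[0], s1[1] - s1[0]
        means[k] = (means[k] * n0 + means[k + 1] * n1) / (n0 + n1)
        segments[k:k + 2] = [(s0[0], s1[1])]
        del means[k + 1]
    return segments
```

For example, six frames whose descriptors form two clearly separated groups are segmented into two events:

```python
feats = np.array([[0.0], [0.1], [0.05], [5.0], [5.1], [5.05]])
segment_events(feats, cutoff=1.0)  # → [(0, 3), (3, 6)]
```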
30. Contents
● Collaboration
● Motivation and goals
● State of the art
● Methodology
● Evaluation
○ Database
○ Clustering
○ Keyframe extraction
● Conclusions and future work
31. Evaluation: Database
● 5 days
● 3 users
● 4005 images
● Ground truth available
Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., & Radeva, P. (2015). “R-Clustering for Egocentric Video Segmentation”. In 7th Iberian Conference on Pattern Recognition and Image Analysis (accepted).
32. Contents
● Collaboration
● Motivation and goals
● State of the art
● Methodology
● Evaluation
○ Database
○ Clustering
■ Jaccard index
■ Linkage effect
■ Relabelling effect
○ Keyframe extraction
● Conclusions and future work
36. Contents
● Collaboration
● Motivation and goals
● State of the art
● Methodology
● Evaluation
○ Database
○ Clustering
○ Keyframe extraction
■ Blind taste test
■ Representative quality of keyframe
■ Summary validations
● Conclusions and future work
38. Methodology: Blind taste test
Lu and Grauman, “Story-driven summarization for egocentric video” (CVPR 2013)
Figure: brandchannel.com
Blind taste test: quality of keyframe
48. Conclusions and future work
● New methodology taking into account visual and temporal information
● Keyframe extraction through graph-based approaches
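One way to sketch the graph-based keyframe extraction mentioned above: build a similarity graph whose nodes are an event's frames and whose edge weights are cosine similarities, then pick the frame with the largest weighted degree, i.e. the one most similar to the rest of the event. The centrality criterion here is an assumption for illustration, not necessarily the thesis's exact scoring rule:

```python
import numpy as np

def select_keyframe(features):
    """Pick a representative frame from one event via a similarity graph.
    features: (n_frames, d) array of per-frame descriptors.
    Edge weights are cosine similarities; the keyframe is the node with
    the largest weighted degree. Returns the index of that frame."""
    X = np.asarray(features, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)   # unit-normalize each row
    S = Xn @ Xn.T                          # cosine-similarity adjacency
    np.fill_diagonal(S, 0.0)               # ignore self-similarity
    return int(np.argmax(S.sum(axis=1)))   # highest weighted degree
```

For instance, with frames `[[1, 0], [0.9, 0.1], [0, 1]]`, the middle frame is closest to both others and is selected.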
49. Conclusions and future work
● 0.53 Jaccard index of segmentation
● 86-88% user acceptance of our summaries
● 58% of users chose our summaries as the best option
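The 0.53 segmentation score above is a Jaccard index. As a minimal reminder of the metric, this computes the Jaccard index between two sets of frame indices, e.g. the frames of a predicted event and of its matched ground-truth event; averaging such scores over matched events gives a segmentation score of the kind reported:

```python
def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections of frame
    indices. Returns 1.0 for two empty sets by convention."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

For example, a predicted event covering frames 0-9 against a ground-truth event covering frames 5-14 overlaps on 5 of 15 frames, giving a Jaccard index of 1/3.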
50. Conclusions and future work
● Temporal information yields significant improvements
● First summary-extraction method for high-temporal-resolution image sets
51. Conclusions and future work
● Apply object detection
● Different criteria of representativeness
● Clinical application of this work