More details: https://imatge.upc.edu/web/publications/visual-summary-egocentric-photostreams-representative-keyframes
Author: Ricard Mestre
Thesis advisor: Xavier Giró-i-Nieto
This Final Degree Work addresses the problem of visually summarizing sets of images captured by an egocentric camera for lifelogging purposes. First, we group the images (which represent a day of a person’s life) into distinguishable and meaningful events, using visual features extracted with the Caffe software. Second, we describe techniques for extracting representative images through similarity graphs. Finally, we analyze the assessment scores given by different users to whom we presented the visual summaries obtained in this project. We achieve 60% favorable opinions on the quality of the visual summaries produced with the techniques developed in this project.
5. Motivation and goals
● Lifelogging with the Narrative Clip
● Up to 2000 images/day
● A visual summary can help the memory of people affected by Alzheimer’s disease
6. Motivation and goals
● Extract a visual summary of a day
○ Clustering strategy for event detection
○ Automatic selection of representative frames
8. State of the art
Chandrasekar et al., “Efficient retrieval from large-scale egocentric visual data using a sparse graph representation” (CVPR Workshop 2014)
9. State of the art
Lu and Grauman, “Story-driven summarization for egocentric video” (CVPR 2013)
12. Feature extraction
● Convolutional Neural Networks (CNNs) trained on ImageNet
Jia et al, “Caffe: Convolutional Architecture for Fast Feature Embedding” (ACM MM 2014)
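The deck uses Caffe to extract deep features from each photo. As a purely illustrative sketch (not the Caffe pipeline, and with random rather than ImageNet-trained filters), the following numpy toy shows how a convolution layer, a ReLU, and global average pooling turn an image into a fixed-length feature vector of the kind the clustering step consumes:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid 2-D convolution of a single-channel image with one kernel."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def toy_cnn_features(image, n_filters=8, seed=0):
    """One conv layer with random 3x3 filters, ReLU, then global average
    pooling. Returns an n_filters-dimensional feature vector. A trained
    CNN would use learned filters and many stacked layers instead."""
    rng = np.random.default_rng(seed)
    filters = rng.standard_normal((n_filters, 3, 3))
    feats = []
    for k in filters:
        fmap = np.maximum(conv2d_valid(image, k), 0.0)  # ReLU
        feats.append(fmap.mean())                       # global avg pooling
    return np.array(feats)
```

In the actual system, the analogous vector would come from an intermediate layer of a Caffe network pre-trained on ImageNet.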
14. Clustering
● Obtain separate events
● Agglomerative clustering
● Cutoff parameter
Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., & Radeva, P. (2015). “R-Clustering for Egocentric Video Segmentation”. In 7th Iberian Conference on Pattern Recognition and Image Analysis (accepted).
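A hedged sketch of the idea on this slide: agglomerative clustering over per-frame descriptors, with a cutoff parameter that stops merging and so fixes the number of events. This version is temporally constrained (only adjacent segments merge, which is natural for event detection); it is an illustration of the technique, not the thesis's exact algorithm:

```python
import numpy as np

def segment_events(features, cutoff):
    """Temporally constrained agglomerative clustering.
    features: (n_frames, d) array of per-frame descriptors.
    Adjacent segments are merged while the distance between their mean
    descriptors stays below `cutoff`; a larger cutoff yields fewer,
    coarser events. Returns (start, end) index pairs, end exclusive."""
    segments = [(i, i + 1) for i in range(len(features))]
    means = [features[i].astype(float) for i in range(len(features))]
    while len(segments) > 1:
        # Distance between every pair of temporally adjacent segments.
        dists = [np.linalg.norm(means[i] - means[i + 1])
                 for i in range(len(segments) - 1)]
        k = int(np.argmin(dists))
        if dists[k] > cutoff:       # cutoff reached: stop merging
            break
        s0, s1 = segments[k], segments[k + 1]
        n0, n1 = s0[1] - s0[0], s1[1] - s1[0]
        means[k] = (means[k] * n0 + means[k + 1] * n1) / (n0 + n1)
        segments[k:k + 2] = [(s0[0], s1[1])]
        del means[k + 1]
    return segments
```

For example, six frames whose descriptors form two clearly separated groups are segmented into two events:

```python
feats = np.array([[0.0], [0.1], [0.05], [5.0], [5.1], [5.05]])
segment_events(feats, cutoff=1.0)  # → [(0, 3), (3, 6)]
```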
30. Contents
● Collaboration
● Motivation and goals
● State of the art
● Methodology
● Evaluation
○ Database
○ Clustering
○ Keyframe extraction
● Conclusions and future work
31. Evaluation: Database
● 5 days
● 3 users
● 4005 images
● Ground truth available
Talavera, E., Dimiccoli, M., Bolaños, M., Aghaei, M., & Radeva, P. (2015). “R-Clustering for Egocentric Video Segmentation”. In 7th Iberian Conference on Pattern Recognition and Image Analysis (accepted).
32. Contents
● Collaboration
● Motivation and goals
● State of the art
● Methodology
● Evaluation
○ Database
○ Clustering
■ Jaccard index
■ Linkage effect
■ Relabelling effect
○ Keyframe extraction
● Conclusions and future work
36. Contents
● Collaboration
● Motivation and goals
● State of the art
● Methodology
● Evaluation
○ Database
○ Clustering
○ Keyframe extraction
■ Blind taste test
■ Representative quality of keyframe
■ Summary validations
● Conclusions and future work
38. Methodology: Blind taste test
Lu and Grauman, “Story-driven summarization for egocentric video” (CVPR 2013)
Figure: brandchannel.com
Blind taste test: quality of keyframe
48. Conclusions and future work
● New methodology taking into account visual and temporal information
● Keyframe extraction through graph-based approaches
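One way to sketch the graph-based keyframe extraction mentioned above: build a similarity graph whose nodes are an event's frames and whose edge weights are cosine similarities, then pick the frame with the largest weighted degree, i.e. the one most similar to the rest of the event. The centrality criterion here is an assumption for illustration, not necessarily the thesis's exact scoring rule:

```python
import numpy as np

def select_keyframe(features):
    """Pick a representative frame from one event via a similarity graph.
    features: (n_frames, d) array of per-frame descriptors.
    Edge weights are cosine similarities; the keyframe is the node with
    the largest weighted degree. Returns the index of that frame."""
    X = np.asarray(features, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)   # unit-normalize each row
    S = Xn @ Xn.T                          # cosine-similarity adjacency
    np.fill_diagonal(S, 0.0)               # ignore self-similarity
    return int(np.argmax(S.sum(axis=1)))   # highest weighted degree
```

For instance, with frames `[[1, 0], [0.9, 0.1], [0, 1]]`, the middle frame is closest to both others and is selected.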
49. Conclusions and future work
● 0.53 Jaccard index of segmentation
● 86-88% user acceptance of our summaries
● 58% of users chose our summaries as the best option
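The 0.53 segmentation score above is a Jaccard index. As a minimal reminder of the metric, this computes the Jaccard index between two sets of frame indices, e.g. the frames of a predicted event and of its matched ground-truth event; averaging such scores over matched events gives a segmentation score of the kind reported:

```python
def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections of frame
    indices. Returns 1.0 for two empty sets by convention."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

For example, a predicted event covering frames 0-9 against a ground-truth event covering frames 5-14 overlaps on 5 of 15 frames, giving a Jaccard index of 1/3.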
50. Conclusions and future work
● Temporal information yields significant improvements
● First summary-extraction method for high-temporal-resolution image sets
51. Conclusions and future work
● Apply object detection
● Different criteria of representativeness
● Clinical application of this work