Introduction: Context
Source: A. Oliva and A. Torralba, The role of context in object recognition
Introduction: Context
Source: A. Oliva and A. Torralba, The role of context in object recognition
Introduction: Context
Source: T. Malisiewicz and A. A. Efros, Improving spatial support for objects via multiple segment...
Related Work: Realistic scenario
Source: J. Carreira et al., Semantic segmentation with second-order pooling
Input image...
Related Work: Realistic scenario
Source: J. Carreira et al., Semantic segmentation with second-order pooling
Predict ove...
Related Work: Realistic scenario
TRAINING DATA: 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462
TEST DATA: ? 0.4905
[1] J Carrei...
Related Work: Co-clustering framework
• What are the contour elements?
view 1 view 2
LEAVES PARTITIONS Which contour el...
Related Work: Co-clustering framework
INTRA INTERACTIONS INTER INTERACTIONS
Related Work: Co-clustering framework
Related Work: Co-clustering framework
LINEAR PROGRAMMING RELAXATION
Related Work: Co-clustering framework
1 2 3 4 5
Intra: Q1,2 = -0.81
Q3,4 = -0.81, Q3,5 = -0.81, Q4,5 = -0.49
Inter: Q1,...
Related Work: Co-clustering framework
LEAVES PARTITIONS CO-CLUSTERED PARTITIONS
Related Work: Co-clustering framework
• Hierarchical constraint
PARENT NODE 11
Inter-sibling boundaries:
Intra-sibling ...
Related Work: Co-clustering framework
• Multiresolution parameterization
: Number of active contours
to encode leave co...
Related Work: Co-clustering framework
• Iterative approach
Contribution II: Resolution parameterization
Selected inter-sibling boundaries:
Contributions
• Semantic global co-clustering
1. Class assignment to regions 3. Optimization constraints
• Regions from...
Contribution VI: Automatic resolution selection
• Some applications require a single resolution
l1 l2 C1 C2 C3
l1 C1 C...
Experiments: Semantic co-clustering
Conclusions
• Multiresolution co-clustering framework for uncalibrated multiview
sequences
• Two-step architecture
• Globa...
Conclusions
• Part I: Improving spatial codification in semantic segmentation
• Figure-Border-Ground in realistic scenario...
Visual Object Analysis using Regions and Local Features

The first part of this dissertation analyzes the spatial context in semantic image segmentation. First, we review how spatial context has been tackled in the literature by local features and spatial aggregation techniques. From a discussion about whether the context is beneficial or not for object recognition, we extend a Figure-Border-Ground segmentation for local feature aggregation, previously applied with ground truth annotations, to a more realistic scenario where object proposal techniques are used instead. Whereas the Figure and Ground regions represent the object and the surround respectively, the Border is a region around the object contour, which is found to be the region with the richest contextual information for object recognition. Furthermore, we propose a new contour-based spatial aggregation technique of the local features within the object region by a division of the region into four subregions. Both contributions have been tested on a semantic segmentation benchmark with a combination of context-free and non-context-free local features, which allows the models to learn automatically whether the context is beneficial or not for each semantic category.

The second part of this dissertation addresses semantic segmentation for a set of closely related images from an uncalibrated multiview scenario. State-of-the-art semantic segmentation algorithms fail to segment the objects correctly from some viewpoints when the techniques are applied independently to each viewpoint image. The lack of large annotated datasets for multiview segmentation prevents training a model that is robust to viewpoint changes. In this second part, we exploit the spatial correlation that exists between the different viewpoint images to obtain a more robust semantic segmentation. First, we review the state-of-the-art co-clustering, co-segmentation and video segmentation techniques that aim to segment the set of images in a generic way, i.e. without considering semantics. Then, a new architecture that considers motion information and provides a multiresolution segmentation is proposed for the co-clustering framework; it outperforms state-of-the-art techniques for generic multiview segmentation. Finally, the proposed multiview segmentation is combined with the semantic segmentation results, giving a method for automatic resolution selection and a coherent semantic multiview segmentation.

  1. Visual Object Analysis using Regions and Local Features. Carles Ventura Royo. Co-advisors: Xavier Giró i Nieto, Verónica Vilaplana Besler. Tutor: Ferran Marqués Acosta
  2. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Conclusions
  3. Outline • Introduction • Part I: Context Analysis in semantic segmentation (Introduction, Related Work, Contributions, Experiments, Conclusions) • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Conclusions
  4. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation (Introduction, Related Work, Contributions, Experiments, Conclusions) • Conclusions
  5. Introduction: Semantic segmentation (figure labels: Instance segmentation, Class segmentation, boat)
  6. Introduction: Semantic segmentation • Part I: Single view • Part II: Multiview (figure labels: STATE OF THE ART, OUR RESULTS)
  7. Introduction: Visual Object Analysis • Objects vs Scene
  8. Introduction: Regions
  9. Introduction: Regions (figure: a partition with regions 1 to 10 and its BINARY PARTITION TREE with nodes 1 to 19)
  10. Introduction: Regions (figure: a partition with regions 1 to 10 and its REGION ADJACENCY GRAPH)
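A region adjacency graph links every pair of regions that share a boundary. The following is a minimal sketch, assuming the partition is given as a small 2D grid of region labels and that adjacency means 4-connectivity; the function name `region_adjacency_graph` and the toy partition are illustrative and do not come from the slides.

```python
def region_adjacency_graph(labels):
    """Return {region: set of neighbouring regions} for a 2D grid of labels."""
    rows, cols = len(labels), len(labels[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            # Look only right and down so each neighbouring pixel pair is visited once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


# Hypothetical 3x3 partition with four regions.
toy_partition = [
    [1, 1, 2],
    [1, 3, 2],
    [4, 3, 2],
]
rag = region_adjacency_graph(toy_partition)
```

In this toy partition, regions 2 and 4 never touch, so the graph has no edge between them.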
  11. Introduction: Local Features (figure labels: Local Features, Global Features)
  12. Introduction: Local Features Aggregation • Bag of Features (BoF) [1] (figure labels: vector quantization, codebook, Bag of Features) [1] G Csurka et al, Visual Categorization with Bags of Keypoints. ECCV’04
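The Bag of Features idea can be sketched in a few lines: each local descriptor is quantized to its nearest codeword, and the image (or region) is described by the histogram of codeword counts. This sketch assumes the codebook is already available (it is typically learned by clustering training descriptors); the function name `bag_of_features` and the toy data are illustrative.

```python
def bag_of_features(descriptors, codebook):
    """Quantize each descriptor to its nearest codeword and return a normalized histogram."""
    histogram = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance to every codeword.
        distances = [sum((a - b) ** 2 for a, b in zip(d, word)) for word in codebook]
        histogram[distances.index(min(distances))] += 1
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram


# Hypothetical 2D codebook with three codewords and four local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
descriptors = [(0.1, 0.2), (0.9, 1.2), (3.5, 0.1), (1.1, 0.8)]
bof = bag_of_features(descriptors, codebook)
```

The descriptor length equals the codebook size, whatever the number of local features in the image.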
  13. Introduction: Local Features Aggregation • Pooling • First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$ • Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$ • $x_i$: local features • No need of codebook • High dimensionality [1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10 [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
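The two pooling formulas can be made concrete with a small sketch. It assumes the local features are plain lists of floats with equal length and omits any further normalization that the cited methods may apply; the function names are illustrative.

```python
def first_order_average_pooling(features):
    """O1P: mean of the local features, a vector of dimension d."""
    n, dim = len(features), len(features[0])
    return [sum(x[k] for x in features) / n for k in range(dim)]


def second_order_average_pooling(features):
    """O2P: mean of the outer products x_i x_i^T, a d x d matrix."""
    n, dim = len(features), len(features[0])
    return [
        [sum(x[j] * x[k] for x in features) / n for k in range(dim)]
        for j in range(dim)
    ]


# Two hypothetical 2D local features.
features = [[1.0, 2.0], [3.0, 4.0]]
o1p = first_order_average_pooling(features)
o2p = second_order_average_pooling(features)
```

Neither descriptor needs a codebook, but O2P grows from d to d × d entries, which is the "high dimensionality" noted on the slide.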
  14. Part I: Context analysis in semantic segmentation
  16. Introduction: Context • Semantic context [1,2] • Spatial context • GOAL: Analyze the influence of the spatial context in object recognition [1] M Bar, Visual Objects in Context. Nature Reviews Neuroscience 2004 [2] A Rabinovich et al, Objects in Context. ICCV’07
  18. Related Work: Ideal scenario • Ground truth object location • Conclusion: Aggregating the local features over three region pools (interior, border and surround) increases the performance [1] [1] J.R.R. Uijlings et al., The Visual Extent of an Object. IJCV’12
  19. Related Work: Realistic scenario • Pipeline [1]: Input image → Generate object candidates → Rank object candidates → Predict class scores → Aggregate high-rank candidates → Semantic partition [1] J Carreira et al, Object Recognition as Ranking Holistic Figure-Ground Hypotheses. CVPR’10
  20. Related Work: Realistic scenario • How is each class predictor trained? [1] • TRAINING DATA with overlap scores: 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462 • An SVR is used to learn the function that predicts the overlap for each class: os = f([O2PF O2PG]), trained on the pairs ([O2PF_1 O2PG_1], os_1), ([O2PF_2 O2PG_2], os_2), …, ([O2PF_N O2PG_N], os_N) • GOAL: CHANGE SPATIAL CODIFICATION [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
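The regression target of each class predictor is the overlap between an object candidate and the ground truth object. A minimal sketch of such a score follows, assuming that overlap is measured as intersection over union of two binary masks of equal size; the function name `overlap_score` and the toy masks are illustrative.

```python
def overlap_score(candidate, ground_truth):
    """Intersection over union of two binary masks given as 2D lists of 0/1."""
    intersection = union = 0
    for row_c, row_g in zip(candidate, ground_truth):
        for c, g in zip(row_c, row_g):
            intersection += 1 if (c and g) else 0
            union += 1 if (c or g) else 0
    return intersection / union if union else 0.0


# Hypothetical 2x3 masks: the candidate covers the left part, the object the right part.
candidate_mask = [[1, 1, 0],
                  [1, 1, 0]]
ground_truth_mask = [[0, 1, 1],
                     [0, 1, 1]]
score = overlap_score(candidate_mask, ground_truth_mask)
```

Here the masks share 2 pixels out of 6 covered in total, so the score is 1/3. A regressor such as an SVR would then be fitted on (descriptor, score) pairs like the ones listed on the slide.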
  22. Contributions • Figure-Border-Ground spatial pooling in the realistic scenario • SVR: os = f([O2PF O2PB O2PG]), trained on the pairs ([O2PF_1 O2PB_1 O2PG_1], os_1), ([O2PF_2 O2PB_2 O2PG_2], os_2), …, ([O2PF_N O2PB_N O2PG_N], os_N)
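The descriptor [O2PF O2PB O2PG] is the concatenation of one second-order pooled descriptor per region pool. The sketch below assumes each local feature already carries a pool label ("F", "B" or "G"), leaving aside how the Border band is delimited, and assumes an empty pool contributes a zero vector; both choices and all names are illustrative, not taken from the slides.

```python
def o2p_flat(features, dim):
    """Flattened second-order average pooling; zeros when the pool is empty (assumption)."""
    if not features:
        return [0.0] * (dim * dim)
    n = len(features)
    return [sum(x[j] * x[k] for x in features) / n
            for j in range(dim) for k in range(dim)]


def fbg_descriptor(features, pools, dim):
    """Concatenate the O2P descriptors of the Figure, Border and Ground pools."""
    descriptor = []
    for name in ("F", "B", "G"):
        members = [x for x, p in zip(features, pools) if p == name]
        descriptor.extend(o2p_flat(members, dim))
    return descriptor


# Four hypothetical 2D local features with their pool labels.
features = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [2.0, 2.0]]
pools = ["F", "B", "G", "G"]
descriptor = fbg_descriptor(features, pools, dim=2)
```

Adding the Border pool lengthens the descriptor from 2·d² to 3·d² entries compared with Figure-Ground pooling.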
  23. Contributions • Contour-based spatial pyramid [1]: crown-based • SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4]), trained on the pairs ([O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1], os_1), ([O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2], os_2), …, ([O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N], os_N) [1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06
  24. Contributions • Contour-based spatial pyramid [1]: Cartesian-based • SVR: os = f([O2PF O2PSR1 O2PSR2 O2PSR3 O2PSR4]), trained on the pairs ([O2PF_1 O2PSR1_1 O2PSR2_1 O2PSR3_1 O2PSR4_1], os_1), ([O2PF_2 O2PSR1_2 O2PSR2_2 O2PSR3_2 O2PSR4_2], os_2), …, ([O2PF_N O2PSR1_N O2PSR2_N O2PSR3_N O2PSR4_N], os_N) [1] S Lazebnik et al, Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR’06
  26. Experiments • Pascal VOC segmentation challenge 2011 & 2012 [1] • Train, validation and test subsets • Train: 1,112 (2011) / 1,464 (2012) • Validation: 1,111 (2011) / 1,449 (2012) • Test: 1,111 (2011) / 1,456 (2012) • 20 semantic classes: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor • Evaluation measure: Average Accuracy Classification (AAC) [1] M Everingham et al, The PASCAL Visual Object Classes (VOC) Challenge. IJCV’10
  27. Experiments: Local Features Aggregation • Pooling • First Order Average Pooling (O1P) [1]: $\frac{1}{N}\sum_{i=1}^{N} x_i$ • Second Order Average Pooling (O2P) [2]: $\frac{1}{N}\sum_{i=1}^{N} x_i x_i^T$ • $x_i$: local features • No need of codebook • High dimensionality [1] Y Boureau et al, A Theoretical Analysis of Feature Pooling in Visual Recognition. ICML’10 [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  28. Experiments • Ideal scenario • Train set: train11 • Test set: val11
      | Features | F [1] | F-B | F-G [1] | F-B-G |
      |---|---|---|---|---|
      | eSIFT [1] | 63.9 | 66.2 | 66.4 | 68.6 |
      | eMSIFT [1] | 64.8 | 68.9 | 67.7 | 70.8 |
      [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  29. Experiments • Ideal scenario • Train set: train11 • Test set: val11
      | Spatial pyramid (SP) | F [1] | F-B | F-B-G |
      |---|---|---|---|
      | Non SP | 64.8 | 68.9 | 70.8 |
      | Crown-based SP | 68.7 | 71.1 | 71.7 |
      | Cartesian-based SP | 67.7 | 71.6 | 72.7 |
      [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  30. Experiments • Ideal scenario • Train set: train11 • Test set: val11
      | Figure | SP (Figure) | Border | Ground | AAC |
      |---|---|---|---|---|
      | eSIFT+eMSIFT+eLBP | | | eSIFT | 72.98 [1] |
      | eSIFT+eMSIFT | | eSIFT+eMSIFT | eSIFT+eMSIFT | 73.84 |
      | eSIFT+eMSIFT+eLBP | eMSIFT | eSIFT+eMSIFT | eSIFT+eMSIFT | 75.86 |
      [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  31. Experiments • Realistic scenario (CPMC [1]) • Train set: train11 • Test set: val11
      | Figure | SP (Figure) | Border | Ground | AAC |
      |---|---|---|---|---|
      | eSIFT | | | eSIFT | 28.6 [2] |
      | eSIFT | | eSIFT | eSIFT | 34.8 |
      | eSIFT+eMSIFT+eLBP | | | eSIFT | 37.2 [2] |
      | eSIFT | eSIFT | eSIFT | eSIFT | 37.4 |
      | eSIFT+eMSIFT+eLBP | eSIFT | eSIFT | eSIFT | 39.6 |
      [1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10 [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  32. Experiments • Realistic scenario (CPMC [1]) • Train set: trainval11/12 • Test set: test11/12
      | Dataset | F-G [2] | F-B-G | SP(F)-B-G |
      |---|---|---|---|
      | VOC11 | 38.8 | 43.8 | 40.3 |
      | VOC12 | 39.9 | 42.2 | 40.8 |
      [1] J Carreira et al, Constrained parametric min-cuts for automatic object segmentation. CVPR’10 [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  33. Experiments • Realistic scenario (MCG [1]) • Train set: train11 • Test set: val11
      | Object candidates | F-G [2] | F-B-G | SP(F)-B-G |
      |---|---|---|---|
      | CPMC | 37.2 | 38.9 | 39.6 |
      | MCG | 30.9 | 34.1 | 36.1 |
      [1] P Arbeláez et al, Multiscale combinatorial grouping. CVPR’14 [2] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  34. Experiments: Qualitative evaluation (figure: columns F-G, F-B-G, F-G, F-B-G; class labels shown: aeroplane, bicycle, bicycle, cat, bird, motorbike, boat, bottle, bus, bus, motorbike, car, chair, cat, chair, chair, horse, bird, cow)
  35. Experiments: Qualitative evaluation (figure: columns F-G, F-B-G, F-G, F-B-G; class labels shown: chair, diningtable, cow, dog, person, horse, person, motorbike, motorbike, motorbike, person, pottedplant, bottle, sheep, sofa, cat, bus, train, train, tvmonitor)
  37. Conclusions • Figure-Border-Ground spatial pooling improves the original Figure-Ground pooling in both ideal and realistic scenarios • The Border region pool carries the richest contextual information • The Cartesian-based spatial pyramid outperforms the crown-based spatial pyramid, but both of them may result in overfitting • Both Figure-Border-Ground pooling and Cartesian-based spatial pyramid have been validated with MCG object candidates • Published in ICIP’15
  38. Part II: Multiresolution co-clustering for uncalibrated multiview segmentation
  40. Introduction (figure labels: STATE OF THE ART, OUR RESULTS)
  41. Introduction • First goal: improving generic segmentation • Motion-based region adjacency graph • New resolution parameterization • Relaxing hierarchical constraints with a two-step architecture • Practical framework for a global optimization • Second goal: improving semantic segmentation • Semantic-based generic segmentation • Automatic resolution selection technique • Generic segmentation based semantic segmentation
  42. Introduction • Co-segmentation • Video segmentation • Co-clustering
  44. Related Work: Co-clustering framework [1,2] • Objective: Find the clusters that define the coherent regions across the different views at multiple resolutions (figure labels: INPUT IMAGES, HIERARCHIES, LEAVES PARTITIONS, CO-CLUSTERED PARTITIONS) [1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11 [2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15
  45. Related Work: Co-clustering framework [1,2] • Objective: Find the clusters that define the coherent regions across the different views (figure: view 1 and view 2, LEAVES PARTITIONS and CO-CLUSTERED PARTITIONS) [1] D Glasner et al, Contour-based joint clustering of multiple segmentations. CVPR’11 [2] D Varas et al, Multiresolution hierarchy co-clustering for semantic segmentation in sequences with small variations. ICCV’15
  46. Related Work: Co-clustering framework • Representation with boundary variables • Intra-image boundary variables: D1,2 = 0, D1,3 = 1, D2,3 = 1, D4,5 = 0, D5,6 = 1 • Inter-image boundary variables: D1,4 = 0, D1,5 = 0, D2,4 = 0, D2,5 = 0, D3,6 = 0 (figure: view 1 and view 2, LEAVES PARTITIONS and CO-CLUSTERED PARTITIONS)
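The boundary variable values on this slide are consistent with a co-clustering that groups regions {1, 2, 4, 5} and {3, 6}, reading D = 0 as "same cluster" and D = 1 as "boundary kept". The sketch below makes that reading explicit; the cluster assignment is inferred from the listed values, and the function name `boundary_variables` is illustrative.

```python
def boundary_variables(cluster_of, pairs):
    """D[i, j] = 1 if regions i and j fall in different clusters, 0 otherwise."""
    return {(i, j): int(cluster_of[i] != cluster_of[j]) for i, j in pairs}


# Regions 1-3 belong to view 1 and regions 4-6 to view 2 (as on the slide).
cluster_of = {1: "a", 2: "a", 4: "a", 5: "a", 3: "b", 6: "b"}  # inferred assignment
intra_pairs = [(1, 2), (1, 3), (2, 3), (4, 5), (5, 6)]
inter_pairs = [(1, 4), (1, 5), (2, 4), (2, 5), (3, 6)]
D = boundary_variables(cluster_of, intra_pairs + inter_pairs)
```

Any valid clustering yields transitively consistent variables: if D1,2 = 0 and D2,4 = 0, then D1,4 must also be 0.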
  47. Related Work: Co-clustering framework • How are the values of the boundary variables chosen? • INTRA INTERACTIONS: Q1,2, Q1,3, Q2,3, Q4,5, Q5,6 • INTER INTERACTIONS: Q1,4, Q1,5, Q2,4, Q2,5, Q3,6 (figure: view 1 and view 2, LEAVES PARTITIONS)
  48. Related Work: Co-clustering framework • Hierarchical constraint • Co-clustered partitions cannot violate the hierarchical structures (figure: view 1 and view 2, regions 1 to 6)
  49. Related Work: Co-clustering framework • Hierarchical constraint • Co-clustered partitions cannot violate the hierarchical structures (figure: view 1 and view 2, regions 1 to 6)
  50. Related Work: Co-clustering framework • Multiresolution parameterization (figure: view 1 and view 2, LEAVES PARTITIONS, …)
  51. Related Work: Co-clustering framework • Iterative approach
  53. Contribution I: Motion-based adjacency (figure: View #i, View #i-1)
  54. Contribution I: Motion-based adjacency • Similarity computation • RAG definition (figure: View #i, View #i-1)
  55. Contribution II: Resolution parameterization • Original parameterization • Proposed parameterization = ??? = 2 (figure: view 1 and view 2, LEAVES PARTITIONS, …)
  56. Contribution III: Two-step iterative architecture • Hierarchical constraints are not imposed in the second step
  57. Contribution III: Two-step iterative architecture • First step • Second step
  58. Contribution III: Two-step iterative architecture
  59. Contribution IV: Generic global co-clustering • All co-clustered partitions resulting from the iterative architecture are fed into a global optimization • The reduction in the number of regions makes the global optimization feasible
  60. Contribution V: Semantic global co-clustering • Semantic information is introduced in the global optimization
  61. Contribution V: Semantic global co-clustering (figure labels: GENERIC CO-CLUSTERING, SEMANTIC SEGMENTATIONS, SEMANTIC CO-CLUSTERING)
  62. Contribution VI: Automatic resolution selection • We propose a method that automatically selects the resolution that best fits the semantic information (figure labels: view 1, view 2, LEAVES PARTITIONS, MULTIRESOLUTION CO-CLUSTERING, SEMANTIC PARTITIONS, SINGLE RESOLUTION CO-CLUSTERING)
  63. Contribution VII: Coherent semantic partitions (figure labels: view 1, view 2, LEAVES PARTITIONS, SEMANTIC PARTITIONS, SINGLE RESOLUTION CO-CLUSTERING, COHERENT SEMANTIC PARTITIONS)
  64. Contribution VII: Coherent semantic partitions (figure labels: STATE OF THE ART [1], OUR RESULTS) [1] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
  66. Experiments: Dataset • Multiview dataset [1] [1] A. Kowdle et al, Multiple view object cosegmentation using appearance and stereo cues. ECCV’12
  67. Experiments: Generic co-clustering • Co-segmentation techniques • Video segmentation techniques • Co-clustering techniques • I-1S: Motion-compensated one-step iterative (baseline) • I-2S: Two-step iterative • UCM+I-1S: First step is replaced by a cut from a hierarchical segmentation algorithm • I-2S+GG: Two-step iterative followed by generic global optimization
  68. Experiments: Generic co-clustering
      | Sequence | I-2S | UCM+I-1S | I-2S+GG | [KX12] | [JBP12] | [XXC12] | [GKHE10] | [GCS13] | UCM+Pr | I-1S |
      |---|---|---|---|---|---|---|---|---|---|---|
      | BMW | 0.72 | 0.68 | 0.70 | 0.42 | 0.56 | 0.70 | 0.65 | 0.63 | 0.62 | 0.67 |
      | Chair | 0.79 | 0.77 | 0.76 | 0.53 | 0.78 | 0.80 | 0.76 | 0.47 | 0.59 | 0.78 |
      | Couch | 0.93 | 0.95 | 0.94 | 0.78 | 0.90 | 0.85 | 0.88 | 0.73 | 0.89 | 0.90 |
      | GardenChair | 0.84 | 0.63 | 0.87 | 0.31 | 0.52 | 0.70 | 0.68 | 0.63 | 0.84 | 0.80 |
      | Motorbike | 0.76 | 0.77 | 0.77 | 0.39 | 0.39 | 0.71 | 0.73 | 0.46 | 0.54 | 0.70 |
      | Teddy | 0.92 | 0.92 | 0.92 | 0.69 | 0.87 | 0.88 | 0.84 | 0.85 | 0.82 | 0.90 |
      | Average | 0.83 | 0.79 | 0.83 | 0.52 | 0.67 | 0.77 | 0.76 | 0.63 | 0.72 | 0.79 |
      Column groups shown on the slide: CO-CLUSTERING, CO-SEGMENTATION, VIDEO SEGMENTATION, BASELINES
      • Two-step iterative co-clustering techniques (I-2S and I-2S+GG) outperform other state-of-the-art techniques
  69. 69. Experiments: Semantic co-clustering 69 Co-clustering techniques • I-2S+GG(MR): Multiresolution global generic co-clustering • I-2S+SG(MR): Multiresolution global semantic co-clustering • I-2S+GG(SR): Single resolution global generic co-clustering • I-2S+SG(SR): Single resolution global semantic co-clustering Semantic segmentation techniques • SCSS: Semantic co-clustering based semantic segmentation • GCSS: Generic co-clustering based semantic segmentation • [ZJRP+15]: state-of-the-art [ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
  70. 70. Experiments: Qualitative assessment 70
  71. 71. Experiments: Qualitative assessment 71
  72. 72. Experiments: Qualitative assessment 72 leaves partition I-2S I-2S+GG I-2S+SG SCSS [ZJRP+15] [ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15
  73. 73. Experiments: Qualitative assessment 73 leaves partition I-2S I-2S+GG I-2S+SG SCSS [ZJRP+15] S Zheng et al, Conditional Random Fields as Recurrent Neural Networks. ICCV’15 [ZJRP+15]
  74. 74. Experiments: Qualitative assessment 74 Occlusion/Object Boundary Detection Dataset [GVB11]Ballet and Breakdancers datasets [ZKU+04]
  75. 75. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions 75
  76. 76. Conclusions • The use of motion cues significantly improved the performance • The new resolution parameterization allowed us to have a more uniform distribution of resolutions • The two-step architecture improved the performance of the original one- step architecture • Although global optimization is now feasible, there is no clear gain for generic co-clustering. However, it is useful for semantic co-clustering. • A small decrease in performance is achieved as a result of applying the resolution selection technique • Submitted to ECCV’16 (waiting decision) 76
  77. Future Work • Extending experiments to video datasets • VSB100 (Video Segmentation Benchmark) [1] • Cityscapes [2] • Extending experiments to calibrated scenarios • Training end-to-end CNNs for multiview semantic segmentation. [1] F Galasso et al, A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis. ICCV’13 [2] M Cordts et al, The Cityscapes Dataset for Semantic Urban Scene Understanding. CVPR’16
  78. Outline • Introduction • Part I: Context Analysis in semantic segmentation • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Introduction • Related Work • Contributions • Experiments • Conclusions • Conclusions
  79. Conclusions • The results achieved in the first part by considering new spatial configurations have been made obsolete by the outstanding results of deep learning techniques • Results from deep learning techniques were used in the second part • The proposed multiresolution co-clustering improved on state-of-the-art results, but an end-to-end deep learning approach should be considered to achieve a more significant improvement • Semantic segmentation techniques evolve very quickly, making this field highly competitive and challenging
  80. Publications • Related to the thesis: • C. Ventura, D. Varas, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Semantically driven multiresolution co-clustering for uncalibrated multiview segmentation. Submitted to the European Conference on Computer Vision (ECCV) 2016. Under review. • C. Ventura, X. Giro-i-Nieto, V. Vilaplana, K. McGuinness, F. Marques, N. E. O'Connor. Improving spatial codification in semantic segmentation. International Conference on Image Processing (ICIP) 2015. • C. Ventura. Visual object analysis using regions and interest points. ACM International Conference on Multimedia 2013.
  81. Publications • Other publications: • K. McGuinness, E. Mohedano, Z. Zhang, F. Hu, R. Albatal, C. Gurrin, N. E. O'Connor, A. F. Smeaton, A. Salvador, X. Giro-i-Nieto, C. Ventura. Insight Centre for Data Analytics (DCU) at TRECVid 2014: instance search and semantic indexing tasks. TRECVID Workshop 2014. • C. Ventura, V. Vilaplana, X. Giro-i-Nieto, F. Marques. Improving retrieval accuracy of Hierarchical Cellular Trees for generic metric spaces. Multimedia Tools and Applications, 2014. • C. Ventura, X. Giro-i-Nieto, V. Vilaplana, D. Giribet, E. Carasusan. Automatic keyframe selection based on mutual reinforcement algorithm. International Workshop on Content-Based Multimedia Indexing (CBMI) 2013. • C. Ventura, M. Tella-Amo, X. Giro-i-Nieto. UPC at MediaEval 2013 Hyperlinking Task. MediaEval 2013. • C. Ventura, M. Martos, X. Giro-i-Nieto, V. Vilaplana, F. Marques. Hierarchical navigation and visual search for video keyframe retrieval. International Conference on Multimedia Modeling 2012.
  83. Introduction: Context. Source: A. Oliva and A. Torralba, The role of context in object recognition
  84. Introduction: Context. Source: A. Oliva and A. Torralba, The role of context in object recognition
  85. Introduction: Context. Source: T. Malisiewicz and A. A. Efros, Improving spatial support for objects via multiple segmentations
  86. Related Work: Realistic scenario. Source: J. Carreira et al., Semantic segmentation with second-order pooling • Input image • Object segment hypotheses • Ranked object segment hypotheses (class independent) • Object plausibility score
  87. Related Work: Realistic scenario. Source: J. Carreira et al., Semantic segmentation with second-order pooling • Predict the overlap estimate of each segment with each object class and sort segments by maximal score • Aggregate high-rank segments
  88. Related Work: Realistic scenario (figure: TRAINING DATA and TEST DATA panels with values 0.8179, 0.6861, 0.9013, 0.7381, 0.7105, 0.6462, ?, 0.4905). [1] J Carreira et al, Semantic segmentation with second-order pooling. ECCV’12
  89. Related Work: Co-clustering framework • What are the contour elements? (figure: view 1, view 2, LEAVES PARTITIONS) Which contour elements are considered to compute Q1,4? • Contour elements of R1 • Contour elements of R4
  90. Related Work: Co-clustering framework • INTRA INTERACTIONS • INTER INTERACTIONS
  91. Related Work: Co-clustering framework
  92. Related Work: Co-clustering framework • LINEAR PROGRAMMING RELAXATION
  93. Related Work: Co-clustering framework (figure: regions 1 to 5; labels x 0, x 0, x 1). Intra: Q1,2 = -0.81, Q3,4 = -0.81, Q3,5 = -0.81, Q4,5 = -0.49. Inter: Q1,3 = 2.81e+03, Q1,4 = -1.36e+03, Q1,5 = -1.45e+03, Q2,3 = -2.81e+03, Q2,4 = 1.36e+03, Q2,5 = 1.45e+03. Q4,5 = -0.49: D4,5 = 1? The constraint D4,5 ≤ D4,2 + D2,5 with D4,2 = 0 and D2,5 = 0 gives D4,5 = 0
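The triangle-inequality check on slide 93 can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `triangle_violations`, the dictionary layout and the reading of D (1 = the two regions stay separate, 0 = they are merged) are assumptions made only for this example. It uses the three values quoted on the slide for regions 2, 4 and 5.

```python
from itertools import combinations

def triangle_violations(D, nodes):
    """List (i, j, k) with D[i,j] > D[i,k] + D[k,j], for i < j.

    D maps an unordered pair of region ids, stored as (min, max), to 0 or 1.
    Assumption: 1 = boundary kept (regions separate), 0 = regions merged.
    """
    def d(a, b):
        return D[(min(a, b), max(a, b))]

    violations = []
    for i, j in combinations(sorted(nodes), 2):
        for k in sorted(nodes):
            if k in (i, j):
                continue
            # Transitivity: if i~k and k~j are merged, i~j must be merged too.
            if d(i, j) > d(i, k) + d(k, j):
                violations.append((i, j, k))
    return violations

# Regions 2, 4 and 5 with the values quoted on the slide.
tentative = {(2, 4): 0, (2, 5): 0, (4, 5): 1}   # D4,5 = 1 taken in isolation
consistent = {(2, 4): 0, (2, 5): 0, (4, 5): 0}  # D4,5 forced to 0

print(triangle_violations(tentative, [2, 4, 5]))
print(triangle_violations(consistent, [2, 4, 5]))
```

The first call reports the triple (4, 5, 2), which is the inconsistency the slide points at; the second call reports none.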
  94. Related Work: Co-clustering framework • LEAVES PARTITIONS • CO-CLUSTERED PARTITIONS
  95. Related Work: Co-clustering framework • Hierarchical constraint (figure: PARENT NODE 11) Inter-sibling boundaries: Intra-sibling boundaries:
  96. Related Work: Co-clustering framework • Multiresolution parameterization • Number of active contours to encode leaves contours = 9 • Maximum fraction to describe the r-th coarse level = 0.5 • Maximum difference between consecutive levels = 0.1 (figure values: 4.5, 3.6)
  97. Related Work: Co-clustering framework • Iterative approach
  98. Contribution II: Resolution parameterization • Selected inter-sibling boundaries:
  99. Contributions • Semantic global co-clustering 1. Class assignment to regions 2. Similarity penalizations • Regions from same partition with different classes 3. Optimization constraints • Regions from same partition with same class • Regions from different partitions with different class
  100. Contribution VI: Automatic resolution selection • Some applications require a single resolution (figure: l1 l2 C1 C2 C3 l1 C1 C2U l2 C2 C2 l1 or l2 ? l1)
  101. Experiments: Semantic co-clustering
  102. Conclusions • Multiresolution co-clustering framework for uncalibrated multiview sequences • Two-step architecture • Global optimization • Semantic-based co-clustering with resolution selection • Submitted to ECCV’16 (awaiting decision)
  103. Conclusions • Part I: Improving spatial codification in semantic segmentation • Figure-Border-Ground in realistic scenario • Contour-based spatial pyramid • Part II: Multiresolution co-clustering for uncalibrated multiview segmentation • Results from Part I are replaced by SoA deep learning techniques • Generic co-clustering for multiview sequences • Semantic co-clustering for multiview sequences

The first part of this dissertation focuses on an analysis of the spatial context in semantic image segmentation. First, we review how spatial context has been tackled in the literature by local features and spatial aggregation techniques. From a discussion about whether context is beneficial or not for object recognition, we extend a Figure-Border-Ground segmentation for local feature aggregation, originally based on ground truth annotations, to a more realistic scenario where object proposal techniques are used instead. Whereas the Figure and Ground regions represent the object and its surround respectively, the Border is a region around the object contour, which is found to be the region with the richest contextual information for object recognition. Furthermore, we propose a new contour-based spatial aggregation technique that pools the local features within the object region by dividing the region into four subregions. Both contributions have been tested on a semantic segmentation benchmark with a combination of context-free and non-context-free local features, which allows the models to learn automatically whether context is beneficial or not for each semantic category.

The second part of this dissertation addresses semantic segmentation for a set of closely related images from an uncalibrated multiview scenario. State-of-the-art semantic segmentation algorithms fail to segment the objects correctly from some viewpoints when they are applied independently to each viewpoint image. The lack of large annotated datasets for multiview segmentation makes it impossible to train a model that is robust to viewpoint changes. In this second part, we exploit the spatial correlation that exists between the images from different viewpoints to obtain a more robust semantic segmentation. First, we review the state-of-the-art co-clustering, co-segmentation and video segmentation techniques that aim to segment the set of images in a generic way, i.e. without considering semantics. Then, a new architecture that considers motion information and provides a multiresolution segmentation is proposed for the co-clustering framework; it outperforms state-of-the-art techniques for generic multiview segmentation. Finally, the proposed multiview segmentation is combined with the semantic segmentation results, giving a method for automatic resolution selection and a coherent semantic multiview segmentation.
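The Figure-Border-Ground split described in the first part can be sketched as follows. This is a minimal illustration, not the thesis code. It assumes that the Border is a band of fixed width on both sides of the object contour, that the Figure is the object interior outside that band, and that the Ground is everything else. The helper names `dilate` and `figure_border_ground` and the parameter `border_width` are hypothetical.

```python
import numpy as np

def dilate(mask, iterations=1):
    """4-connected binary dilation implemented with array shifts."""
    out = mask.astype(bool).copy()
    for _ in range(iterations):
        grown = out.copy()
        grown[1:, :] |= out[:-1, :]   # spread downwards
        grown[:-1, :] |= out[1:, :]   # spread upwards
        grown[:, 1:] |= out[:, :-1]   # spread to the right
        grown[:, :-1] |= out[:, 1:]   # spread to the left
        out = grown
    return out

def figure_border_ground(object_mask, border_width=1):
    """Split the image into Figure, Border and Ground masks.

    Assumption: the Border is the band within `border_width` pixels of the
    object contour, on both the inside and the outside of the object.
    """
    obj = object_mask.astype(bool)
    grown = dilate(obj, border_width)
    shrunk = ~dilate(~obj, border_width)  # erosion as dilation of the complement
    figure = shrunk
    border = grown & ~shrunk
    ground = ~grown
    return figure, border, ground

# Toy object: a 5x5 square inside a 9x9 image.
mask = np.zeros((9, 9), dtype=bool)
mask[2:7, 2:7] = True
figure, border, ground = figure_border_ground(mask, border_width=1)
print(figure.sum(), border.sum(), ground.sum())
```

Local features falling in each of the three masks could then be pooled separately, so that a classifier can weight object, contour and surround evidence independently.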
