4. Data
“make use of the best ally we have: the
unreasonable effectiveness of data.”
Alon Halevy, Peter Norvig, and Fernando Pereira, The unreasonable effectiveness of data. IEEE
Intelligent Systems, 24(2), 8-12. 2009
5. Effectiveness of data in deep learning
[Figure: object detection performance on MSCOCO and PASCAL VOC 2007]
Sun C, Shrivastava A, Singh S, Gupta A. Revisiting unreasonable effectiveness of data in deep learning
era. In Computer Vision (ICCV), 2017 IEEE International Conference on, 2017 Oct 22 (pp. 843-852).
IEEE. Image from arXiv preprint version.
6. Why is data useful?
“… perhaps when it comes to natural
language processing … will never have
the elegance of physical equations…”
Alon Halevy, Peter Norvig, and Fernando Pereira, The unreasonable effectiveness of data. IEEE
Intelligent Systems, 24(2), 8-12. 2009
7. Using data
• Learn the limitations of your data
• Understand how data is acquired
• Identify where the mathematical elegance
becomes impractical
• Domain knowledge
11. Multi-view Geometry
[Figure: view C1]
Hotel images are in the public domain. Modified to simulate 3D rotation.
12. Multi-view Geometry
[Figure: views C1 and C2]
13. Multi-view Geometry
[Figure: views C1 and C2]
How did the camera move?
14. Multi-view Geometry
Drone image is from Parrot. Reproduced for educational purposes.
15. Multi-view Geometry
Car image is CC0.
18. Multi-view Geometry
[Figure: views C1 and C2]
How did the camera move?
19. Multi-view Geometry
[Figure: views C1 and C2]
Find corresponding points and triangulate!
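Once corresponding points are found, triangulation recovers the 3D point seen in both views. A minimal sketch of linear (DLT) triangulation, assuming the 3x4 projection matrices of C1 and C2 are already known. The names `triangulate_point` and `project` and the toy cameras are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a single correspondence.

    P1, P2: 3x4 projection matrices of the two views.
    x1, x2: (u, v) image coordinates of the same point in each view.
    Returns the 3D point in non-homogeneous coordinates.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with projection matrix P to (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Toy setup (assumed): C1 at the origin, C2 shifted one unit along x,
# both with identity rotation and identity intrinsics.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
X_est = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noise-free correspondences the estimate matches the true point. Answering "How did the camera move?" additionally requires estimating the relative pose, which this sketch takes as given.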
22. Interest Points
Best tool for matching points across images.
SIFT (Lowe, ICCV’99) started the trend: ~68k citations.
23. LIFT: Learned Invariant Feature Transform
[Figure: LIFT pipeline. DET (score map, softargmax) → Crop → ORI → Rot → DESC → description vector]
Y. Verdie, K.M. Yi, P. Fua, V. Lepetit: "TILDE: A Temporally Invariant Learned DEtector", CVPR 2015.
K.M. Yi, Y. Verdie, V. Lepetit, P. Fua: "Learning to Assign Orientations to Feature Points", CVPR 2016 (Oral).
K.M. Yi, E. Trulls, V. Lepetit, P. Fua: "LIFT: Learned Invariant Feature Transform", ECCV 2016 (Spotlight).
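The pipeline figure turns the detector's score map into a keypoint location with a softargmax. A minimal sketch of a 2D softargmax, assuming a single-channel score map. The function name and the sharpness parameter `beta` are illustrative assumptions, not values from the LIFT paper.

```python
import numpy as np

def softargmax_2d(score_map, beta=10.0):
    """Differentiable stand-in for argmax over a 2D score map.

    Returns the softmax-weighted mean (x, y) location. Larger beta
    (assumed parameter) pushes the result towards the hard argmax.
    """
    h, w = score_map.shape
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(beta * (score_map - score_map.max()))
    weights /= weights.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    return float((weights * xs).sum()), float((weights * ys).sum())

# Toy score map with a single strong peak at column 3, row 1.
score_map = np.zeros((5, 5))
score_map[1, 3] = 5.0
x, y = softargmax_2d(score_map)
```

Unlike a hard argmax, the output is a smooth function of the scores, so gradients can flow back into the detector.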
24. Quantitative results
[Figure: bar charts of average matching score on the ‘Strecha’, ‘DTU’, and ‘Webcam’ datasets.
Methods compared: SIFT, SURF, ORB, Daisy, sGLOH, MROGH, LIOP, BiCE, BRISK, FREAK, VGG, DeepDesc, PN-Net, KAZE, LIFT (pic), LIFT (rf).
LIFT (pic): LIFT with the ‘pic’ dataset. LIFT (rf): LIFT with the ‘rf’ dataset.]
• Best performance on all datasets, with either ‘pic’ or ‘rf’.
• Surprising? SIFT remains #3 overall (#1: ours, #2: VGG).
30. TL;DR
• End-to-end pipeline for local feature matching
• Learning with non-differentiable components within Deep Learning
• Tighter formulation → better performance
31. TL;DR
Beyond?
34. Image Matching: Local Features and Beyond
https://image-matching-workshop.github.io
Vassileios Balntas (Scape), Vincent Lepetit (U. Bordeaux), Johannes Schönberger (Microsoft), Eduard
Trulls (Google), Kwang Moo Yi (U. Victoria)
38. The phototourism challenge: Data
● 25k images in total for training.
● “Quasi” ground truth data is generated by
performing SfM with COLMAP with all
images.
○ Assumption: Images registered in
COLMAP are accurate given enough
images.
● Valid pairs are generated via simple visibility
check.
39. The phototourism challenge: Data
● 4k images in total for testing.
● Random bags of images are
subsampled to form test subsets
(size: 3, 5, 10, 25).
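The bag subsampling above can be sketched as follows. This is an illustration only: the function name, the number of bags per size, and the image identifiers are assumptions, and the challenge's actual sampling procedure may differ.

```python
import random

def sample_bags(image_ids, bag_sizes=(3, 5, 10, 25), bags_per_size=2, seed=0):
    """Draw random bags of images, without replacement within a bag.

    bags_per_size and seed are hypothetical parameters for this sketch.
    Returns a dict mapping bag size to a list of bags (sorted id lists).
    """
    rng = random.Random(seed)
    return {
        size: [sorted(rng.sample(image_ids, size)) for _ in range(bags_per_size)]
        for size in bag_sizes
    }

# Hypothetical image identifiers standing in for the test images.
image_ids = [f"img_{i:04d}" for i in range(100)]
bags = sample_bags(image_ids)
```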
40. The phototourism challenge: local features
● Submission: Features
● IMW evaluates them via a typical
stereo/SfM pipeline
○ Nearest neighbor matching
○ 1-to-1 matching
○ RANSAC_F
○ COLMAP
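The first two pipeline steps can be sketched as below: nearest-neighbour matching on descriptors, followed by a 1-to-1 filter. Reading "1-to-1 matching" as a mutual nearest-neighbour check is an assumption, and `mutual_nn_matches` is an illustrative name, not the workshop's code. RANSAC_F and COLMAP are not covered here.

```python
import numpy as np

def mutual_nn_matches(desc1, desc2):
    """Match descriptors by nearest neighbour, keeping only mutual matches.

    desc1: (N1, D) array, desc2: (N2, D) array.
    Returns an (M, 2) array of index pairs (i in desc1, j in desc2).
    """
    # Pairwise Euclidean distances, shape (N1, N2).
    dist = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    nn12 = dist.argmin(axis=1)  # best match in desc2 for each row of desc1
    nn21 = dist.argmin(axis=0)  # best match in desc1 for each row of desc2
    idx1 = np.arange(len(desc1))
    # Keep i -> j only if j's own nearest neighbour is i.
    keep = nn21[nn12] == idx1
    return np.stack([idx1[keep], nn12[keep]], axis=1)

# Toy descriptors: desc2 holds slightly perturbed copies of rows 2, 0, 3 of desc1.
desc1 = np.eye(4)
desc2 = np.eye(4)[[2, 0, 3]] + 0.01
matches = mutual_nn_matches(desc1, desc2)
```

Row 1 of `desc1` has no counterpart in `desc2`, so the mutual check drops it.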
41. The phototourism challenge: matches
● Submission: Features + Matches
● IMW evaluates them via a typical
stereo/SfM pipeline
○ Nearest neighbor matching
○ 1-to-1 matching
○ RANSAC_F
○ COLMAP
42. The phototourism challenge: poses
● Submission: Poses
● IMW evaluates them via a typical
stereo/SfM pipeline
○ Nearest neighbor matching
○ 1-to-1 matching
○ RANSAC_F
○ COLMAP
43. Improving with descriptors (multi-view task)
[Figure: results chart for the multi-view task, annotated with improvements of +12%, +23%, +26%, +28%, +30%, +32%]
Full results: https://image-matching-workshop.github.io/leaderboard
44. Improving with matching (multi-view task)
[Figure: results chart for the multi-view task, annotated with improvements of +11%, +37%, +14%, +35%]
SuperPoint: Self-Supervised Interest Point Detection and Description. DeTone et al., 2018.
ContextDesc: Local Descriptor Augmentation with Cross-Modality Context. Luo et al., CVPR'19
Learning to Find Good Correspondences. Yi et al., CVPR'18
45. End-to-end pipelines
SuperPoint: Self-Supervised Interest Point Detection and Description. DeTone et al., 2018.
D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. Dusmanu et al., CVPR'19
58. MIST: Multiple Instance Spatial Transformer Networks
[Angles et al., arXiv, 2019]
Learning to localize & understand is easy when there is
only a single instance of the object in the scene
74. Linearized Multi-Sampling
[Jiang et al., arXiv, 2019]
[Figure: gradients w.r.t. crop location, bilinear sampling vs. our method]
Visualization of gradients w.r.t. crop location.
The gradients should point towards the centre.
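For reference, a minimal sketch of plain bilinear sampling and its analytic gradient with respect to the sampling location. It shows that this gradient is built only from the four pixels surrounding the location. The function name is an illustrative assumption, and this is the bilinear baseline, not the linearized multi-sampling method.

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Bilinearly sample img at a real-valued location (x, y).

    Returns (value, d_value/dx, d_value/dy). Assumes (x, y) lies strictly
    inside the image so that all four neighbouring pixels exist.
    """
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    value = (1 - wy) * top + wy * bottom
    # Gradients involve only differences between the four neighbours.
    dx = (1 - wy) * (img[y0, x1] - img[y0, x0]) + wy * (img[y1, x1] - img[y1, x0])
    dy = bottom - top
    return value, dx, dy

# Toy image with img[y, x] = 4 * y + x, so the true gradient is (1, 4).
img = np.arange(16, dtype=float).reshape(4, 4)
value, dx, dy = bilinear_sample(img, 1.5, 2.25)
```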
93. Using data
• Learn the limitations of your data
• Understand how data is acquired
• Identify where the mathematical elegance
becomes impractical
• Domain knowledge
101. Accelerated MRI
[Jin et al., arXiv, 2019]
Learning both to acquire data and use data
[Figure: reconstructed image, residual, and sampling pattern in Fourier space]
108. Accelerated MRI
[Jin et al., arXiv, 2019]
Progressive sampling
Decompose & Simplify
• ReconNet learns to reconstruct
• SampleNet learns to predict the next best sample position
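The alternation between the two networks can be sketched as a loop: reconstruct from the samples acquired so far, then choose the next sample. Everything below is an assumption made for illustration: `zero_filled` stands in for ReconNet, `lowest_frequency_first` stands in for SampleNet, and neither is learned. Sampling whole k-space rows is also an assumption.

```python
import numpy as np

def progressive_sampling(kspace, reconstruct, propose_next, budget):
    """Alternate between reconstructing and choosing the next k-space row."""
    mask = np.zeros(kspace.shape[0], dtype=bool)
    recon = reconstruct(kspace, mask)
    for _ in range(budget):
        row = propose_next(recon, mask)    # role played by SampleNet
        mask[row] = True
        recon = reconstruct(kspace, mask)  # role played by ReconNet
    return recon, mask

def zero_filled(kspace, mask):
    """Stand-in reconstruction: zero out unsampled rows, inverse FFT."""
    return np.abs(np.fft.ifft2(kspace * mask[:, None]))

def lowest_frequency_first(recon, mask):
    """Stand-in policy: pick the unsampled row with the lowest frequency."""
    freqs = np.abs(np.fft.fftfreq(mask.size))
    freqs[mask] = np.inf  # never pick a row twice
    return int(np.argmin(freqs))

# Toy non-negative image and its fully sampled k-space.
image = np.random.default_rng(0).random((8, 8))
kspace = np.fft.fft2(image)
recon, mask = progressive_sampling(kspace, zero_filled, lowest_frequency_first, budget=8)
```

With the budget equal to the number of rows, every row is acquired and the stand-in reconstruction recovers the toy image. A smaller budget leaves some rows unsampled.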
109. Accelerated MRI
[Jin et al., arXiv, 2019]
Self-supervision through MCTS with implicit minimax
Enhance via Self-supervision
• MCTS provides better direction
• Supervision to improve, not ground-truth
110. Accelerated MRI
[Jin et al., arXiv, 2019]
Progressive sampling
Self-supervision through MCTS with implicit minimax
114. Accelerated MRI
[Jin et al., arXiv, 2019]
When reconstructing via a simple zero-filled inverse Fourier
transform, learned sampling does not perform well.
Neither does the learned reconstruction when used with
other sampling patterns.
Performs best when using both components of our method together.
119. Data
“make use of the best ally we have: the
unreasonable effectiveness of data.”
Alon Halevy, Peter Norvig, and Fernando Pereira, The unreasonable effectiveness of data. IEEE
Intelligent Systems, 24(2), 8-12. 2009
120. Thank you!
People behind our research (in the order of appearance)
Code and Datasets: https://github.com/vcg-uvic