1. Large Scale Image Retrieval
and Specific Object Search
Ondra Chum
Center for Machine Perception
Czech Technical University in Prague
2. Outline
• The correspondence problem
– Local features
– Descriptors
– Matching
– Geometry
• Retrieval with local features
– Bag of Words
– Geometry in image retrieval
– Beyond visual nearest neighbour search
• Image retrieval with CNNs
– Efficient network training
– Day / Night retrieval
4. The Problem
Given a pair of images, find corresponding pixels
YES: image stitching, 3D reconstruction, augmented reality, localization / camera position
NOT in this lecture: semantic correspondence
5. Finding correspondences is not easy
due to large viewpoint change (including scale)
=> the wide-baseline stereo problem
Applications:
- pose estimation
- 3D reconstruction
- location recognition
7. Finding correspondences is not easy
due to large viewpoint change (including scale)
=> the wide-baseline stereo problem
Applications:
- location recognition
- summarization of image collections
8. Finding correspondences is not easy
due to large time difference
=> the temporal-baseline stereo problem
Applications:
- historical reconstruction
- location recognition
- photographer recognition
- camera type recognition
MPV course 2022, CTU in Prague
11. Local Features
aka feature points, key points, anchor points, distinguished regions, …
• Repeatable features
• Feature descriptor: patch to a vector
• Similar features have similar descriptors – nearest neighbour search
• Retrieval – matching millions of images at the same time
• Detect features in images independently, local = robust to occlusions
13. Local (Handcrafted) Features
Feature types: corners, saddle points, blobs, regions
1. Enumerate all regions / level sets
2. Compute responses / stability
3. Local Non-Maxima Suppression
Commonly used for deep features
Detectors:
Harris [Harris’88]
Susan [Smith’97]
FAST / ORB [Rosten’06] [Rublee’11]
Hessian [Lindeberg’91]
SADDLE [Aldana’16]
Hessian
DoG [Lowe’04]
MSER [Matas’02]
Tuytelaars
Simple idea – a distinguished feature should be different (at least) from all its immediate neighbourhoods
14. Deep Local Features
DELF – classification loss, landmark labelled images
[Noh, Araujo, Sim, Weyand, Han: Large-scale image retrieval with attentive deep local features. CVPR’17]
HOW – contrastive loss, SfM Retrieval – 3D reconstruction, image level
[Tolias, Jenicek, Chum: Learning and aggregating deep local descriptors for instance-level recognition ECCV’20]
D2 net – point correspondence supervision from 3D
[Dusmanu et al.: D2-net: A trainable CNN for joint detection and description of local features. CVPR’19]
R2D2 – point correspondence supervision from optical flow
[Revaud et al.: R2D2: Reliable and Repeatable Detector and Descriptor, NeurIPS 2019]
SuperPoint – synthetic images, augmentations
[DeTone, Malisiewicz, Rabinovich: SuperPoint: Self-supervised interest point detection and description, CVPRW’18]
R2D2 – Revaud 2019
DELF – Noh 2017
15. Local Features from CNN Activations
Simeoni, Avrithis, Chum: Local Features and Visual Words Emerge in Activations, CVPR 2019
Convolutional layers → activation tensor → activation channel (output of a detector)
• Treat the activation channel as an input to a handcrafted feature detector (MSER)
• Use the channel id as a descriptor (visual word)
18. Affine Shape with CNNs
Mishkin, Radenović, Matas:
Repeatability Is Not Enough: Learning Affine Regions via Discriminability, ECCV 2018
AffNet
19. Descriptors of Local Features
Local feature → measurement region
Direct description of a measurement region: e.g. moments
20. Descriptors of Local Features
Local feature → measurement region
Normalize the region to a canonical form first
Histogram of gradients: (root) SIFT
21. Descriptors of Local Features
Yurun Tian, Bin Fan, and Fuchao Wu: L2-Net: Deep learning of discriminative patch descriptor in Euclidean space, CVPR 2017
Anastasiya Mishchuk, Dmytro Mishkin, Filip Radenovic, Jiri Matas: Working hard to know your neighbor's margins: Local descriptor learning loss, NIPS 2017
23. Toy example for illustration: matching with OpenCV SIFT
Try yourself: https://github.com/ducha-aiki/matching-strategies-comparison
24. Toy example for illustration: matching with OpenCV SIFT
Shown: the recovered 1st-to-2nd image projection, the ground-truth 1st-to-2nd image projection, and the inlier correspondences
25. Nearest neighbor (NN) strategy
Features from img1 are
matched to features from img2
Note that the matching is asymmetric and allows “many-to-one” matches
26. Nearest neighbor (NN) strategy
OpenCV RANSAC failed to find a good model
with NN matching
Features from img1 are
matched to features from img2
27. Mutual nearest neighbor (MNN) strategy
Features from img1 are
matched to features from img2
Only cross-consistent
(mutual NNs) matches are retained.
28. Mutual nearest neighbor (MNN) strategy
OpenCV RANSAC failed to find a good
model with MNN matching
No many-to-one matches any more, but the result is still bad
Features from img1 are
matched to features from img2
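The NN and MNN strategies can be sketched in a few lines of plain Python. This is an illustrative brute-force sketch, not the OpenCV matcher used in the toy example; the function names and the tiny 2-D "descriptors" are assumptions made for the illustration.

```python
import math

def nn_matches(desc1, desc2):
    """Match every descriptor in desc1 to its nearest neighbour in desc2."""
    matches = []
    for i, d in enumerate(desc1):
        j = min(range(len(desc2)), key=lambda c: math.dist(d, desc2[c]))
        matches.append((i, j))
    return matches

def mutual_nn_matches(desc1, desc2):
    """Keep only matches that are nearest neighbours in both directions."""
    forward = nn_matches(desc1, desc2)
    # Reverse direction: index in desc2 -> its nearest index in desc1.
    backward = {j: i for j, i in nn_matches(desc2, desc1)}
    return [(i, j) for i, j in forward if backward[j] == i]

# Hypothetical toy data: two img1 features compete for the same img2 feature.
desc1 = [(0.0, 0.0), (0.3, 0.0), (5.0, 5.0)]
desc2 = [(0.1, 0.0), (5.0, 5.1)]
print(nn_matches(desc1, desc2))         # contains a many-to-one match
print(mutual_nn_matches(desc1, desc2))  # only cross-consistent matches remain
```

Brute force costs O(|desc1| · |desc2|) distance computations, which is fine for one image pair but not for retrieval at scale.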
29. Feature space outlier rejection
• How can we tell which putative matches are more reliable?
• Heuristic: compare distance of the nearest neighbor to that of the
second nearest neighbor
– Ratio will be high for features that are not distinctive
– Threshold of 0.8 provides good separation
David Lowe. "Distinctive image features from scale-invariant keypoints.” IJCV 60 (2), pp. 91-110, 2004.
30. Second nearest neighbor ratio (SNN) strategy
- We look for the 2 nearest neighbors
- If both are too similar (1stNN/2ndNN ratio > 0.8) → discard
- If the 1st NN is much closer (1stNN/2ndNN ratio ≤ 0.8) → keep
Features from img1 are
matched to features from img2
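A minimal sketch of the ratio test, assuming Euclidean descriptor distance and the 0.8 threshold quoted above. The function name and the toy descriptors are hypothetical.

```python
import math

def snn_matches(desc1, desc2, max_ratio=0.8):
    """Nearest-neighbour matches that pass the 1stNN/2ndNN ratio test."""
    matches = []
    for i, d in enumerate(desc1):
        # Distances to all candidates, smallest first (needs len(desc2) >= 2).
        ranked = sorted((math.dist(d, e), j) for j, e in enumerate(desc2))
        (d1, j1), (d2, _) = ranked[0], ranked[1]
        if d2 > 0 and d1 / d2 <= max_ratio:
            matches.append((i, j1))
    return matches

# Toy data: the second query descriptor has two almost equally close candidates.
desc1 = [(0.5, 0.0), (10.24, 0.0)]
desc2 = [(0.0, 0.0), (10.0, 0.0), (10.5, 0.0)]
print(snn_matches(desc1, desc2))  # the ambiguous second feature is dropped
```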
31. Second nearest neighbor ratio (SNN) strategy
Matches with 1stNN / 2ndNN < 0.8 are kept
OpenCV RANSAC found a roughly correct model
32. 1st geometrically inconsistent nearest neighbor ratio (FGINN) strategy
The SNN ratio is good, but what about symmetrical or too closely detected features?
The ratio test will reject them.
Solution: take as the 2nd nearest neighbor the closest descriptor that is spatially far enough from the 1st nearest neighbor.
Mishkin et al., “MODS: Fast and Robust Method for Two-View Matching”, CVIU 2015
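A hedged sketch of the FGINN idea: same ratio test as SNN, but the "second" neighbour must be detected at least `min_px` pixels away from the first in image 2. The 10-pixel default, the decision to keep a match when no such neighbour exists, and all names are assumptions of this sketch, not taken from the slides.

```python
import math

def fginn_matches(desc1, desc2, pts2, max_ratio=0.8, min_px=10.0):
    """Ratio test against the first geometrically inconsistent neighbour."""
    matches = []
    for i, d in enumerate(desc1):
        ranked = sorted((math.dist(d, e), j) for j, e in enumerate(desc2))
        d1, j1 = ranked[0]
        # Candidate "second" neighbours: only those detected far from the first.
        far = [dist for dist, j in ranked[1:]
               if math.dist(pts2[j], pts2[j1]) >= min_px]
        # Assumption: with no far-away competitor the match is kept.
        if not far or (far[0] > 0 and d1 / far[0] <= max_ratio):
            matches.append((i, j1))
    return matches

desc1 = [(0.5, 0.0), (10.24, 0.0)]
desc2 = [(0.0, 0.0), (10.0, 0.0), (10.5, 0.0)]
# Keypoint positions in image 2; the last two are near-duplicate detections.
pts2 = [(100.0, 100.0), (300.0, 50.0), (301.0, 51.0)]
print(fginn_matches(desc1, desc2, pts2))
```

With these toy values the plain ratio test would reject the second feature, because its two closest descriptors come from detections about one pixel apart; FGINN skips that near-duplicate and keeps the match.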
33. SNN vs FGINN
Mishkin et al., “MODS: Fast and Robust Method for Two-View Matching”, CVIU 2015
SNN: roughly correct
FGINN: more correspondences, better geometry found
34. Local Geometric Constraints
Idea: verify a tentative match “+” by comparing neighboring features
[Schmid and Mohr: Local Greyvalue Invariants for Image Retrieval. PAMI 1997]
Figure: matching features (“+”) in image 1 and image 2
35. Cosegmentation / Seed Growing
Start from a seed – a single strong match – and try to locally “grow” the match
- at pixel or feature level
[Ferrari, Tuytelaars, Van Gool, ECCV 2004]
[Cech, Matas, Perdoch, CVPR’08]
[Cavalli, Larsson, Oswald, Sattler, Pollefeys: AdaLAM, ECCV’20]
Seeds – semantic objects
Benbihi, Pradalier and Chum: Object-Guided Day-Night Visual Localization in Urban Scenes, ICPR’22
38. Robust Estimation: Hough vs. RANSAC
Voting:
• discretized parameter space
• votes for parameters consistent with the measurements
• more votes = higher support
+ multiple models
+ can be very fast
- memory demanding
- distances measured in the parameter space
RANSAC:
• hypothesize and verify loop
- randomized (unless you try it all)
- typically slower than voting
+ no extra memory required
+ measures distances in pixels!
47. RANSAC
• Select a sample of m points at random
• Calculate model parameters that fit the data in the sample
• Calculate the error function for each data point
• Select data that support the current hypothesis
• Repeat sampling
48. RANSAC
k … number of samples drawn
m … minimal sample size
N … number of data points
I … number of inliers
p … confidence in the solution (.95)
$$k = \frac{\log(1 - p)}{\log\left(1 - \binom{I}{m} \Big/ \binom{N}{m}\right)}$$
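The number of samples can be evaluated directly. The sketch below assumes that I in the formula denotes the number of inliers, so that C(I, m) / C(N, m) is the probability that a random minimal sample is all-inlier; the function name is hypothetical.

```python
import math

def ransac_num_samples(num_inliers, num_points, m, p=0.95):
    """Samples k needed so that an all-inlier sample is drawn with confidence p."""
    # Probability that one random sample of size m contains only inliers.
    good = math.comb(num_inliers, m) / math.comb(num_points, m)
    if good <= 0.0:
        return math.inf   # no all-inlier sample exists
    if good >= 1.0:
        return 1          # every sample is all-inlier
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - good))

# 50% inliers, 4-point minimal sample (e.g. a homography).
print(ransac_num_samples(50, 100, 4))
# The count grows quickly as the inlier ratio drops.
print(ransac_num_samples(20, 100, 4))
```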
50. RANSAC [Fischler, Bolles ’81]
In: U = {x_i}, set of data points, |U| = N
f(S): function f computes model parameters p given a sample S from U
ρ(p, x): the cost function for a single data point x
Out: p*, parameters of the model maximizing the cost function
k := 0
Repeat until P{better solution exists} < η (a function of C* and no. of steps k)
k := k + 1
I. Hypothesis
(1) select randomly a set S_k ⊂ U, sample size |S_k| = m
(2) compute parameters p_k = f(S_k)
II. Verification
(3) compute cost C_k = Σ_{x ∈ U} ρ(p_k, x)
(4) if C* < C_k then C* := C_k, p* := p_k
end
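A minimal runnable sketch of the hypothesize-and-verify loop for 2-D line fitting (m = 2). Two simplifications are assumptions of this sketch: a fixed number of iterations replaces the probabilistic stopping criterion, and the cost is the plain inlier count under a distance threshold.

```python
import math
import random

def fit_line(sample):
    """Line a*x + b*y + c = 0 with unit normal through two points (None if equal)."""
    (x1, y1), (x2, y2) = sample
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0:
        return None
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, threshold=0.5, iterations=200, seed=0):
    rng = random.Random(seed)
    best_model, best_cost = None, -1
    for _ in range(iterations):
        model = fit_line(rng.sample(points, 2))      # I. hypothesis
        if model is None:
            continue
        a, b, c = model
        # II. verification: count points within `threshold` of the line.
        cost = sum(abs(a * x + b * y + c) <= threshold for x, y in points)
        if best_cost < cost:
            best_model, best_cost = model, cost
    return best_model, best_cost

# Synthetic data: 20 points on y = 2x + 1 plus 10 scattered outliers.
inliers = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outliers = [(float(x), -3.0 * x + 80.0 + (7 * x) % 11) for x in range(10)]
model, support = ransac_line(inliers + outliers)
a, b, c = model
print(support, -a / b, -c / b)  # support, slope, intercept
```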
51. Advanced RANSAC
The same hypothesize-and-verify loop as basic RANSAC, extended with:
• Non-uniform sampling
• Error scale estimation
• Potential degeneracy tests
• Randomized verification
• Preemptive scoring
• Improving precision
54. Image Retrieval
Find this … in a large (millions+) collection of images
• Find images of the same object
• What is this? Nearest neighbor classifier
• Where is this? Visual localization
• How did this look in the past?
• Is there anything interesting here?
56. Feature Based Retrieval
• Affine invariant features
• Efficient descriptors
• Corresponding regions in images have similar descriptors – measured by some distance in the feature space
• Images of the same object have many correspondences in common
57. Video Google
• Feature detection and description
• Vector quantization
• Bag of Words representation
• Scoring
• Verification
Sivic & Zisserman: Video Google: A Text Retrieval Approach to Object Matching in Videos, ICCV 2003
59. Feature Distance Approximation
Partition the feature space (k-means clustering)
Feature distance
0 : features in the same cell
∞ : features in different cells
+ most of the features are not considered (infinitely distant)
+ nearby descriptors accessible instantly – storing a list of features for each cell
60. Feature Distance Approximation
- quantization effects
- large (even unbounded) cells
Feature distance
0 : features in the same cell
∞ : features in different cells
61. Vector Quantization via k-Means
1. Initialize cluster centres
2. Find the nearest cluster to each datapoint (slow), O(N k)
3. Re-compute cluster centres as centroids
4. Iterate
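The k-means loop can be sketched in plain Python as follows. This is a brute-force illustration with the O(N k) assignment step; the fixed iteration count, the handling of empty cells, and the function names are assumptions of the sketch.

```python
import math

def assign(point, centres):
    """Index of the nearest cluster centre (the visual word of `point`)."""
    return min(range(len(centres)), key=lambda c: math.dist(point, centres[c]))

def kmeans(points, centres, iterations=10):
    for _ in range(iterations):
        # Assignment step: O(N k) distance computations.
        cells = [[] for _ in centres]
        for p in points:
            cells[assign(p, centres)].append(p)
        # Update step: centroid of every non-empty cell; empty cells keep their centre.
        centres = [
            tuple(sum(coord) / len(cell) for coord in zip(*cell)) if cell else centre
            for cell, centre in zip(cells, centres)
        ]
    return centres

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centres = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centres)
print(assign((9.0, 9.0), centres))  # quantization of a new descriptor
```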
62. Bags of Words Image Representation
Visual vocabulary: A, B, C, D
Images are represented by a vector / histogram of the visual words present in them
Example histograms over (A, B, C, D): (1 0 0 2), (0 3 0 1), …
Term-frequency (tf) – visual word D occurs twice in the first image
The representation is sparse
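Building the term-frequency histogram is a simple counting step once every local feature has been assigned a visual word. A small sketch with hypothetical names, using the four-word vocabulary from this slide:

```python
from collections import Counter

def bow_histogram(visual_words, vocabulary):
    """Term-frequency vector: how often each vocabulary word occurs in the image."""
    counts = Counter(visual_words)
    return [counts[word] for word in vocabulary]

vocabulary = ["A", "B", "C", "D"]
# An image whose three features were quantized to D, A and D.
histogram = bow_histogram(["D", "A", "D"], vocabulary)
print(histogram)
```

For realistic vocabularies (up to a million words) only the non-zero entries would be stored, for example the `Counter` itself.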
64. Efficient Scoring
bag of words representation (up to 1,000,000 D)
Visual vocabulary: A, B, C, D
Database • Query = Score
α1 ( 1 0 0 2 ) • αq ( 0 3 0 1 )ᵀ = s1
α2 ( 0 2 0 1 ) • αq ( 0 3 0 1 )ᵀ = s2
α3 ( 1 0 0 0 ) • αq ( 0 3 0 1 )ᵀ = s3
…
65. BoW and Inverted File
Visual vocabulary: A, B, C, D, …
Image ids: 1 2 3 4 5 6 7 8 9 10
Inverted file, one list of image ids per visual word:
6 7 7 …
1 3 6 …
5 6 8 …
2 4 10 …
66. BoW and Inverted File
1 2 3 4 5 6 7 8 9 10
6 7 7 …
1 3 6 …
5 6 8 …
query visual word 1
query visual word 2
query visual word 3
D
B
G
67. BoW and Inverted File
1 2 3 4 5 6 7 8 9 10
Efficient (fast)
Linear complexity (in # documents)
Can be interpreted as voting
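The voting interpretation can be made concrete with a small inverted-file sketch. It uses the example histograms from the scoring slide; the normalization factors α are omitted for brevity (an assumption of this sketch), so the scores are raw dot products. Function names are hypothetical.

```python
from collections import defaultdict

def build_inverted_file(database):
    """Map each visual word to the list of (image id, term frequency) that contain it."""
    inverted = defaultdict(list)
    for image_id, histogram in enumerate(database):
        for word, tf in enumerate(histogram):
            if tf:
                inverted[word].append((image_id, tf))
    return inverted

def score(query_histogram, inverted):
    """Accumulate votes; only lists of words present in the query are visited."""
    scores = defaultdict(int)
    for word, q_tf in enumerate(query_histogram):
        if q_tf:
            for image_id, tf in inverted.get(word, ()):
                scores[image_id] += q_tf * tf
    return dict(scores)

database = [[1, 0, 0, 2], [0, 2, 0, 1], [1, 0, 0, 0]]  # histograms over (A, B, C, D)
inverted = build_inverted_file(database)
scores = score([0, 3, 0, 1], inverted)
print(scores)  # images that share no word with the query never appear
```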
68. Geometric Verification and Re-ranking
Query → Results: reject, verify, localize
Philbin, Chum, Isard, Sivic, Zisserman: Object retrieval with large vocabularies and fast spatial matching, CVPR’07
71. Hierarchical k-means
+ fast O(N log k)
+ incremental construction
- not so good quantization
- often imbalanced
Nistér & Stewénius: Scalable recognition with a vocabulary tree. CVPR 2006
72. Approximate k-means
1. Initialize cluster centres
2. Find the approximate nearest cluster to each datapoint
3. Re-compute cluster centres as centroids
4. Iterate
+ fast O(N log k)
+ reasonable quantization
- can be inconsistent when ANN fails
Philbin, Chum, Isard, Sivic, and Zisserman: Object retrieval with large vocabularies and fast spatial matching, CVPR 2007
73. Hamming Embedding
+ good quantization
+ elegant idea
- huge memory footprint
Figure: descriptors within a cell receive binary signatures from random projections and are compared by Hamming distance
Jegou, Douze, and Schmid: Hamming embedding and weak geometric consistency for large scale image search, ECCV 2008
74. Soft Assignment
(Approximate) k-means
- database side
- query side
Philbin, Chum, Isard, Sivic, and Zisserman: Lost in Quantization, CVPR 2008
Hierarchical k-means
Nistér & Stewénius: Scalable recognition with a vocabulary tree, CVPR 2006
75. Learning Fine Vocabularies
Fine vocabulary (16 million visual words)
Using wide-baseline stereo matches on 6 million images to learn what is similar
Mikulik, Perdoch, Chum, and Matas: Learning a Fine Vocabulary, ECCV 2010
76. Appearance Variance of a Single Feature
Mikulik, Perdoch, Chum, Matas: Learning Vocabularies over a Fine Quantization, IJCV 2012
• over 5 million images
• almost 20k clusters of 750k images (visual word based)
• 733k successfully matched in WBS matching (raw descriptor based)
• over 111 M feature tracks established (12.3 M with 6+ features)
• 564 M features in the tracks (319.5 M in tracks of 6+ features)
http://cmp.felk.cvut.cz/~qqmikula/publications/ijcv2012/index.html
77. Short Codes – (Joint) Dimensionality Reduction
Jegou & Chum: Negative evidences and co-occurrences in image retrieval: the benefit of PCA and whitening, ECCV 2012
Radenovic, Jegou & Chum: Multiple Measurements and Joint Dimensionality Reduction for Large Scale Image Search with Short Vectors, ICMR 2015
78. Aggregating Local Descriptors
• High discriminability needed
• BoW increases the number of visual words
• only assignments are recorded
Idea: using higher order statistics
• small vocabulary (fast assignment)
• dense vectors (ANN search)
• high discriminability
VLAD descriptor [Jégou, Douze, Schmid and Pérez, CVPR’10]
Fisher Kernel approach [Perronnin and Dance, CVPR’07]
often combined with dimensionality reduction by PCA – short codes
79. Aggregating Local Descriptors
VLAD descriptor [Jégou, Douze, Schmid and Pérez, CVPR’10]
1. compute assignments
2. compute difference to means
3. sum differences per visual word
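The three VLAD steps map directly onto a short loop. The sketch below is illustrative only: the final L2 normalization is an optional extra not listed on the slide, and the names and toy data are assumptions.

```python
import math

def vlad(descriptors, centres, normalize=True):
    """Concatenate, per visual word, the summed residuals of its assigned descriptors."""
    dim = len(centres[0])
    sums = [[0.0] * dim for _ in centres]
    for d in descriptors:
        # 1. assignment to the nearest visual word
        j = min(range(len(centres)), key=lambda c: math.dist(d, centres[c]))
        # 2. difference to the mean, 3. summed per visual word
        for t in range(dim):
            sums[j][t] += d[t] - centres[j][t]
    vector = [x for row in sums for x in row]
    norm = math.sqrt(sum(x * x for x in vector))
    if normalize and norm > 0:
        vector = [x / norm for x in vector]
    return vector

centres = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (10.0, 12.0)]
raw = vlad(descriptors, centres, normalize=False)
print(raw)
```

The output length is (number of visual words) × (descriptor dimension), independent of how many local features the image has.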
80. Aggregating Local Descriptors
Fisher Kernel approach [Perronnin and Dance, CVPR’07]
• Fit a GMM to training data (SIFT)
• diagonal covariance matrix
• whitened data
• Image represented as a sum (over image features) of gradients of log-likelihood
• fixed size representation (#parameters)
Intuition: the direction in which the parameters λ of the general model should be modified to better fit the specific sample (current image data).
87. Context expansion
• the model of the object is grown beyond the boundaries of the
initial query,
• a feature added into the model that is not inside the context is
inactive until confirmed by feature(s) from another image with
the same visual word and similar geometry.
• Once a feature is confirmed, it adds the neighbourhood around
its center to the context.
Chum, Mikulik, Perdoch, Matas: Total Recall II: Query Expansion Revisited, CVPR 2011
90. How Much Do We Need to See?
Oxford landmarks – 3 queries
100%, 50%, and 10% of the query bounding box
Context learned from the full bounding box
Context learned from 50% of the bounding box
Context learned from 10% of the bounding box
91. Effects of decreasing the query bounding-box size
Baseline: spatial verification + full bounding box
To reach the baseline performance, context QE needs only:
• 20% of the BB on the Paris dataset
• 40% of the BB on the Oxford dataset
94. Retrieval for Browsing
Query 1
Query 2
Mikulik, Chum, Matas: Image Retrieval for Online Browsing in Large Image Collections, SISAP 2013.
95. New Problem Formulation
Retrieve relevant images subject to a constraint
• Geometric
– Maximize number of relevant pixels
– Maximize scale change
– Change of viewpoint
• Other
– High photometric change (day / night)
96. New Problem Formulation
Results
• Low rank in standard similarity measure
– Geometry for verification and constraint enforcement
– Geometry in the inverted file (DAAT)
• Standard similarity measure can be 0
– Matching through a path of images (query expansion)
99. Highest Resolution Transform
Given a query and a dataset, for every pixel in the query image:
Find the database image with the maximum resolution depicting the pixel
37.3x 27.0x 22.8x 21.9x 21.6x
101. Level of Interest Transform
Given a query and a dataset, for every pixel in the query image:
Find the frequency with which it is photographed in detail
Detail size: 0 – 1 %, 1 – 3 %, 3 – 10 %
104. Tight Coupling of Retrieval and SfM
Schoenberger, Radenovic, Chum, and Frahm: From Single Image Query to Detailed 3D Reconstruction, CVPR’15
105. Beyond Nearest Neighbour
Looking around the corner
• Zoom out – getting a context of the image
• All details – getting transition to the object details
• Sidewise crawl
108. Efficient Search with Global Descriptors
Find this … in a large collection of images
Mapping into a high-dimensional descriptor space R^k, k ~ 512 … 2048
Image similarity – distance in the descriptor space
110. CNN Descriptors for Image Retrieval
Image (w × h × 3) → Convolutional Layers (W × H × K) → MAC Layer: max pooling + L2-norm → Descriptor: MAC vector (K × 1)
w × h – image width and height
W × H – number of activations for feature map k ∈ {1 … K}
K – number of feature maps in the last convolutional layer
MAC – Maximum Activations of Convolutions
111. CNN Descriptors for Image Retrieval
Image (w × h × 3) → Convolutional Layers (W × H × K) → SPoC Layer: sum pooling + L2-norm → Descriptor: SPoC vector (K × 1)
w × h – image width and height
W × H – number of activations for feature map k ∈ {1 … K}
K – number of feature maps in the last convolutional layer
SPoC – sum-pooled convolutional
112. CNN Descriptors for Image Retrieval
Image (w × h × 3) → Convolutional Layers (W × H × K) → GeM Layer: GeM pooling + L2-norm → Descriptor: GeM vector (K × 1)
w × h – image width and height
W × H – number of activations for feature map k ∈ {1 … K}
K – number of feature maps in the last convolutional layer
GeM – Generalized Mean
p = 1: average pooling
p = ∞: max pooling
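A small sketch of generalized-mean pooling over one feature map, (mean of x^p)^(1/p), followed by L2 normalization of the K pooled values. The default p = 3 and the clamping constant `eps` are assumptions of this sketch, as are the function names.

```python
import math

def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized mean of one W x H feature map (activations clamped to >= eps)."""
    values = [max(v, eps) for row in feature_map for v in row]
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def gem_descriptor(activations, p=3.0):
    """Pool each of the K feature maps, then L2-normalize the K-dim vector."""
    pooled = [gem_pool(fm, p) for fm in activations]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled]

fm = [[1.0, 2.0], [3.0, 4.0]]
print(gem_pool(fm, p=1.0))    # behaves like average pooling (SPoC up to scale)
print(gem_pool(fm, p=100.0))  # approaches max pooling (MAC)
print(gem_descriptor([fm, [[0.0, 1.0], [1.0, 0.0]]]))
```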
119. “Lots of Training Examples”
Large Internet photo collection → training → Convolutional Neural Network (CNN)
Sources of training data:
• Image annotations: not accurate, expensive $$
• Manual cleaning of the training data done by researchers: very expensive $$$$
• Automated extraction of training data: very accurate, free $
121. CNN Image Retrieval
• Image representation created from CNN activations of a network pre-trained for a classification task
[Gong et al. ECCV’14, Razavian et al. arXiv’14, Babenko et al. ICCV’15, Kalantidis et al. arXiv’15, Tolias et al. ICLR’16]
+ Retrieval accuracy suggests generalization of CNNs
- Trained for image classification, NOT for the retrieval task
Figure: images of the same class (image from ImageNet.org)
123. CNN Image Retrieval
• CNN re-trained using a dataset that contains landmarks and buildings as object classes.
[Babenko et al. ECCV’14]
+ Training dataset closer to the target task
- Final metric different to the one actually optimized
- Constructing training datasets requires manual effort
Figure: images of the same class (image from [Babenko et al. ECCV’14])
125. CNN Image Retrieval
• NetVLAD: end-to-end fine-tuning for image retrieval.
Geo-tagged dataset for weakly supervised fine-tuning.
[Arandjelovic et al. CVPR’16]
+ Training dataset corresponds to the target task
+ Final metric corresponds to the one actually optimized
- Training dataset requires geo-tags
Figure labels: query; camera orientation unknown
127. CNN learns from BoW – Training Data
Input: Large unannotated dataset
1. Initial clusters created by grouping of spatially related images [Chum & Matas PAMI’10]
2. Clustered images used as queries for a retrieval-SfM pipeline [Schonberger et al. CVPR’15]
Output: Non-overlapping 3D models
551 (134k) training / 162 (30k) validation
Camera Orientation Known
Number of Inliers Known
128. CNN learns from BoW – Positives
1. Descriptor distance: the image with the lowest global descriptor distance is chosen (NetVLAD uses this)
2. Maximum inliers: the image with the highest number of co-observed 3D points with the query image is chosen
3. Relaxed inliers: a random image close to the query, with enough inliers and without an extreme scale change, is chosen
Figure labels: query, m 1, m 2, m 3
129. CNN learns from BoW – Negatives
K-nearest neighbors of the query image are selected from all non-matching clusters, using different methods:
1. No constraint: the chosen images are often near-identical.
2. At most one image per cluster: higher variability.
Figure labels: query, hardest negative, N 1, N 2
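The two negative-selection variants differ only in one filter. A minimal sketch, assuming candidates are given as hypothetical (image id, cluster id, global descriptor) tuples that already exclude the query's own cluster:

```python
import math

def hard_negatives(query_desc, candidates, k, one_per_cluster=True):
    """Pick the k candidates closest to the query in descriptor space."""
    ranked = sorted(candidates, key=lambda c: math.dist(query_desc, c[2]))
    chosen, used_clusters = [], set()
    for image_id, cluster_id, _ in ranked:
        if one_per_cluster and cluster_id in used_clusters:
            continue  # this cluster already contributed a negative
        chosen.append(image_id)
        used_clusters.add(cluster_id)
        if len(chosen) == k:
            break
    return chosen

candidates = [
    ("a", 1, (0.1,)), ("b", 1, (0.2,)),   # two near-identical views, same cluster
    ("c", 2, (0.5,)), ("d", 3, (0.9,)),
]
print(hard_negatives((0.0,), candidates, k=2, one_per_cluster=False))
print(hard_negatives((0.0,), candidates, k=2))
```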
133. Day – Night Retrieval
Day – Night training image pairs – sequences of images: day – evening – night
Photometric normalization
134. Contrast Limited Adaptive Histogram Equalization
• Semi-local (windows)
• Linear interpolation
• Only values more frequent than the clipping limit are redistributed
Figure panels: Original, Histogram Equalization (global), CLAHE
[Jenicek, Chum: No Fear of the Dark: Image Retrieval under Varying Illumination Conditions, ICCV 2019]
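The clipping step of CLAHE can be illustrated on a single window histogram. This sketch assumes the excess above the clipping limit is spread uniformly over all bins in one pass; real implementations may redistribute differently, and the per-window equalization and interpolation steps are not shown.

```python
def clip_histogram(histogram, clip_limit):
    """Clip bins at `clip_limit` and spread the clipped excess over all bins."""
    excess = sum(max(count - clip_limit, 0) for count in histogram)
    clipped = [min(count, clip_limit) for count in histogram]
    share = excess / len(histogram)
    return [count + share for count in clipped]

# One dominant grey level: without clipping it would stretch the contrast excessively.
result = clip_histogram([10, 2, 0, 0], clip_limit=4)
print(result)
```

The total mass of the histogram is preserved, so the resulting mapping still covers the full intensity range, only with a bounded slope.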