This document presents a method combining domain separation and adaptive active learning to efficiently label samples from a new target domain using available labeled data from a related source domain. The method first uses domain separation to identify informative regions of the target domain distinct from the source. It then applies adaptive active learning via Breaking Ties to iteratively select the most uncertain target samples for labeling. A TrAdaBoost algorithm differentially re-weights source and target samples as more target data is incorporated, in order to adapt the classifier to the target domain. The method is tested on QuickBird imagery of Zurich from different seasons, exhibiting a dataset shift. Results show the combined approach can effectively label samples for the target domain with minimal supervision.
Domain Separation and Adaptive Active Learning for Efficient Land Cover Mapping
Outline: Introduction | Domain Separation and Adaptive Active Learning | Experiments | Conclusions

DOMAIN SEPARATION FOR EFFICIENT ADAPTIVE ACTIVE LEARNING
Giona Matasci (1), Devis Tuia (2), Mikhail Kanevski (1)
(1) Institute of Geomatics and Analysis of Risk, University of Lausanne, Lausanne, Switzerland.
(2) Image Processing Laboratory (IPL), Universitat de València, València, Spain.
Email: giona.matasci@unil.ch
G. Matasci et al., IGARSS 2011, Vancouver, Canada
MOTIVATION
Collection of ground truth for remote sensing image classification is a demanding task.
Label the smallest possible number of samples from the newly acquired images.
Take advantage of already collected reference data on previously acquired images with similar properties.
⇒ Rapid and large-scale mapping of the land cover based on minimal labeled datasets.
CURRENT SOLUTIONS
Domain Adaptation (DA) to adapt an existing classifier already trained on a source image to classify the newly acquired target image [Bruzzone & Prieto, TGRS, 2001].
⇒ Problem: when available, the target domain samples are passively obtained all at once.
Active Learning (AL) to guide the intelligent sampling of ground truth data [Tuia et al., JSTSP, 2011].
⇒ Problem: not designed to handle a shift in class distributions.
PROPOSED SOLUTION
⇒ Combination of the DA and AL approaches:
Domain Separation [Rai et al., ALNLP, 2010] to pre-select target pixels worth labeling.
AL via Breaking Ties to identify the best pixels to label in the target image [Tuia et al., RSE, 2011].
TrAdaBoost to differentially re-weight the samples of the training set progressively extended by AL (following [Jun & Ghosh, IGARSS, 2008]).
DOMAIN SEPARATION STEPS
⇒ Identify the informative regions of the target domain.
Classify source vs. target domain and predict target domain probabilities p(target|x_j) for the unlabeled set of candidates x_j.
Remove candidate examples x_j with p(target|x_j) < p_T.
⇒ Avoid re-sampling the input space covered by source data.
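The pre-selection steps above can be sketched as follows. This is a minimal illustration, assuming scikit-learn's LogisticRegression as the source-vs.-target probabilistic classifier; the slides do not specify which classifier is used for this step.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def domain_separation_filter(X_source, X_candidates, p_T=0.8):
    """Keep only target candidates lying outside the source support.

    A probabilistic classifier is trained to discriminate source pixels
    (label 0) from unlabeled target candidates (label 1); candidates with
    posterior p(target | x_j) < p_T are discarded, so AL never re-samples
    the input space already covered by source data.
    """
    X = np.vstack([X_source, X_candidates])
    y = np.hstack([np.zeros(len(X_source)), np.ones(len(X_candidates))])
    clf = LogisticRegression().fit(X, y)
    p_target = clf.predict_proba(X_candidates)[:, 1]
    return X_candidates[p_target >= p_T], p_target
```

Candidates falling where source data already lives get a low target posterior and are removed; only the novel regions of the target distribution remain as AL candidates.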
DOMAIN SEPARATION CONCEPT
[Figure: the source classifier separates the +1 and -1 classes in the source domain; domain separation splits the target domain into the region overlapping the source (uninteresting target domain region) and the novel region (interesting region for AL).]
ADAPTIVE AL
Main AL loop with a query of the candidate samples x_j to label in the target domain by Breaking Ties:

$\hat{x}^{BT} = \arg\min_{x_j \in U} \left[ \max_{c_l \in \Omega} p(y_j^* = c_l \mid x_j) - \max_{c_l \in \Omega \setminus \{c_l^+\}} p(y_j^* = c_l \mid x_j) \right]$

where $c_l^+$ is the most probable class for $x_j$.
Append the queried target samples to the initial source training set.
At each AL cycle a nested TrAdaBoost loop is run to update misclassified training samples' weights:
- decrease weights of x_i in the source domain
- increase weights of x_i in the target domain
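The Breaking Ties query can be sketched compactly, assuming the class posteriors p(y_j* = c_l | x_j) for the unlabeled pool U are already available as a matrix (one row per candidate, one column per class in Ω):

```python
import numpy as np

def breaking_ties_query(proba, n_queries=10):
    """Breaking Ties: select the candidates with the smallest margin
    between the two highest class posteriors, i.e. the samples on which
    the current classifier is most uncertain."""
    sorted_p = np.sort(proba, axis=1)            # posteriors ascending per row
    margin = sorted_p[:, -1] - sorted_p[:, -2]   # best minus second best
    return np.argsort(margin)[:n_queries]        # smallest margins first
```

The selected pixels are labeled by the user and appended to the training set; the deck queries 10 target samples per AL iteration.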
ADAPTIVE AL CONCEPT
[Figure: starting from the source classifier separating +1 and -1 in the source domain, AL identifies the uncertain region in the target domain; TrAdaBoost re-weights the samples to adapt the classifier, yielding the adapted target classifier.]
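One round of the TrAdaBoost weight update follows the scheme on the previous slide: decrease the weights of misclassified source samples, increase those of misclassified target samples. The update factors below are those of the original TrAdaBoost formulation (Dai et al.); the boolean misclassification indicator is a simplification of this sketch.

```python
import numpy as np

def tradaboost_update(w, is_source, misclassified, eps_t, n_source, n_rounds):
    """One TrAdaBoost weight update: misclassified source samples are
    down-weighted (they disagree with the target concept), misclassified
    target samples are up-weighted (the classifier must focus on them).
    Assumes the weighted target error eps_t < 0.5."""
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_source) / n_rounds))
    beta_tgt = eps_t / (1.0 - eps_t)
    w = w.copy()
    w[is_source & misclassified] *= beta_src    # trust erring source samples less
    w[~is_source & misclassified] /= beta_tgt   # focus on hard target samples
    return w / w.sum()                          # keep the weights normalized
```

Run over boosting rounds nested inside each AL cycle, this progressively drifts the weighted classifier toward the target domain as labeled target samples accumulate.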
DATASETS: QUICKBIRD IMAGES OF ZURICH
Source domain: image taken in autumn.
Target domain: image taken in summer.
⇒ Dataset shift due to:
- different look angle: different shadowing of the objects.
- different season: shifted vegetation signatures.
- different materials for building roofs and roads.
Spatial resolution: 0.6 m (after pansharpening).
Histogram matching to reduce the shift.
⇒ 15 features (4 MS bands, 1 PAN band, 4 textural features, 6 morphological features).
5 classes: buildings, grass, vegetation, roads, shadows.
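The histogram matching mentioned above can be sketched per band with NumPy. This is an illustrative implementation mapping the target band onto the source band's empirical distribution; the direction of the matching is an assumption, as the slides do not state it.

```python
import numpy as np

def histogram_match(band_target, band_source):
    """Remap target-band values so their empirical CDF matches the source
    band's, reducing the radiometric shift between the two acquisitions."""
    t_vals, t_idx, t_counts = np.unique(band_target, return_inverse=True,
                                        return_counts=True)
    s_vals, s_counts = np.unique(band_source, return_counts=True)
    t_cdf = np.cumsum(t_counts) / band_target.size   # target empirical CDF
    s_cdf = np.cumsum(s_counts) / band_source.size   # source empirical CDF
    matched_vals = np.interp(t_cdf, s_cdf, s_vals)   # CDF-to-CDF lookup
    return matched_vals[t_idx]
```

Applied band by band before feature extraction, this alignment removes part of the radiometric component of the dataset shift, leaving the remaining shift to the adaptive AL machinery.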
[Figure: false color composites of the source image (autumn) and the target image (summer).]
[Figure: ground truth maps of the source image (autumn) and the target image (summer).]
SOURCE TRAINING SET vs. TARGET TEST SET
[Figure: scatter plots of the five classes (buildings, grass, vegetation, roads, shadows) in standardized R band vs. standardized NIR band space, for the source training set and the target test set.]
EXPERIMENTAL SETUP
Base classifier: SVM with Gaussian kernel (integrating instance weights for TrAdaBoost).
10 experiments with random initialization of the training sets with 1000 pixels.
AL with addition of 10 target samples per iteration.
Domain Separation ruling out samples with p(target|x_j) lower than p_T = 0.8.
Target test set: 26,797 pixels from spatially separate ROIs.
SOURCE DOMAIN vs. TARGET DOMAIN | POSTERIOR TARGET PROBABILITY FOR THE CANDIDATES
[Figure: left, source and target pixels in standardized R band vs. standardized NIR band space; right, unlabeled candidate pixels colored by posterior p(target|x_j), on a scale from < 0.7 to 1.]
KAPPA STATISTIC CURVES
Faster convergence to the optimal accuracy using Domain Separation.
Target model performance is even exceeded using the adaptive AL approaches.
[Figure: kappa statistic vs. training set size (1000 to 1300 pixels) for Source, Target, DS_AdaptiveAL_BT, JunAdaptiveAL_BT, AL_BT and RandomS.]
MCNEMAR'S TEST
Significant superiority at α = 0.05 of the Domain Separation approach over the two baselines in the first key iterations.
Superiority lasting until convergence (with 1130 pixels in the training set) when compared to the standard BT method.
[Figure: McNemar's test z statistic vs. training set size, with the H1 rejection region at α = 0.05, for DS_AdaptiveAL_BT vs. AL_BT and DS_AdaptiveAL_BT vs. JunAdaptiveAL_BT.]
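The z statistic plotted above can be computed from the two classifiers' per-pixel outcomes on the test set. A minimal sketch of the signed normal-approximation form of McNemar's test, consistent with the rejection region at α = 0.05 shown in the figure:

```python
import numpy as np

def mcnemar_z(correct_a, correct_b):
    """Signed McNemar z statistic comparing classifiers A and B.

    f01 counts test pixels only A classifies correctly, f10 those only B
    gets right; z > 1.96 means A is significantly better than B at
    alpha = 0.05 (two-sided). Assumes f01 + f10 > 0.
    """
    correct_a = np.asarray(correct_a, dtype=bool)
    correct_b = np.asarray(correct_b, dtype=bool)
    f01 = np.sum(correct_a & ~correct_b)   # pixels only A gets right
    f10 = np.sum(~correct_a & correct_b)   # pixels only B gets right
    return (f01 - f10) / np.sqrt(f01 + f10)
```

Because the statistic uses only the disagreeing pixels, it directly measures whether the Domain Separation approach corrects more test pixels than it breaks relative to each baseline.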
SUMMARY
Domain Separation is an efficient yet intuitive technique to pre-select interesting examples to label in a domain adaptation setting.
It properly reduces the space to be searched by AL.
TrAdaBoost proved its potential in meaningfully adapting the samples' weights according to their origin (source or target domain).
⇒ The user is smartly guided in the collection of ground truth samples when dealing with a newly acquired target image.
THE END
Thank you for your attention!
Any questions?