SlideShare a Scribd company logo
Transfer learning for unsupervised influenza-like illness
models from online search data
Bin Zou
Vasileios Lampos
(
lampos.net
)
Ingemar J. Cox
Department of Computer Science
University College London
From online searches to influenza-like illness rates
50 60 70 80 90 100
of queries
queries
op-scoring queries to include in the
2004 2005 2006
Year
2007 2008
2
4
6
8
10
ILIpercentage
0
12
Figure 2 | A comparison of model estimates for the mid-Atlantic region
(black) against CDC-reported ILI percentages (red), including points over
which the model was fit and validated. A correlation of 0.85 was obtained
over 128 points from this region to which the model was fit, whereas a
correlation of 0.96 was obtained over 42 validation points. Dotted lines
indicate 95% prediction intervals. The region comprises New York, New
LETTERS
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 1/29
From online searches to influenza-like illness rates
Online data for health (2/3)
Google Flu Trends (discontinued)Google Flu Trends (discontinued)
popularising an established idea
Ginsberg et al. (2009)
Eysenbach (2006); Polgreen et al. (2008)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 1/29
From online searches to influenza-like illness rates
Task abstraction
• input – frequency of search queries over time: X∈Rn×s
• output – corresponding influenza-like illness (ILI) rate: y∈Rn
• regression task, i.e. learn f : X → y
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 2/29
From online searches to influenza-like illness rates
Task abstraction
• input – frequency of search queries over time: X∈Rn×s
• output – corresponding influenza-like illness (ILI) rate: y∈Rn
• regression task, i.e. learn f : X → y
Modelling
• originally proposed models were evidently not good solutions1
• new families of methods seem to work OK in various geographies2
1
Cook et al. (2011); Olson et al. (2013); Lazer et al. (2014)
2
Lampos et al. (2015a); Yang et al. (2015); Lampos et al. (2017); Wagner et al. (2018)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 2/29
Why estimate ILI rates from online search statistics?
Common arguments for:
• complements traditional syndromic surveillance
✓ timeliness
✓ broader demographic coverage, larger cohort
✓ broader geographical coverage
✓ not affected by closure days or national holidays
✓ lower cost
• applicable to locations that lack an established health system
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 3/29
Why estimate ILI rates from online search statistics?
Common arguments for:
• complements traditional syndromic surveillance
✓ timeliness
✓ broader demographic coverage, larger cohort
✓ broader geographical coverage
✓ not affected by closure days or national holidays
✓ lower cost
• applicable to locations that lack an established health system
✓ oxymoron (supervised learning)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 3/29
Why estimate ILI rates from online search statistics?
Common arguments for:
• complements traditional syndromic surveillance
✓ timeliness
✓ broader demographic coverage, larger cohort
✓ broader geographical coverage
✓ not affected by closure days or national holidays
✓ lower cost
• applicable to locations that lack an established health system
✓ oxymoron (supervised learning)
✓ motivated this paper
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 3/29
Our contribution in a nutshell
Main task
• train a model for a source location where historical syndromic
surveillance data is available, and
• transfer it to a target location where syndromic surveillance data is
not available or, in our experiments, ignored
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 4/29
Our contribution in a nutshell
Main task
• train a model for a source location where historical syndromic
surveillance data is available, and
• transfer it to a target location where syndromic surveillance data is
not available or, in our experiments, ignored
Transfer learning steps
1. Learn a linear regularised regression model for a source location
2. Map search queries from the source to the target domain
(languages may differ)
3. Transfer the source weights to the target domain
(might involve weight re-adjustment)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 4/29
Transfer learning task definition
query frequency xij =
#query j issued during ∆ti
#all queries issued during ∆ti
for a location
Source domain
• DS =
{
(xi, yi)
}
, i∈{1, ..., n}
• xi ∈Rs
= {xij}, j ∈{1, ..., s}: frequency of source queries
• yi ∈R: ILI rate for time interval i
Target domain
• DT = {x′
i}, i∈{1, ..., m}
• x′
i ∈Rt
: frequency of target queries
• note that t need not equal s
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 5/29
Transfer learning task definition
query frequency xij =
#query j issued during ∆ti
#all queries issued during ∆ti
for a location
Source domain
• DS =
{
(xi, yi)
}
, i∈{1, ..., n}
• xi ∈Rs
= {xij}, j ∈{1, ..., s}: frequency of source queries
• yi ∈R: ILI rate for time interval i
Target domain
• DT = {x′
i}, i∈{1, ..., m}
• x′
i ∈Rt
: frequency of target queries
• note that t need not equal s
§
¦
¤
¥
Aim: Given DS and DT, estimate y′
i
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 5/29
Step 1 – Learn a regression function in the source domain
Source domain
• xi ∈Rs
= {xij}, j ∈{1, ..., s}: frequency of source queries
• yi ∈R: ILI rate for time interval i
Elastic net1
(constrained)
argmin
w,β
n∑
i=1
(
yi − β −
( s∑
j=1
xijwj
))2
+ λ1
s∑
j=1
|wj| + λ2
s∑
j=1
w2
j
subject to w ≥ 0
1
Zou and Hastie (2005)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 6/29
Step 1 – Learn a regression function in the source domain
Elastic net (constrained)
argmin
w,β
n∑
i=1
(
yi − β −
( s∑
j=1
xijwj
))2
+ λ1
s∑
j=1
|wj| + λ2
s∑
j=1
w2
j
subject to w ≥ 0
Why use elastic net?
• more straightforward to transfer
• few training instances
• previous successful application1
• combines ℓ1- and ℓ2-norm regularisation: sparse solution, model
consistency under collinearity
1
Lampos et al. (2015a,b); Zou et al. (2016); Lampos et al. (2017)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 7/29
Step 1 – Learn a regression function in the source domain
Elastic net (constrained)
argmin
w,β
n∑
i=1
(
yi − β −
( s∑
j=1
xijwj
))2
+ λ1
s∑
j=1
|wj| + λ2
s∑
j=1
w2
j
subject to w ≥ 0
Why apply a non-negative weight constraint?
• (how?) coordinate descent restricting negative updates to 0
• worse performing model for the source location
• but enables a more comprehensive transfer
• better performance at the target location
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 7/29
Step 1 – Learn a regression function in the source domain
Selecting queries prior to applying elastic net
• hybrid feature selection similarly to previous work1
• derive query embeddings eq using fastText2
• define a flu context/topic: T = {‘flu’, ‘fever’}
• compute each query’s similarity to T using
g (q, T ) = cos (eq, eT1 ) × cos (eq, eT2 )
cos(·, ·) is mapped to [0, 1]
1
Zou et al. (2016); Lampos et al. (2017); Zou et al. (2018)
2
Bojanowski et al. (2017)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 8/29
Step 1 – Learn a regression function in the source domain
Selecting queries prior to applying elastic net
• hybrid feature selection similarly to previous work1
• derive query embeddings eq using fastText2
• define a flu context/topic: T = {‘flu’, ‘fever’}
• compute each query’s similarity to T using
g (q, T ) = cos (eq, eT1 ) × cos (eq, eT2 )
cos(·, ·) is mapped to [0, 1]
• filter out queries with either g ≤ 0.5 or r ≤ 0.3 (corr. with ILI)
§
¦
¤
¥
QS: remaining queries after applying elastic net
1
Zou et al. (2016); Lampos et al. (2017); Zou et al. (2018)
2
Bojanowski et al. (2017)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 8/29
Step 2 – Mapping source to target queries
Task: map QS to a subset of PT (pool of target queries)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 9/29
Step 2 – Mapping source to target queries
Task: map QS to a subset of PT (pool of target queries)
How?
• direct translation does not work
— invalid search queries
— worse performance
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 9/29
Step 2 – Mapping source to target queries
Task: map QS to a subset of PT (pool of target queries)
How?
• direct translation does not work
— invalid search queries
— worse performance
• semantic similarity, Θs: (cross-lingual) word embeddings
• temporal similarity, Θc: correlation between frequency time series
• hybrid similarity: Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1]
• consider 1-to-k mappings
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 9/29
Step 2 – Semantic similarity (Θs)
Same language in both domains?
• Use cosine similarity on query embeddings
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 10/29
Step 2 – Semantic similarity (Θs)
Same language in both domains?
• Use cosine similarity on query embeddings
If not, derive bi-lingual embeddings1
• m core translation pairs, σ→τ, with embeddings Eσ, Eτ ∈Rm×d
• learn a transformation matrix, W∈Rd×d
, by minimising:
argmin
W
∥EσW − Eτ ∥
2
2 , subject to W⊤
W = I
1
Smith et al. (2016)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 10/29
Step 2 – Semantic similarity (Θs)
Same language in both domains?
• Use cosine similarity on query embeddings
If not, derive bi-lingual embeddings1
• m core translation pairs, σ→τ, with embeddings Eσ, Eτ ∈Rm×d
• learn a transformation matrix, W∈Rd×d
, by minimising:
argmin
W
∥EσW − Eτ ∥
2
2 , subject to W⊤
W = I
• orthogonality constraint:
— Eτ ≈ EσW and Eσ ≈ Eτ W⊤
— improves the performance of machine translation2
• solution: W = VU⊤
, where E⊤
τ Eσ = UΣV⊤
(SVD)
1
Smith et al. (2016) 2
Artetxe et al. (2016)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 10/29
Step 2 – Semantic similarity (Θs)
Compute a query (source) to query (target) similarity matrix
• source, target query embedding: eqi
, eqj
∈R1×d
• cosine similarity matrix Ω∈Rs×|PT|
, ωij =
(
eqi
W e⊤
qj
)
(
∥eqi
W∥2∥eqj
∥2
)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 11/29
Step 2 – Semantic similarity (Θs)
Compute a query (source) to query (target) similarity matrix
• source, target query embedding: eqi
, eqj
∈R1×d
• cosine similarity matrix Ω∈Rs×|PT|
, ωij =
(
eqi
W e⊤
qj
)
(
∥eqi
W∥2∥eqj
∥2
)
Inverted softmax
• using ωij directly for translations can generate hubs
— target query is similar to way too many different source queries
— reduces performance of machine translation1
• instead, given a source query qi, find a target qj that maximises
Pj→i =
exp (η ωij)
αj
s∑
z=1
exp (η ωiz)
1
Dinu et al. (2014); Smith et al. (2016)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 11/29
Step 2 – Semantic similarity (Θs)
Pj→i =
exp (η ωij)
αj
s∑
z=1
exp (η ωiz)
• αj: ensures Pj→i is a probability
• s: number of source queries
• η: learned by maximising the log probability over the alignment
dictionary (σ→τ): argmax
η
∑
pairs ij
ln (Pj→i)
Inverted softmax
• probability that a target query translates back to the source query
• hub target query =⇒ large denominator
• top-k target queries are selected as possible mappings of qi
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 12/29
Step 2 – Semantic similarity (Θs)
Inverted softmax
• probability that a target query translates back to the source query
• hub target query =⇒ large denominator
• top-k target queries are selected as possible mappings of qi
Determine the semantic similarity score by
• using these top-k queries (average if k > 1)
• and computing
Θs(qi, qj) =
(
eqi
W e⊤
qj
)
/
(
∥eqi
W∥2∥eqj
∥2
)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 13/29
Step 2 – Temporal similarity (Θc)
Exploit query relationship in the frequency space:
• important relationship; based on the core statistical input
information
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 14/29
Step 2 – Temporal similarity (Θc)
Exploit query relationship in the frequency space:
• important relationship; based on the core statistical input
information
• compute pair-wise correlation between the frequency time series of
source and target queries
• flu seasons may be offset in different locations
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 14/29
Step 2 – Temporal similarity (Θc)
Exploit query relationship in the frequency space:
• important relationship; based on the core statistical input
information
• compute pair-wise correlation between the frequency time series of
source and target queries
• flu seasons may be offset in different locations
✓ compute all correlations using a shifting window of ±ξ weeks
✓ optimal window lij (source query qi, target query qj) is
independently computed for each target query
Θc(qi, qj) = ρ
(
xi(t), xj(t + lij)
)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 14/29
Step 3 – Determining weights for target queries
Previous steps
• source query qi allocated weight wi
• source query qi mapped to a set Ti of k ≥ 1 target queries
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 15/29
Step 3 – Determining weights for target queries
Previous steps
• source query qi allocated weight wi
• source query qi mapped to a set Ti of k ≥ 1 target queries
Weight transfer
• if k = 1, directly assign wi to the single target query
• if k > 1, wi is distributed across the k identified target queries
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 15/29
Step 3 – Determining weights for target queries
Previous steps
• source query qi allocated weight wi
• source query qi mapped to a set Ti of k ≥ 1 target queries
Weight transfer
• if k = 1, directly assign wi to the single target query
• if k > 1, wi is distributed across the k identified target queries
Weighting schemes
• uniform: w′
j = wi/k
• based on Θij, j ∈ {2, . . . , k}: w′
j =
wiΘij
∑
qj ∈Ti
Θij
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 15/29
Experiments – Transfer tasks
Source location: United States (US)
Target locations
• France (FR): from English to French
• Spain (ES): from English to Spanish
• Australia (AU): from English to English, different hemisphere,
greater temporal difference in flu outbreaks
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 16/29
Experiments – Transfer tasks
Source location: United States (US)
Target locations
• France (FR): from English to French
• Spain (ES): from English to Spanish
• Australia (AU): from English to English, different hemisphere,
greater temporal difference in flu outbreaks
Why choose locations where syndromic surveillance systems exist?
• more robust evaluation at this preliminary stage
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 16/29
Experiments – Data
Search query frequencies from Google
• retrieved from the Google Correlate endpoint
• z-scored (by default)
• weekly rates
• September 2007 to August 2016 (both inclusive)
• # queries: 34,121 (US), 29,996 (FR), 15,673 (ES), 8,764 (AU)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 17/29
Experiments – Data
Search query frequencies from Google
• retrieved from the Google Correlate endpoint
• z-scored (by default)
• weekly rates
• September 2007 to August 2016 (both inclusive)
• # queries: 34,121 (US), 29,996 (FR), 15,673 (ES), 8,764 (AU)
Influenza-like illness (ILI) rates
• data from health organisations in these countries
(CDC, SN, SISSS, ASPREN)
• same date range, weekly ILI rates
• z-scored as the metric systems vary in these countries
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 17/29
Experiments – ILI rates in the source vs. target country
How similar are they?
US
2008 2009 2010 2011 2012 2013 2014 2015 2016
-1
0
1
2
3
4
5
ILIrates(z-scored)
US
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
Experiments – ILI rates in the source vs. target country
How similar are they?
US vs. FR
2008 2009 2010 2011 2012 2013 2014 2015 2016
-1
0
1
2
3
4
5
ILIrates(z-scored)
US
FR
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
Experiments – ILI rates in the source vs. target country
How similar are they?
US vs. ES
2008 2009 2010 2011 2012 2013 2014 2015 2016
-1
0
1
2
3
4
5
ILIrates(z-scored)
US
ES
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
Experiments – ILI rates in the source vs. target country
How similar are they?
US vs. AU
2008 2009 2010 2011 2012 2013 2014 2015 2016
-1
0
1
2
3
4
5
ILIrates(z-scored)
US
AU
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
Experiments – Evaluation
Protocol
• train a model using 5 flu seasons, test it on the next
• evaluate performance on the the last 4 flu seasons of our data set
• Θc: use a window of ξ = ±6 weeks
• source query → k = {1, ..., 5} target queries
• Pearson correlation, mean absolute error (MAE), root mean
squared error (RMSE)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 19/29
Experiments – Evaluation
Protocol
• train a model using 5 flu seasons, test it on the next
• evaluate performance on the the last 4 flu seasons of our data set
• Θc: use a window of ξ = ±6 weeks
• source query → k = {1, ..., 5} target queries
• Pearson correlation, mean absolute error (MAE), root mean
squared error (RMSE)
Baseline models
• worst case baseline (R): random shuffling of identified query pairs
• unsupervised learning (U) using most semantically relevant queries
• best case threshold (S): supervised learning using elastic net
• transfer component analysis (TCA)1
1
Pan et al. (2009)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 19/29
Experiments – General observations
In general:
• semantic similarity (Θs) is performing better than temporal similarity
(Θc) when used in isolation
• using semantic or temporal similarity in isolation provides inferior
performance, i.e. hybrid similarity works best
• values for k > 1 did not help the hybrid similarity to improve
• when k > 1, the non-uniform way of weighting was performing better
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 20/29
Experiments – General observations
In general:
• semantic similarity (Θs) is performing better than temporal similarity
(Θc) when used in isolation
• using semantic or temporal similarity in isolation provides inferior
performance, i.e. hybrid similarity works best
• values for k > 1 did not help the hybrid similarity to improve
• when k > 1, the non-uniform way of weighting was performing better
Closer look at results for γ = 0, γ = 1 and the best choice of γ
§
¦
¤
¥
Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1]
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 20/29
Experiments – Results for France
§
¦
¤
¥
Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1]
Avg. correlation
0
0.2
0.4
0.6
0.8
1
γ = 0 γ = 1 γ = .5
0.9590.956
0.835
Avg. MAE
0
13
26
39
52
65
γ = 0 γ = 1 γ = .5
34.05
46.79
61.53
Avg. RMSE
0
21
42
63
84
105
γ = 0 γ = 1 γ = .5
52.15
65.37
100.06
R: 0.911
U: 0.916
S: 0.984
R: 87.729
U: NA
S: 25.088
R: 101.845
U: NA
S: 42.349
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 21/29
Experiments – Results for France
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 22/29
Experiments – Results for Spain
§
¦
¤
¥
Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1]
Avg. correlation
0
0.2
0.4
0.6
0.8
1
γ = 0 γ = 1 γ = .2
0.9180.944
0.827
Avg. MAE
0
7
14
21
28
35
γ = 0 γ = 1 γ = .2
22.66
33.22
25.99
Avg. RMSE
0
9
18
27
36
45
γ = 0 γ = 1 γ = .2
32.30
38.57
41.68
R: 0.872
U: 0.925
S: 0.971
R: 40.311
U: NA
S: 22.120
R: 47.204
U: NA
S: 30.600
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 23/29
Experiments – Results for Spain
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 24/29
Experiments – Results for Australia
§
¦
¤
¥
Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1]
Avg. correlation
0
0.2
0.4
0.6
0.8
1
γ = 0 γ = 1 γ = .9
0.9210.915
0.7
Avg. MAE
0
9
18
27
36
45
γ = 0 γ = 1 γ = .9
22.04
30.28
42.35
Avg. RMSE
0
12
24
36
48
60
γ = 0 γ = 1 γ = .9
25.59
34.33
55.32
R: 0.875
U: 0.862
S: 0.916
R: 25.792
U: NA
S: 17.829
R: 30.080
U: NA
S: 21.782
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 25/29
Experiments – Results for Australia
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 26/29
Experiments – Results for different values of γ
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
Experiments – Results for different values of γ
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
Experiments – Results for different values of γ
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
Experiments – Results for different values of γ
• hybrid similarity optima differ
per target country
• optimal γ depends on the
characteristics of the input
space
• µ(Θc)/µ(Θs) across queries
relates to optimal γ: 1.143
(FR), 0.982 (ES), 2.261 (AU)
• identifying optimal γ
automatically is an open task
• γ = 0.5 provides better results
than non hybrid similarities
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
Experiments – Where do some of the errors come from?
Error analysis setup
• investigate the models for the optimal gammas
• compute the mean ILI estimate impact (%) during the 10 weeks
with highest MAE across all test periods per target country
• identify the worst-5 query pairings
— – — – — – —
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
Experiments – Where do some of the errors come from?
Error analysis setup
• investigate the models for the optimal gammas
• compute the mean ILI estimate impact (%) during the 10 weeks
with highest MAE across all test periods per target country
• identify the worst-5 query pairings
— – — – — – —
France – from English (US) to French
• 24 hour flu → grippe intestinale
• influenza a treatment → grippe traitement
• remedies for colds → rhume de cerveau
• child temperature → température du corps
• child fever → fièvre adulte
(13.24%)
(8.07%)
(6.75%)
(6.37%)
(6.04%)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
Experiments – Where do some of the errors come from?
Error analysis setup
• investigate the models for the optimal gammas
• compute the mean ILI estimate impact (%) during the 10 weeks
with highest MAE across all test periods per target country
• identify the worst-5 query pairings
— – — – — – —
Spain – from English (US) to Spanish
• mucinez for kids → tratmiento de la grippe
• child fever → sinusitis
• influenza a treatment → con gripe
• symptoms pneumonia → bronquitis
• child temperature → temperatura corporal
(20.76%)
(7.76%)
(7.02%)
(6.04%)
(5.62%)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
Experiments – Where do some of the errors come from?
Error analysis setup
• investigate the models for the optimal gammas
• compute the mean ILI estimate impact (%) during the 10 weeks
with highest MAE across all test periods per target country
• identify the worst-5 query pairings
— – — – — – —
Australia – from English (US) to English (AU)
• 24 hour flu → flu duration
• child temperature → warmer
• how to treat a fever → have a fever
• tamiflu and breastfeeding → flu while pregnant
• robitussin cf → colds
(11.51%)
(9.77%)
(6.94%)
(6.81%)
(5.18%)
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
Conclusions and future work
Summary of outcomes
• previous efforts were heavily based on supervised learning models
• transfer learning method to enable modelling in areas that lack an
established syndromic surveillance system
— unsupervised (no ground truth data at the target location)
— core operation: how to map source to target queries
• satisfactory performance (e.g. r > .92)
• 21.6% increase in RMSE compared to a fully supervised model
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 29/29
Conclusions and future work
Summary of outcomes
• previous efforts were heavily based on supervised learning models
• transfer learning method to enable modelling in areas that lack an
established syndromic surveillance system
— unsupervised (no ground truth data at the target location)
— core operation: how to map source to target queries
• satisfactory performance (e.g. r > .92)
• 21.6% increase in RMSE compared to a fully supervised model
Future work
• study where target location is a low or middle income country
— harder to evaluate; qualitative analysis by experts
• investigate parameters γ (similarity balance) and k (number of
target queries in a mapping) further and learn them from the data
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 29/29
Questions
?
Acknowledgements
• Funded by the EPSRC project “i-sense” (EP/K031953/1, EP/R00529X/1)
• SISSS and Amparo Larrauri (Spain) for providing syndromic surveillance data
• Simon Moura and Peter Hayes for offering constructive feedback
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.
References
Artetxe, M., Labaka, G., and Agirre, E. (2016). Learning Principled Bilingual Mappings of Word
Embeddings while Preserving Monolingual Invariance. In Proceedings of the 2016 Conference
on Empirical Methods in Natural Language Processing, pages 2289–2294.
Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T. (2017). Enriching Word Vectors with
Subword Information. Transactions of the Association of Computational Linguistics,
5(1):135–146.
Cook, S., Conrad, C., Fowlkes, A. L., and Mohebbi, M. H. (2011). Assessing Google Flu Trends
Performance in the United States during the 2009 Influenza Virus A (H1N1) Pandemic. PLOS
ONE, 6(8).
Dinu, G., Lazaridou, A., and Baroni, M. (2014). Improving Zero-shot Learning by Mitigating the
Hubness Problem. arXiv preprint arXiv:1412.6568.
Eysenbach, G. (2006). Infodemiology: tracking flu-related searches on the web for syndromic
surveillance. Proc. of AMIA Annual Symposium, pages 244–248.
Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., and Brilliant, L. (2009).
Detecting Influenza Epidemics using Search Engine Query Data. Nature, 457(7232):1012–1014.
Lampos, V., Miller, A. C., Crossan, S., and Stefansen, C. (2015a). Advances in Nowcasting
Influenza-like Illness Rates using Search Query Logs. Scientific Reports, 5(12760).
Lampos, V., Yom-Tov, E., Pebody, R., and Cox, I. J. (2015b). Assessing the Impact of a Health
Intervention via User-Generated Internet Content. Data Mining and Knowledge Discovery,
29(5):1434–1457.
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.
References
Lampos, V., Zou, B., and Cox, I. J. (2017). Enhancing Feature Selection Using Word Embeddings:
The Case of Flu Surveillance. In Proceedings of the 26th International Conference on World
Wide Web, pages 695–704.
Lazer, D., Kennedy, R., King, G., and Vespignani, A. (2014). The Parable of Google Flu: Traps in
Big Data Analysis. Science, 343(6176):1203–1205.
Olson, D. R., Konty, K. J., Paladini, M., Viboud, C., and Simonsen, L. (2013). Reassessing Google
Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative
Epidemiological Study at Three Geographic Scales. PLOS Computational Biology, 9(10).
Pan, S. J., Tsang, I. W., Kwok, J. T., and Yang, Q. (2009). Domain Adaptation via Transfer
Component Analysis. In Proceedings of the 21st International Joint Conference on Artificial
Intelligence, pages 1187–1192.
Polgreen, P. M., Chen, Y., Pennock, D. M., Nelson, F. D., and Weinstein, R. A. (2008). Using
Internet Searches for Influenza Surveillance. Clinical Infectious Diseases, 47(11):1443–1448.
Smith, S. L., Turban, D. H. P., Hamblin, S., and Hammerla, N. Y. (2016). Offline Bilingual Word
Vectors, Orthogonal Transformations and the Inverted Softmax. arXiv preprint
arXiv:1702.03859.
Wagner, M., Lampos, V., Cox, I. J., and Pebody, R. (2018). The added value of online
user-generated content in traditional methods for influenza surveillance. Scientific reports,
8(1):13963.
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.
References
Yang, S., Santillana, M., and Kou, S. C. (2015). Accurate Estimation of Influenza Epidemics using
Google Search Data via ARGO. Proceedings of the National Academy of Sciences,
112(47):14473–14478.
Zou, B., Lampos, V., and Cox, I. J. (2018). Multi-Task Learning Improves Disease Models from
Web Search. In Proceedings of the 2018 World Wide Web Conference, pages 87–96.
Zou, B., Lampos, V., Gorton, R., and Cox, I. J. (2016). On Infectious Intestinal Disease
Surveillance using Social Media Content. In Proceedings of the 6th International Conference on
Digital Health, pages 157–161.
Zou, H. and Hastie, T. (2005). Regularization and Variable Selection via the Elastic Net. Journal
of the Royal Statistical Society: Series B (Statistical Methodology), 67(2):301–320.
Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.

More Related Content

Similar to Transfer learning for unsupervised influenza-like illness models from online search data

Modeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural NetworksModeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural Networks
Josh Patterson
 
Text REtrieval Conference (TREC) Dynamic Domain Track 2015
Text REtrieval Conference (TREC) Dynamic Domain Track 2015Text REtrieval Conference (TREC) Dynamic Domain Track 2015
Text REtrieval Conference (TREC) Dynamic Domain Track 2015
Grace Hui Yang
 
(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...
(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...
(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...
Gemma Derrick
 
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Sergey Sosnovsky
 
Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...
Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...
Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...
Joel Saltz
 
[Digest] Eigenbehaviors- identifying structure in routine
[Digest] Eigenbehaviors- identifying structure in routine[Digest] Eigenbehaviors- identifying structure in routine
[Digest] Eigenbehaviors- identifying structure in routine
Hsing-chuan Hsieh
 
GAN Evaluation
GAN EvaluationGAN Evaluation
GAN Evaluation
Dongheon Lee
 
Spatially explicit individual based modeling
Spatially explicit individual based modelingSpatially explicit individual based modeling
Spatially explicit individual based modeling
Mauricio Toro-Bermudez, PhD
 
Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...
Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...
Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...
The Statistical and Applied Mathematical Sciences Institute
 
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome SequencingMicrobial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Jonathan Eisen
 
2013 10-30-sbc361-reproducible designsandsustainablesoftware
2013 10-30-sbc361-reproducible designsandsustainablesoftware2013 10-30-sbc361-reproducible designsandsustainablesoftware
2013 10-30-sbc361-reproducible designsandsustainablesoftware
Yannick Wurm
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
Frank van Harmelen
 
A spatio-temporal visual analysis tool for historical dictionaries.
A spatio-temporal visual analysis tool for historical dictionaries. A spatio-temporal visual analysis tool for historical dictionaries.
A spatio-temporal visual analysis tool for historical dictionaries.
Technological Ecosystems for Enhancing Multiculturality
 
Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016
Roy Clariana
 
Jillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-jaJillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-jaJillian Aurisano
 
Survival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive ModelsSurvival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive Models
Christos Argyropoulos
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)Michael Atkins
 
Answering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrievalAnswering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrieval
San Kim
 

Similar to Transfer learning for unsupervised influenza-like illness models from online search data (20)

Modeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural NetworksModeling Electronic Health Records with Recurrent Neural Networks
Modeling Electronic Health Records with Recurrent Neural Networks
 
Text REtrieval Conference (TREC) Dynamic Domain Track 2015
Text REtrieval Conference (TREC) Dynamic Domain Track 2015Text REtrieval Conference (TREC) Dynamic Domain Track 2015
Text REtrieval Conference (TREC) Dynamic Domain Track 2015
 
(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...
(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...
(I Can't Get No) Saturation: A Simulation and Guidelines for Minimum Sample S...
 
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
Using Semantics of Textbook Highlights to Predict Student Comprehension and K...
 
Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...
Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...
Spatio-­‐temporal Sensor Integration, Analysis, Classification or Can Exascal...
 
[Digest] Eigenbehaviors- identifying structure in routine
[Digest] Eigenbehaviors- identifying structure in routine[Digest] Eigenbehaviors- identifying structure in routine
[Digest] Eigenbehaviors- identifying structure in routine
 
GAN Evaluation
GAN EvaluationGAN Evaluation
GAN Evaluation
 
Spatially explicit individual based modeling
Spatially explicit individual based modelingSpatially explicit individual based modeling
Spatially explicit individual based modeling
 
Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...
Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...
Climate Extremes Workshop - Networks and Extremes: Review and Further Studies...
 
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome SequencingMicrobial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
 
SAX-VSM
SAX-VSMSAX-VSM
SAX-VSM
 
2013 10-30-sbc361-reproducible designsandsustainablesoftware
2013 10-30-sbc361-reproducible designsandsustainablesoftware2013 10-30-sbc361-reproducible designsandsustainablesoftware
2013 10-30-sbc361-reproducible designsandsustainablesoftware
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Internship
InternshipInternship
Internship
 
A spatio-temporal visual analysis tool for historical dictionaries.
A spatio-temporal visual analysis tool for historical dictionaries. A spatio-temporal visual analysis tool for historical dictionaries.
A spatio-temporal visual analysis tool for historical dictionaries.
 
Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016Teaching & Learning with Technology TLT 2016
Teaching & Learning with Technology TLT 2016
 
Jillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-jaJillian ms defense-4-14-14-ja
Jillian ms defense-4-14-14-ja
 
Survival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive ModelsSurvival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive Models
 
2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)2015 GU-ICBI Poster (third printing)
2015 GU-ICBI Poster (third printing)
 
Answering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrievalAnswering complex open domain questions with multi-hop dense retrieval
Answering complex open domain questions with multi-hop dense retrieval
 

More from Vasileios Lampos

Quicksort
QuicksortQuicksort
Quicksort
Vasileios Lampos
 
Topic models, vector semantics and applications
Topic models, vector semantics and applicationsTopic models, vector semantics and applications
Topic models, vector semantics and applications
Vasileios Lampos
 
Mining online data for public health surveillance
Mining online data for public health surveillanceMining online data for public health surveillance
Mining online data for public health surveillance
Vasileios Lampos
 
Mining socio-political and socio-economic signals from social media content
Mining socio-political and socio-economic signals from social media contentMining socio-political and socio-economic signals from social media content
Mining socio-political and socio-economic signals from social media content
Vasileios Lampos
 
Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...
Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...
Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...
Vasileios Lampos
 
An introduction to digital health surveillance from online user-generated con...
An introduction to digital health surveillance from online user-generated con...An introduction to digital health surveillance from online user-generated con...
An introduction to digital health surveillance from online user-generated con...
Vasileios Lampos
 
User-generated content: collective and personalised inference tasks
User-generated content: collective and personalised inference tasksUser-generated content: collective and personalised inference tasks
User-generated content: collective and personalised inference tasks
Vasileios Lampos
 
Bilinear text regression and applications
Bilinear text regression and applicationsBilinear text regression and applications
Bilinear text regression and applications
Vasileios Lampos
 
Extracting interesting concepts from large-scale textual data
Extracting interesting concepts from large-scale textual dataExtracting interesting concepts from large-scale textual data
Extracting interesting concepts from large-scale textual data
Vasileios Lampos
 
Assessing the impact of a health intervention via user-generated Internet con...
Assessing the impact of a health intervention via user-generated Internet con...Assessing the impact of a health intervention via user-generated Internet con...
Assessing the impact of a health intervention via user-generated Internet con...
Vasileios Lampos
 

More from Vasileios Lampos (10)

Quicksort
QuicksortQuicksort
Quicksort
 
Topic models, vector semantics and applications
Topic models, vector semantics and applicationsTopic models, vector semantics and applications
Topic models, vector semantics and applications
 
Mining online data for public health surveillance
Mining online data for public health surveillanceMining online data for public health surveillance
Mining online data for public health surveillance
 
Mining socio-political and socio-economic signals from social media content
Mining socio-political and socio-economic signals from social media contentMining socio-political and socio-economic signals from social media content
Mining socio-political and socio-economic signals from social media content
 
Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...
Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...
Inferring the Socioeconomic Status of Social Media Users based on Behaviour a...
 
An introduction to digital health surveillance from online user-generated con...
An introduction to digital health surveillance from online user-generated con...An introduction to digital health surveillance from online user-generated con...
An introduction to digital health surveillance from online user-generated con...
 
User-generated content: collective and personalised inference tasks
User-generated content: collective and personalised inference tasksUser-generated content: collective and personalised inference tasks
User-generated content: collective and personalised inference tasks
 
Bilinear text regression and applications
Bilinear text regression and applicationsBilinear text regression and applications
Bilinear text regression and applications
 
Extracting interesting concepts from large-scale textual data
Extracting interesting concepts from large-scale textual dataExtracting interesting concepts from large-scale textual data
Extracting interesting concepts from large-scale textual data
 
Assessing the impact of a health intervention via user-generated Internet con...
Assessing the impact of a health intervention via user-generated Internet con...Assessing the impact of a health intervention via user-generated Internet con...
Assessing the impact of a health intervention via user-generated Internet con...
 

Recently uploaded

filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
IvanMallco1
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
muralinath2
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
muralinath2
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
Richard Gill
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
YOGESH DOGRA
 
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
ssuserbfdca9
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
silvermistyshot
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
AlguinaldoKong
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
RenuJangid3
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
AlaminAfendy1
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
ossaicprecious19
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
moosaasad1975
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
DiyaBiswas10
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
yusufzako14
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
aishnasrivastava
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
kumarmathi863
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SELF-EXPLANATORY
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
sachin783648
 

Recently uploaded (20)

filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 
Mammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also FunctionsMammalian Pineal Body Structure and Also Functions
Mammalian Pineal Body Structure and Also Functions
 
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
4. An Overview of Sugarcane White Leaf Disease in Vietnam.pdf
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 

Transfer learning for unsupervised influenza-like illness models from online search data

  • 1. Transfer learning for unsupervised influenza-like illness models from online search data Bin Zou Vasileios Lampos ( lampos.net ) Ingemar J. Cox Department of Computer Science University College London
  • 2. From online searches to influenza-like illness rates 50 60 70 80 90 100 of queries queries op-scoring queries to include in the 2004 2005 2006 Year 2007 2008 2 4 6 8 10 ILIpercentage 0 12 Figure 2 | A comparison of model estimates for the mid-Atlantic region (black) against CDC-reported ILI percentages (red), including points over which the model was fit and validated. A correlation of 0.85 was obtained over 128 points from this region to which the model was fit, whereas a correlation of 0.96 was obtained over 42 validation points. Dotted lines indicate 95% prediction intervals. The region comprises New York, New LETTERS Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 1/29
  • 3. From online searches to influenza-like illness rates Online data for health (2/3) Google Flu Trends (discontinued)Google Flu Trends (discontinued) popularising an established idea Ginsberg et al. (2009) Eysenbach (2006); Polgreen et al. (2008) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 1/29
  • 4. From online searches to influenza-like illness rates Task abstraction • input – frequency of search queries over time: X∈Rn×s • output – corresponding influenza-like illness (ILI) rate: y∈Rn • regression task, i.e. learn f : X → y Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 2/29
  • 5. From online searches to influenza-like illness rates Task abstraction • input – frequency of search queries over time: X∈Rn×s • output – corresponding influenza-like illness (ILI) rate: y∈Rn • regression task, i.e. learn f : X → y Modelling • originally proposed models were evidently not good solutions1 • new families of methods seem to work OK in various geographies2 1 Cook et al. (2011); Olson et al. (2013); Lazer et al. (2014) 2 Lampos et al. (2015a); Yang et al. (2015); Lampos et al. (2017); Wagner et al. (2018) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 2/29
  • 6. Why estimate ILI rates from online search statistics? Common arguments for: • complements traditional syndromic surveillance ✓ timeliness ✓ broader demographic coverage, larger cohort ✓ broader geographical coverage ✓ not affected by closure days or national holidays ✓ lower cost • applicable to locations that lack an established health system Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 3/29
  • 7. Why estimate ILI rates from online search statistics? Common arguments for: • complements traditional syndromic surveillance ✓ timeliness ✓ broader demographic coverage, larger cohort ✓ broader geographical coverage ✓ not affected by closure days or national holidays ✓ lower cost • applicable to locations that lack an established health system ✓ oxymoron (supervised learning) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 3/29
  • 8. Why estimate ILI rates from online search statistics? Common arguments for: • complements traditional syndromic surveillance ✓ timeliness ✓ broader demographic coverage, larger cohort ✓ broader geographical coverage ✓ not affected by closure days or national holidays ✓ lower cost • applicable to locations that lack an established health system ✓ oxymoron (supervised learning) ✓ motivated this paper Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 3/29
  • 9. Our contribution in a nutshell Main task • train a model for a source location where historical syndromic surveillance data is available, and • transfer it to a target location where syndromic surveillance data is not available or, in our experiments, ignored Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 4/29
  • 10. Our contribution in a nutshell Main task • train a model for a source location where historical syndromic surveillance data is available, and • transfer it to a target location where syndromic surveillance data is not available or, in our experiments, ignored Transfer learning steps 1. Learn a linear regularised regression model for a source location 2. Map search queries from the source to the target domain (languages may differ) 3. Transfer the source weights to the target domain (might involve weight re-adjustment) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 4/29
  • 11. Transfer learning task definition query frequency xij = #query j issued during ∆ti #all queries issued during ∆ti for a location Source domain • DS = { (xi, yi) } , i∈{1, ..., n} • xi ∈Rs = {xij}, j ∈{1, ..., s}: frequency of source queries • yi ∈R: ILI rate for time interval i Target domain • DT = {x′ i}, i∈{1, ..., m} • x′ i ∈Rt : frequency of target queries • note that t need not equal s Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 5/29
  • 12. Transfer learning task definition query frequency xij = #query j issued during ∆ti #all queries issued during ∆ti for a location Source domain • DS = { (xi, yi) } , i∈{1, ..., n} • xi ∈Rs = {xij}, j ∈{1, ..., s}: frequency of source queries • yi ∈R: ILI rate for time interval i Target domain • DT = {x′ i}, i∈{1, ..., m} • x′ i ∈Rt : frequency of target queries • note that t need not equal s § ¦ ¤ ¥ Aim: Given DS and DT, estimate y′ i Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 5/29
  • 13. Step 1 – Learn a regression function in the source domain Source domain • xi ∈Rs = {xij}, j ∈{1, ..., s}: frequency of source queries • yi ∈R: ILI rate for time interval i Elastic net1 (constrained) argmin w,β n∑ i=1 ( yi − β − ( s∑ j=1 xijwj ))2 + λ1 s∑ j=1 |wj| + λ2 s∑ j=1 w2 j subject to w ≥ 0 1 Zou and Hastie (2005) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 6/29
  • 14. Step 1 – Learn a regression function in the source domain Elastic net (constrained) argmin w,β n∑ i=1 ( yi − β − ( s∑ j=1 xijwj ))2 + λ1 s∑ j=1 |wj| + λ2 s∑ j=1 w2 j subject to w ≥ 0 Why use elastic net? • more straightforward to transfer • few training instances • previous successful application1 • combines ℓ1- and ℓ2-norm regularisation: sparse solution, model consistency under collinearity 1 Lampos et al. (2015a,b); Zou et al. (2016); Lampos et al. (2017) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 7/29
  • 15. Step 1 – Learn a regression function in the source domain Elastic net (constrained) argmin w,β n∑ i=1 ( yi − β − ( s∑ j=1 xijwj ))2 + λ1 s∑ j=1 |wj| + λ2 s∑ j=1 w2 j subject to w ≥ 0 Why apply a non-negative weight constraint? • (how?) coordinate descent restricting negative updates to 0 • worse performing model for the source location • but enables a more comprehensive transfer • better performance at the target location Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 7/29
  • 16. Step 1 – Learn a regression function in the source domain Selecting queries prior to applying elastic net • hybrid feature selection similarly to previous work1 • derive query embeddings eq using fastText2 • define a flu context/topic: T = {‘flu’, ‘fever’} • compute each query’s similarity to T using g (q, T ) = cos (eq, eT1 ) × cos (eq, eT2 ) cos(·, ·) is mapped to [0, 1] 1 Zou et al. (2016); Lampos et al. (2017); Zou et al. (2018) 2 Bojanowski et al. (2017) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 8/29
  • 17. Step 1 – Learn a regression function in the source domain Selecting queries prior to applying elastic net • hybrid feature selection similarly to previous work1 • derive query embeddings eq using fastText2 • define a flu context/topic: T = {‘flu’, ‘fever’} • compute each query’s similarity to T using g (q, T ) = cos (eq, eT1 ) × cos (eq, eT2 ) cos(·, ·) is mapped to [0, 1] • filter out queries with either g ≤ 0.5 or r ≤ 0.3 (corr. with ILI) § ¦ ¤ ¥ QS: remaining queries after applying elastic net 1 Zou et al. (2016); Lampos et al. (2017); Zou et al. (2018) 2 Bojanowski et al. (2017) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 8/29
  • 18. Step 2 – Mapping source to target queries Task: map QS to a subset of PT (pool of target queries) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 9/29
  • 19. Step 2 – Mapping source to target queries Task: map QS to a subset of PT (pool of target queries) How? • direct translation does not work — invalid search queries — worse performance Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 9/29
  • 20. Step 2 – Mapping source to target queries Task: map QS to a subset of PT (pool of target queries) How? • direct translation does not work — invalid search queries — worse performance • semantic similarity, Θs: (cross-lingual) word embeddings • temporal similarity, Θc: correlation between frequency time series • hybrid similarity: Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1] • consider 1-to-k mappings Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 9/29
  • 21. Step 2 – Semantic similarity (Θs) Same language in both domains? • Use cosine similarity on query embeddings Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 10/29
  • 22. Step 2 – Semantic similarity (Θs) Same language in both domains? • Use cosine similarity on query embeddings If not, derive bi-lingual embeddings1 • m core translation pairs, σ→τ, with embeddings Eσ, Eτ ∈Rm×d • learn a transformation matrix, W∈Rd×d , by minimising: argmin W ∥EσW − Eτ ∥ 2 2 , subject to W⊤ W = I 1 Smith et al. (2016) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 10/29
  • 23. Step 2 – Semantic similarity (Θs) Same language in both domains? • Use cosine similarity on query embeddings If not, derive bi-lingual embeddings1 • m core translation pairs, σ→τ, with embeddings Eσ, Eτ ∈Rm×d • learn a transformation matrix, W∈Rd×d , by minimising: argmin W ∥EσW − Eτ ∥ 2 2 , subject to W⊤ W = I • orthogonality constraint: — Eτ ≈ EσW and Eσ ≈ Eτ W⊤ — improves the performance of machine translation2 • solution: W = VU⊤ , where E⊤ τ Eσ = UΣV⊤ (SVD) 1 Smith et al. (2016) 2 Artetxe et al. (2016) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 10/29
  • 24. Step 2 – Semantic similarity (Θs) Compute a query (source) to query (target) similarity matrix • source, target query embedding: eqi , eqj ∈R1×d • cosine similarity matrix Ω∈Rs×|PT| , ωij = ( eqi W e⊤ qj ) ( ∥eqi W∥2∥eqj ∥2 ) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 11/29
  • 25. Step 2 – Semantic similarity (Θs) Compute a query (source) to query (target) similarity matrix • source, target query embedding: eqi , eqj ∈R1×d • cosine similarity matrix Ω∈Rs×|PT| , ωij = ( eqi W e⊤ qj ) ( ∥eqi W∥2∥eqj ∥2 ) Inverted softmax • using ωij directly for translations can generate hubs — target query is similar to way too many different source queries — reduces performance of machine translation1 • instead, given a source query qi, find a target qj that maximises Pj→i = exp (η ωij) αj s∑ z=1 exp (η ωiz) 1 Dinu et al. (2014); Smith et al. (2016) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 11/29
  • 26. Step 2 – Semantic similarity (Θs) Pj→i = exp (η ωij) αj s∑ z=1 exp (η ωiz) • αj: ensures Pj→i is a probability • s: number of source queries • η: learned by maximising the log probability over the alignment dictionary (σ→τ): argmax η ∑ pairs ij ln (Pj→i) Inverted softmax • probability that a target query translates back to the source query • hub target query =⇒ large denominator • top-k target queries are selected as possible mappings of qi Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 12/29
  • 27. Step 2 – Semantic similarity (Θs) Inverted softmax • probability that a target query translates back to the source query • hub target query =⇒ large denominator • top-k target queries are selected as possible mappings of qi Determine the semantic similarity score by • using these top-k queries (average if k > 1) • and computing Θs(qi, qj) = ( eqi W e⊤ qj ) / ( ∥eqi W∥2∥eqj ∥2 ) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 13/29
  • 28. Step 2 – Temporal similarity (Θc) Exploit query relationship in the frequency space: • important relationship; based on the core statistical input information Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 14/29
  • 29. Step 2 – Temporal similarity (Θc) Exploit query relationship in the frequency space: • important relationship; based on the core statistical input information • compute pair-wise correlation between the frequency time series of source and target queries • flu seasons may be offset in different locations Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 14/29
  • 30. Step 2 – Temporal similarity (Θc) Exploit query relationship in the frequency space: • important relationship; based on the core statistical input information • compute pair-wise correlation between the frequency time series of source and target queries • flu seasons may be offset in different locations ✓ compute all correlations using a shifting window of ±ξ weeks ✓ optimal window lij (source query qi, target query qj) is independently computed for each target query Θc(qi, qj) = ρ ( xi(t), xj(t + lij) ) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 14/29
  • 31. Step 3 – Determining weights for target queries Previous steps • source query qi allocated weight wi • source query qi mapped to a set Ti of k ≥ 1 target queries Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 15/29
  • 32. Step 3 – Determining weights for target queries Previous steps • source query qi allocated weight wi • source query qi mapped to a set Ti of k ≥ 1 target queries Weight transfer • if k = 1, directly assign wi to the single target query • if k > 1, wi is distributed across the k identified target queries Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 15/29
  • 33. Step 3 – Determining weights for target queries Previous steps • source query qi allocated weight wi • source query qi mapped to a set Ti of k ≥ 1 target queries Weight transfer • if k = 1, directly assign wi to the single target query • if k > 1, wi is distributed across the k identified target queries Weighting schemes • uniform: w′ j = wi/k • based on Θij, j ∈ {2, . . . , k}: w′ j = wiΘij ∑ qj ∈Ti Θij Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 15/29
  • 34. Experiments – Transfer tasks Source location: United States (US) Target locations • France (FR): from English to French • Spain (ES): from English to Spanish • Australia (AU): from English to English, different hemisphere, greater temporal difference in flu outbreaks Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 16/29
  • 35. Experiments – Transfer tasks Source location: United States (US) Target locations • France (FR): from English to French • Spain (ES): from English to Spanish • Australia (AU): from English to English, different hemisphere, greater temporal difference in flu outbreaks Why choose locations where syndromic surveillance systems exist? • more robust evaluation at this preliminary stage Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 16/29
  • 36. Experiments – Data Search query frequencies from Google • retrieved from the Google Correlate endpoint • z-scored (by default) • weekly rates • September 2007 to August 2016 (both inclusive) • # queries: 34,121 (US), 29,996 (FR), 15,673 (ES), 8,764 (AU) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 17/29
  • 37. Experiments – Data Search query frequencies from Google • retrieved from the Google Correlate endpoint • z-scored (by default) • weekly rates • September 2007 to August 2016 (both inclusive) • # queries: 34,121 (US), 29,996 (FR), 15,673 (ES), 8,764 (AU) Influenza-like illness (ILI) rates • data from health organisations in these countries (CDC, SN, SISSS, ASPREN) • same date range, weekly ILI rates • z-scored as the metric systems vary in these countries Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 17/29
  • 38. Experiments – ILI rates in the source vs. target country How similar are they? US 2008 2009 2010 2011 2012 2013 2014 2015 2016 -1 0 1 2 3 4 5 ILIrates(z-scored) US Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
  • 39. Experiments – ILI rates in the source vs. target country How similar are they? US vs. FR 2008 2009 2010 2011 2012 2013 2014 2015 2016 -1 0 1 2 3 4 5 ILIrates(z-scored) US FR Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
  • 40. Experiments – ILI rates in the source vs. target country How similar are they? US vs. ES 2008 2009 2010 2011 2012 2013 2014 2015 2016 -1 0 1 2 3 4 5 ILIrates(z-scored) US ES Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
  • 41. Experiments – ILI rates in the source vs. target country How similar are they? US vs. AU 2008 2009 2010 2011 2012 2013 2014 2015 2016 -1 0 1 2 3 4 5 ILIrates(z-scored) US AU Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 18/29
  • 42. Experiments – Evaluation Protocol • train a model using 5 flu seasons, test it on the next • evaluate performance on the the last 4 flu seasons of our data set • Θc: use a window of ξ = ±6 weeks • source query → k = {1, ..., 5} target queries • Pearson correlation, mean absolute error (MAE), root mean squared error (RMSE) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 19/29
  • 43. Experiments – Evaluation Protocol • train a model using 5 flu seasons, test it on the next • evaluate performance on the the last 4 flu seasons of our data set • Θc: use a window of ξ = ±6 weeks • source query → k = {1, ..., 5} target queries • Pearson correlation, mean absolute error (MAE), root mean squared error (RMSE) Baseline models • worst case baseline (R): random shuffling of identified query pairs • unsupervised learning (U) using most semantically relevant queries • best case threshold (S): supervised learning using elastic net • transfer component analysis (TCA)1 1 Pan et al. (2009) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 19/29
  • 44. Experiments – General observations In general: • semantic similarity (Θs) is performing better than temporal similarity (Θc) when used in isolation • using semantic or temporal similarity in isolation provides inferior performance, i.e. hybrid similarity works best • values for k > 1 did not help the hybrid similarity to improve • when k > 1, the non-uniform way of weighting was performing better Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 20/29
  • 45. Experiments – General observations In general: • semantic similarity (Θs) is performing better than temporal similarity (Θc) when used in isolation • using semantic or temporal similarity in isolation provides inferior performance, i.e. hybrid similarity works best • values for k > 1 did not help the hybrid similarity to improve • when k > 1, the non-uniform way of weighting was performing better Closer look at results for γ = 0, γ = 1 and the best choice of γ § ¦ ¤ ¥ Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1] Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 20/29
  • 46. Experiments – Results for France § ¦ ¤ ¥ Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1] Avg. correlation 0 0.2 0.4 0.6 0.8 1 γ = 0 γ = 1 γ = .5 0.9590.956 0.835 Avg. MAE 0 13 26 39 52 65 γ = 0 γ = 1 γ = .5 34.05 46.79 61.53 Avg. RMSE 0 21 42 63 84 105 γ = 0 γ = 1 γ = .5 52.15 65.37 100.06 R: 0.911 U: 0.916 S: 0.984 R: 87.729 U: NA S: 25.088 R: 101.845 U: NA S: 42.349 Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 21/29
  • 47. Experiments – Results for France Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 22/29
  • 48. Experiments – Results for Spain § ¦ ¤ ¥ Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1] Avg. correlation 0 0.2 0.4 0.6 0.8 1 γ = 0 γ = 1 γ = .2 0.9180.944 0.827 Avg. MAE 0 7 14 21 28 35 γ = 0 γ = 1 γ = .2 22.66 33.22 25.99 Avg. RMSE 0 9 18 27 36 45 γ = 0 γ = 1 γ = .2 32.30 38.57 41.68 R: 0.872 U: 0.925 S: 0.971 R: 40.311 U: NA S: 22.120 R: 47.204 U: NA S: 30.600 Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 23/29
  • 49. Experiments – Results for Spain Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 24/29
  • 50. Experiments – Results for Australia § ¦ ¤ ¥ Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1] Avg. correlation 0 0.2 0.4 0.6 0.8 1 γ = 0 γ = 1 γ = .9 0.9210.915 0.7 Avg. MAE 0 9 18 27 36 45 γ = 0 γ = 1 γ = .9 22.04 30.28 42.35 Avg. RMSE 0 12 24 36 48 60 γ = 0 γ = 1 γ = .9 25.59 34.33 55.32 R: 0.875 U: 0.862 S: 0.916 R: 25.792 U: NA S: 17.829 R: 30.080 U: NA S: 21.782 Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 25/29
  • 51. Experiments – Results for Australia Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 26/29
  • 52. Experiments – Results for different values of γ Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
  • 53. Experiments – Results for different values of γ Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
  • 54. Experiments – Results for different values of γ Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
  • 55. Experiments – Results for different values of γ • hybrid similarity optima differ per target country • optimal γ depends on the characteristics of the input space • µ(Θc)/µ(Θs) across queries relates to optimal γ: 1.143 (FR), 0.982 (ES), 2.261 (AU) • identifying optimal γ automatically is an open task • γ = 0.5 provides better results than non hybrid similarities Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 27/29
  • 56. Experiments – Where do some of the errors come from? Error analysis setup • investigate the models for the optimal gammas • compute the mean ILI estimate impact (%) during the 10 weeks with highest MAE across all test periods per target country • identify the worst-5 query pairings — – — – — – — Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
  • 57. Experiments – Where do some of the errors come from? Error analysis setup • investigate the models for the optimal gammas • compute the mean ILI estimate impact (%) during the 10 weeks with highest MAE across all test periods per target country • identify the worst-5 query pairings — – — – — – — France – from English (US) to French • 24 hour flu → grippe intestinale • influenza a treatment → grippe traitement • remedies for colds → rhume de cerveau • child temperature → température du corps • child fever → fièvre adulte (13.24%) (8.07%) (6.75%) (6.37%) (6.04%) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
  • 58. Experiments – Where do some of the errors come from? Error analysis setup • investigate the models for the optimal gammas • compute the mean ILI estimate impact (%) during the 10 weeks with highest MAE across all test periods per target country • identify the worst-5 query pairings — – — – — – — Spain – from English (US) to Spanish • mucinez for kids → tratmiento de la grippe • child fever → sinusitis • influenza a treatment → con gripe • symptoms pneumonia → bronquitis • child temperature → temperatura corporal (20.76%) (7.76%) (7.02%) (6.04%) (5.62%) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
  • 59. Experiments – Where do some of the errors come from? Error analysis setup • investigate the models for the optimal gammas • compute the mean ILI estimate impact (%) during the 10 weeks with highest MAE across all test periods per target country • identify the worst-5 query pairings — – — – — – — Australia – from English (US) to English (AU) • 24 hour flu → flu duration • child temperature → warmer • how to treat a fever → have a fever • tamiflu and breastfeeding → flu while pregnant • robitussin cf → colds (11.51%) (9.77%) (6.94%) (6.81%) (5.18%) Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 28/29
  • 60. Conclusions and future work Summary of outcomes • previous efforts were heavily based on supervised learning models • transfer learning method to enable modelling in areas that lack an established syndromic surveillance system — unsupervised (no ground truth data at the target location) — core operation: how to map source to target queries • satisfactory performance (e.g. r > .92) • 21.6% increase in RMSE compared to a fully supervised model Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 29/29
  • 61. Conclusions and future work Summary of outcomes • previous efforts were heavily based on supervised learning models • transfer learning method to enable modelling in areas that lack an established syndromic surveillance system — unsupervised (no ground truth data at the target location) — core operation: how to map source to target queries • satisfactory performance (e.g. r > .92) • 21.6% increase in RMSE compared to a fully supervised model Future work • study where target location is a low or middle income country — harder to evaluate; qualitative analysis by experts • investigate parameters γ (similarity balance) and k (number of target queries in a mapping) further and learn them from the data Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19. 29/29
  • 62. Questions ? Acknowledgements • Funded by the EPSRC project “i-sense” (EP/K031953/1, EP/R00529X/1) • SISSS and Amparo Larrauri (Spain) for providing syndromic surveillance data • Simon Moura and Peter Hayes for offering constructive feedback Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.
  • 63. References Artetxe, M., Labaka, G., and Agirre, E. (2016). Learning Principled Bilingual Mappings of Word Embeddings while Preserving Monolingual Invariance. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2289–2294. Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association of Computational Linguistics, 5(1):135–146. Cook, S., Conrad, C., Fowlkes, A. L., and Mohebbi, M. H. (2011). Assessing Google Flu Trends Performance in the United States during the 2009 Influenza Virus A (H1N1) Pandemic. PLOS ONE, 6(8). Dinu, G., Lazaridou, A., and Baroni, M. (2014). Improving Zero-shot Learning by Mitigating the Hubness Problem. arXiv preprint arXiv:1412.6568. Eysenbach, G. (2006). Infodemiology: tracking flu-related searches on the web for syndromic surveillance. Proc. of AMIA Annual Symposium, pages 244–248. Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., and Brilliant, L. (2009). Detecting Influenza Epidemics using Search Engine Query Data. Nature, 457(7232):1012–1014. Lampos, V., Miller, A. C., Crossan, S., and Stefansen, C. (2015a). Advances in Nowcasting Influenza-like Illness Rates using Search Query Logs. Scientific Reports, 5(12760). Lampos, V., Yom-Tov, E., Pebody, R., and Cox, I. J. (2015b). Assessing the Impact of a Health Intervention via User-Generated Internet Content. Data Mining and Knowledge Discovery, 29(5):1434–1457. Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.
  • 64. References Lampos, V., Zou, B., and Cox, I. J. (2017). Enhancing Feature Selection Using Word Embeddings: The Case of Flu Surveillance. In Proceedings of the 26th International Conference on World Wide Web, pages 695–704. Lazer, D., Kennedy, R., King, G., and Vespignani, A. (2014). The Parable of Google Flu: Traps in Big Data Analysis. Science, 343(6176):1203–1205. Olson, D. R., Konty, K. J., Paladini, M., Viboud, C., and Simonsen, L. (2013). Reassessing Google Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative Epidemiological Study at Three Geographic Scales. PLOS Computational Biology, 9(10). Pan, S. J., Tsang, I. W., Kwok, J. T., and Yang, Q. (2009). Domain Adaptation via Transfer Component Analysis. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, pages 1187–1192. Polgreen, P. M., Chen, Y., Pennock, D. M., Nelson, F. D., and Weinstein, R. A. (2008). Using Internet Searches for Influenza Surveillance. Clinical Infectious Diseases, 47(11):1443–1448. Smith, S. L., Turban, D. H. P., Hamblin, S., and Hammerla, N. Y. (2016). Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax. arXiv preprint arXiv:1702.03859. Wagner, M., Lampos, V., Cox, I. J., and Pebody, R. (2018). The added value of online user-generated content in traditional methods for influenza surveillance. Scientific reports, 8(1):13963. Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.
  • 65. References Yang, S., Santillana, M., and Kou, S. C. (2015). Accurate Estimation of Influenza Epidemics using Google Search Data via ARGO. Proceedings of the National Academy of Sciences, 112(47):14473–14478. Zou, B., Lampos, V., and Cox, I. J. (2018). Multi-Task Learning Improves Disease Models from Web Search. In Proceedings of the 2018 World Wide Web Conference, pages 87–96. Zou, B., Lampos, V., Gorton, R., and Cox, I. J. (2016). On Infectious Intestinal Disease Surveillance using Social Media Content. In Proceedings of the 6th International Conference on Digital Health, pages 157–161. Zou, H. and Hastie, T. (2005). Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2):301–320. Zou, Lampos, Cox. Transfer learning for unsupervised flu models from online search. WWW ’19.