Hetman immem xi final March 2016

IMMEM XI
Navigating Microbial Genomes:
Insights from the Next Generation
9 – 12 March 2016, Estoril, Portugal

Campylobacteriosis: 29.3 (*447)
Salmonellosis: 19.67 (*269)
Giardiasis: 11.12 (*24)
Shigellosis: 3.08 (*4)
Verotoxigenic E. coli (VTEC): 1.94 (*39)
Cryptosporidiosis: 1.56 (*7)
Listeriosis: 0.36 (*0.55)
Cyclosporiasis: 0.32 (*7.5)
*Post-correction estimate
Thomas et al (2013). doi:10.1089/fpd.2012.1389
FoodNet Canada Short Report 2013
2
Campylobacter is a public health challenge
• #1 bacterial gastrointestinal disease in Canada and a leading foodborne pathogen worldwide (300-500 million cases)
• Self-limiting illness, highly under-reported, largely sporadic

3
The epidemiology of campylobacteriosis is daunting
Source: Julie Arsenault (PhD Thesis)
papyrus.bib.umontreal.ca/jspui/handle/1866/4625
• Widespread in “farm-to-fork” and “source-to-tap”
  • high prevalence in most major livestock species
  • found in many wild animal species, insects, surface waters
• Difficult to establish sources of exposure and routes of transmission
• Crisis = Opportunity → WGS to the rescue!!!

4
[Figure: the same isolates ordered two ways, under the panel labels "Genetic Similarity" and "Epidemio" (second label truncated in the source); isolate labels omitted]
• Can rapidly generate different clusters of isolates at an almost unlimited number of thresholds
• E.g.:
Do groups formed by genomic relationships agree with those formed by epidemiologic relationships?
- OR -
What is the optimal threshold for forming clusters that agree with epidemiologic relationships?
Genomic data…so many options for thresholding
• WGS-based analyses still require knowledge of the epidemiology to guide clustering of genomic data into “epidemiologically relevant clusters”

5
Those who make many species
are the 'splitters' and those who
make few are the 'lumpers'…
– CD (1857)
Clustering thresholds have been with us forever…
• Need to calibrate our analyses to ensure our results exploit the high resolution of WGS data while remaining epidemiologically relevant

6
Building a model for quantifying epidemiological similarity
“Essentially, all models are wrong,
but some are useful.”
George E.P. Box
(1919-2013)

7
How to relate epidemiologic and genomic clustering?
1. Adjusted Wallace Coefficient (AWC) → Carriço et al. (Comparing Partitions)
The directional likelihood that two isolates clustered together using one method will also be grouped together by the second method
[Figure: Strain 1 and Strain 2 shown in WGS clusters and in Epi clusters, linked by the AWC]
2. Intra-cluster cohesion (ICC):
A measure of the genomic and/or epidemiologic homogeneity of the isolates within a cluster
[Figure: example clusters with high ICC and low ICC]
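The directional agreement described for the AWC can be sketched in Python as below. This is a minimal illustration, not the presenters' implementation: it assumes the usual chance-corrected form of the Wallace coefficient, in which the expected agreement under independence is the fraction of all isolate pairs that share a cluster in the second partition. The function name `adjusted_wallace` and the toy labels are hypothetical.

```python
from collections import Counter


def adjusted_wallace(labels_a, labels_b):
    """Adjusted Wallace coefficient for the direction A -> B (a sketch).

    labels_a, labels_b: cluster labels for the same isolates, in the same order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same isolates")
    n = len(labels_a)

    def pairs(k):
        # Number of unordered pairs that can be drawn from k items.
        return k * (k - 1) // 2

    same_a = sum(pairs(c) for c in Counter(labels_a).values())
    same_b = sum(pairs(c) for c in Counter(labels_b).values())
    same_both = sum(pairs(c) for c in Counter(zip(labels_a, labels_b)).values())
    if same_a == 0:
        raise ValueError("partition A has no pair of isolates in the same cluster")

    # Wallace A -> B: of the pairs together in A, the fraction also together in B.
    wallace = same_both / same_a
    # Agreement expected by chance if B were independent of A (assumed form).
    expected = same_b / pairs(n)
    if expected == 1:
        raise ValueError("partition B is a single cluster; adjustment is undefined")
    return (wallace - expected) / (1 - expected)


# Hypothetical toy labels for four isolates.
wgs = [1, 1, 2, 2]
epi = [1, 1, 2, 2]
print(adjusted_wallace(wgs, epi))
```

Because the coefficient is directional, `adjusted_wallace(wgs, epi)` and `adjusted_wallace(epi, wgs)` generally differ, which is what allows WGS clusters to be tested against epidemiological clusters and the reverse.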

8
Comparing epidemiology vs. genomics
• Need a model to assess strain-to-strain relationships based on isolate epidemiology so we can directly compare them against the WGS data
[Workflow diagram]
Isolate Selection
Genomics Workflow: Sequencing, Assembly, Annotation, In-Silico Typing
Epidemiology Workflow (Source, Location, Date): Metadata Curation, Quantify Epi-Similarities
Core Analysis: Cluster Analysis & Analysis of concordance

The challenge with epidemiological data
[Figure labels: Source, Spatial, Temporal]
• Surveillance data is inherently less comprehensive than outbreak data
• Metadata is generally qualitative/categorical, not quantitative

[Figure labels: Source, Spatial, Temporal]
• Establish a metric that summarizes the relationships between isolates based on basic epidemiologic metadata
• Clustering of isolates based on epidemiological metadata
Our proposed approach:
A model for quantifying epidemiological similarity between strains based on three primary factors: source, space, time
EpiSym = σ(source) + γ(geospatial) + τ(temporal)
σ = coefficient for Source
γ = coefficient for Geospatial
τ = coefficient for Temporal
Building a model for epidemiological similarity
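A minimal Python sketch of the EpiSym weighted sum follows. It makes assumptions the slide does not state: each component similarity is already scaled to the range 0 to 1, and the three coefficients are non-negative and sum to 1. The function name `epi_sym` and the default weights are illustrative only.

```python
def epi_sym(source_sim, geo_sim, temporal_sim, sigma=0.5, gamma=0.25, tau=0.25):
    """EpiSym = sigma*source + gamma*geospatial + tau*temporal (a sketch).

    The default weights are placeholders, not values from the presentation.
    """
    weights = (sigma, gamma, tau)
    # Assumption: weights are non-negative and sum to 1, so EpiSym stays in [0, 1].
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("coefficients must be non-negative and sum to 1")
    for value in (source_sim, geo_sim, temporal_sim):
        if not 0.0 <= value <= 1.0:
            raise ValueError("component similarities must lie in [0, 1]")
    return sigma * source_sim + gamma * geo_sim + tau * temporal_sim


# Hypothetical component similarities for one pair of isolates.
print(epi_sym(0.658, 1.0, 0.5))
```

Changing the coefficients shifts how strongly source, geography, or time drives the resulting epidemiological clusters.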

11
Spatial = [equation not captured in the extracted text]
Temporal = [equation not captured in the extracted text]
Where
• dist_ab is given by the Haversine formula
• x, y = sampling dates
Quantifying epi-similarities: Spatial and Temporal
• ‘Spatial’ and ‘Temporal’ factors required for the EpiSym coefficient are relatively simple to build into the equation
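The two raw quantities named on this slide, the Haversine distance between sampling locations and the gap between sampling dates, can be computed as in the Python sketch below. How the presenters rescale these quantities into similarity scores is not recoverable from the slide, so the sketch stops at distance in kilometres and difference in days. The function names and the Earth-radius constant are assumptions.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat_a, lon_a, lat_b, lon_b):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    d_phi = phi_b - phi_a
    d_lambda = math.radians(lon_b - lon_a)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def days_apart(x, y):
    """Absolute difference in days between two sampling dates."""
    return abs((x - y).days)


# Hypothetical coordinates and dates for two isolates.
print(haversine_km(0.0, 0.0, 0.0, 90.0))
print(days_apart(date(2016, 3, 9), date(2016, 3, 12)))
```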

12
• Identify all available sources
• Identify core epidemiological attributes
• Assess each source independently and completely for each attribute
• Score the pairwise similarity between any two sources based on their shared epidemiological attributes
Quantifying Source-Source Similarities
Source(i, j) = Σ(i + j) / n
Where
• i, j = two sources being compared
• Σ(i + j) = number of matching attributes
• n = maximum possible score
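The source-source score can be sketched in Python as the sum of per-attribute matches divided by the maximum possible score. The per-attribute scoring rule used here (attribute values of 0, 0.5 or 1, with a match score of one minus their absolute difference) is an assumption made for illustration; the slides show only that half-points can occur. The function name and the toy attribute values are hypothetical and are not the presenters' data.

```python
def source_similarity(attrs_i, attrs_j):
    """Sum of per-attribute matches over the maximum possible score (a sketch).

    attrs_i, attrs_j: dicts mapping the same attribute names to values in [0, 1].
    """
    if attrs_i.keys() != attrs_j.keys():
        raise ValueError("both sources must be scored on the same attributes")
    n = len(attrs_i)  # maximum possible score: one point per attribute
    if n == 0:
        raise ValueError("at least one attribute is required")
    # Assumed rule: full point when values agree, partial credit otherwise.
    matches = sum(1.0 - abs(attrs_i[k] - attrs_j[k]) for k in attrs_i)
    return matches / n


# Hypothetical attribute values, for illustration only.
source_a = {"Animal": 1, "Avian": 0, "Ruminant": 1, "Food Association": 0.5}
source_b = {"Animal": 1, "Avian": 1, "Ruminant": 0, "Food Association": 1}
print(source_similarity(source_a, source_b))
```

Scoring every pair of sources this way yields a symmetric source-similarity matrix that can feed the source term of EpiSym.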
EpiSym

13
An example: ‘faecal cow’ vs. ‘retail chicken’
[Table: sources Faecal_Cow and Retail_Chicken scored against the attributes below; per-attribute scores not captured in the extracted text]
Attributes: Animal, Food Production, Retail, Domestic, Wild, Avian, Ruminant, Porcine, OtherAnimal, Human, Retail_Foods, Food Association, Rural, Urban, Environmental, Water, Soil, Farm?, Food
Similarity = Σ Pairwise Matches / Maximum Possible Score = 12.5 / 19 = 0.658
• Once source similarity is quantified, we can compute overall EpiSym
• We can systematically compute EpiSym across large datasets → epi clusters
• Comparison to genomic clusters using cluster concordance metrics

14
[Figure: heatmap of C. jejuni isolates (isolate labels omitted) with a color key and histogram (Value 0 to 0.8, Count 0 to 1000); source legend: Clinical, Animal, Environmental; clusters labelled A to P]
• Major clusters based on source factor
• Subclusters further refined by spatial and temporal factors
Results: epidemiological clustering of C. jejuni isolates

15
[Figure: heatmap of C. jejuni isolates (isolate labels omitted) with a color key and histogram (Value 0 to 0.8, Count 0 to 1000); source legend: Clinical, Animal, Environmental; clusters labelled A to P, with regions numbered 1 to 5]
• Clusters of secondary heat correspond to isolates with similar geographic and temporal data, but different sources
Results: epidemiological clustering of C. jejuni isolates

16
Calibrating WGS typing for epidemiologic investigations
[Figure: isolates ordered by "Genetic Similarity"; isolate labels omitted]
• We can identify the clusters obtained at varying thresholds and compare them to epidemiological clusters to look for the ‘best fit’
An advantage of WGS is the flexibility in thresholding that is possible
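Cutting the same genomic distances at several thresholds can be sketched in Python as follows. The slides do not state which linkage method was used, so single linkage (two isolates join a cluster when their distance is at or below the threshold) is an assumption, as are the function name and the toy distance matrix.

```python
def single_linkage_clusters(dist, threshold):
    """Cluster labels from a symmetric distance matrix at one threshold (a sketch)."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Hypothetical pairwise distances between four isolates.
toy_dist = [
    [0, 1, 5, 9],
    [1, 0, 5, 9],
    [5, 5, 0, 2],
    [9, 9, 2, 0],
]
partitions = {t: single_linkage_clusters(toy_dist, t) for t in (0, 1, 2, 5)}
for t, labels in partitions.items():
    print(t, labels)
```

Each partition produced this way can then be compared against the epidemiological clusters with a concordance metric to look for the best-fitting threshold.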

17
Calibrating WGS typing for epidemiologic investigations
[Figure: plot of Weighted Global Cluster Cohesion (y-axis, 0.25 to 1.00) against cgMLST Clustering Threshold in % (x-axis, 0 to 100); series: WGEC_ns, WGEC_ws, WGGC_ns, WGGC_ws]
Genomic cluster homogeneity vs. epidemiologic cluster homogeneity
• Calculate the point of highest genomic cohesion while maintaining
  • Multi-isolate clusters
  • High epidemiologic validity

18
Epi vs. Genomic clustering: examining the outliers
• Strains with similar epidemiology aren’t necessarily similar genomically (and vice versa!)
• By overlaying the two methods, we can identify clusters that group together significantly more strongly via genomic or epidemiologic relationships
[Figure panels: “Epi-Clustering” and “Genomic-Clustering”]

19
[Figure: distribution of values from −1.0 to 1.0, with a legend marking the two directions of "stronger similarity via" (legend text truncated in the source), alongside per-sequence-type frequency counts (0 to 30) at left-tail p = 0.05 and right-tail p = 0.05, annotated “Generalist genotype” and “Generalist source”. Sequence types shown: ST-1244, ST-137, ST-19, ST-21, ST-2306, ST-2521, ST-262, ST-267, ST-3391, ST-3530, ST-42, ST-45, ST-459, ST-46, ST-48, ST-50, ST-5164, ST-52, ST-5619, ST-61, ST-679, ST-7694, ST-8, ST-922, ST-929, ST-982]
• ‘Generalist’ genotypes → persist across many combinations of source, temporal and spatial parameters
• ‘Generalist’ reservoirs → support the persistence of a broad range of genotypes
Epi vs. Genomic clustering: examining the outliers

20
Summary
• We have developed a model to help guide our analysis of Campylobacter WGS data for practical public health purposes
• Systematic examination of the relationship between the genomic and epidemiological similarity of sets of isolates → optimization of clustering for epidemiologic relevance
• Calculate the point of highest genomic cohesion while maintaining
  • High epidemiologic cohesion
  • Multi-isolate clusters
• Interactive web application under development (Check it out!)
https://hetmanb.shinyapps.io/EpiQuant/

21
Acknowledgements
People
• Supervisors:
Ed Taboada + Jim Thomas
• Lab:
Steven Mutschall (PHAC)
Peter Kruczkiewicz (PHAC)
Dillon Barker (PHAC/ULeth)
Funding
• University of Lethbridge
• Public Health Agency of Canada A-base
• Gov’t of Canada: Genomics Research and Development Initiative
