IMMEM XI
Navigating Microbial Genomes:
Insights from the Next Generation
9 – 12 March 2016, Estoril, Portugal
Hetman immem xi final March 2016

The EpiQuant framework for assessing epidemiologic and genetic concordance: Towards improved use of genomic data in epidemiological applications.

Published in: Science

  1. IMMEM XI. Navigating Microbial Genomes: Insights from the Next Generation. 9 – 12 March 2016, Estoril, Portugal
  2. Campylobacter is a public health challenge
     • #1 bacterial gastrointestinal disease in Canada and a leading foodborne pathogen worldwide (300-500 million cases)
     • Self-limiting illness, highly under-reported, largely sporadic
     [Figure: bar chart of enteric diseases; bar values with starred (*) values in parentheses]
     • Campylobacteriosis: 29.3 (*447)
     • Salmonellosis: 19.67 (*269)
     • Giardiasis: 11.12 (*24)
     • Shigellosis: 3.08 (*4)
     • Verotoxigenic E. coli (VTEC): 1.94 (*39)
     • Cryptosporidiosis: 1.56 (*7)
     • Listeriosis: 0.36 (*.55)
     • Cyclosporiasis: 0.32 (*7.5)
     ***Post-correction estimate
     Sources: Thomas et al. (2013), doi:10.1089/fpd.2012.1389; FoodNet Canada Short Report 2013
  3. The epidemiology of campylobacteriosis is daunting
     • Widespread in “farm-to-fork” and “source-to-tap”
       → high prevalence in most major livestock species
       → found in many wild animal species, insects, surface waters
     • Difficult to establish sources of exposure and routes of transmission
       → Crisis = Opportunity
       → WGS to the rescue!!!
     Source: Julie Arsenault (PhD Thesis), papyrus.bib.umontreal.ca/jspui/handle/1866/4625
  4. Genomic data…so many options for thresholding
     [Figure: two panels of isolate labels, titled “Genetic Similarity” and “Epidemio”; individual isolate labels (e.g. CI_1875, 06_6211, CE_M_10_2108, CGY_HR_108) omitted]
     • Can rapidly generate different clusters of isolates at an almost unlimited number of thresholds
     • E.g.: Do groups formed by genomic relationships agree with those formed by epidemiologic relationships? Or: what is the optimal threshold for forming clusters that agree with epidemiologic relationships?
     • WGS-based analyses still require knowledge of the epidemiology to guide clustering of genomic data into “epidemiologically relevant clusters”
  5. Clustering thresholds have been with us forever…
     “Those who make many species are the 'splitters' and those who make few are the 'lumpers'…” (Charles Darwin, 1857)
     • Need to calibrate our analyses to ensure our results exploit the high resolution of WGS data while remaining epidemiologically relevant
  6. Building a model for quantifying epidemiological similarity
     “Essentially, all models are wrong, but some are useful.” George E.P. Box (1919-2013)
  7. How to relate epidemiologic and genomic clustering?
     1. Adjusted Wallace Coefficient (AWC), Carriço et al. (Comparing Partitions): the directional likelihood that two isolates clustered together using one method will also be grouped together by the second method
        [Figure: Strain 1 and Strain 2 shown within WGS clusters and within Epi clusters, linked by the AWC]
     2. Intra-cluster cohesion (ICC): a measure of the genomic and/or epidemiologic homogeneity of the isolates within a cluster
        [Figure: examples of a high-ICC cluster and a low-ICC cluster]
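To make the directional nature of the AWC concrete, here is a minimal Python sketch. It assumes the commonly used definition: the Wallace coefficient W(A→B) is the fraction of isolate pairs co-clustered in A that are also co-clustered in B, and the adjustment subtracts the value expected if the two partitions were independent. The function names and the toy labels are illustrative and do not come from the slides.

```python
from collections import Counter
from itertools import combinations


def wallace(a, b):
    """W(A->B): of the pairs co-clustered in A, the fraction also co-clustered in B."""
    together_a = 0
    together_both = 0
    for i, j in combinations(range(len(a)), 2):
        if a[i] == a[j]:
            together_a += 1
            if b[i] == b[j]:
                together_both += 1
    if together_a == 0:
        raise ValueError("partition A has no co-clustered pairs")
    return together_both / together_a


def expected_wallace(b):
    """Chance that a random pair is co-clustered in B (the independence baseline)."""
    n = len(b)
    if n < 2:
        raise ValueError("need at least two isolates")
    return sum(c * (c - 1) for c in Counter(b).values()) / (n * (n - 1))


def adjusted_wallace(a, b):
    """Adjusted Wallace coefficient A->B: 0 means chance level, 1 means perfect."""
    w = wallace(a, b)
    w_i = expected_wallace(b)
    if w_i == 1.0:
        raise ValueError("partition B is a single cluster; adjustment undefined")
    return (w - w_i) / (1.0 - w_i)


# Toy labels for six isolates (hypothetical data)
wgs_clusters = [1, 1, 2, 2, 3, 3]
epi_clusters = [1, 1, 1, 1, 2, 2]
print(adjusted_wallace(wgs_clusters, epi_clusters))  # WGS -> Epi
print(adjusted_wallace(epi_clusters, wgs_clusters))  # Epi -> WGS
```

In this toy case every WGS cluster sits inside one epi cluster, so WGS→Epi is 1.0, while Epi→WGS is only 2/7: knowing the epi cluster says much less about the WGS cluster than the reverse.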
  8. Comparing epidemiology vs. genomics
     • Need a model to assess strain-to-strain relationships based on isolate epidemiology so we can directly compare them against the WGS data
     [Workflow diagram]
     • Isolate Selection
     • Genomics Workflow: Sequencing, Assembly, Annotation, In-Silico Typing
     • Epidemiology Workflow: Metadata Curation (Source, Location, Date), Quantify Epi-Similarities
     • Core Analysis: Cluster Analysis & Analysis of concordance
  9. The challenge with epidemiological data
     Factors: Source, Spatial, Temporal
     • Surveillance data is inherently less comprehensive than outbreak data
     • Metadata is generally qualitative/categorical, not quantitative
  10. Building a model for epidemiological similarity
      Factors: Source, Spatial, Temporal
      • Establish a metric that summarizes the relationships between isolates based on basic epidemiologic metadata
      • Clustering of isolates based on epidemiological metadata
      Our proposed approach: a model for quantifying epidemiological similarity between strains based on three primary factors: source, space, time
      EpiSym = σ(source) + γ(geospatial) + τ(temporal)
      • σ = coefficient for Source
      • γ = coefficient for Geospatial
      • τ = coefficient for Temporal
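The formula EpiSym = σ(source) + γ(geospatial) + τ(temporal) can be sketched as a weighted sum in Python. The sketch rests on two assumptions the slides do not state: each component similarity is already scaled to [0, 1], and the three coefficients sum to 1. The default weights are placeholders, not values from the talk.

```python
def epi_sym(source, geospatial, temporal, sigma=0.5, gamma=0.3, tau=0.2):
    """Weighted sum of three pairwise similarities, each assumed to lie in [0, 1].

    sigma, gamma and tau are the Source, Geospatial and Temporal coefficients.
    The defaults are hypothetical and only serve the example.
    """
    components = (("source", source), ("geospatial", geospatial), ("temporal", temporal))
    for name, value in components:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} similarity must be in [0, 1], got {value}")
    if abs(sigma + gamma + tau - 1.0) > 1e-9:
        raise ValueError("coefficients are assumed to sum to 1")
    return sigma * source + gamma * geospatial + tau * temporal


# Two isolates from similar sources, sampled close in time but far apart in space
print(epi_sym(source=0.8, geospatial=0.1, temporal=0.9))
```

Keeping the coefficients as explicit parameters makes it easy to re-weight the model, for example to emphasise source over geography.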
  11. Quantifying epi-similarities: Spatial and Temporal
      • Spatial = [equation not preserved in the transcript], where dist_ab is given by the Haversine formula
      • Temporal = [equation not preserved in the transcript], where x, y = sampling dates
      • ‘Spatial’ and ‘Temporal’ factors required for the EpiSym coefficient are relatively simple to build into the equation
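The raw quantities behind the Spatial and Temporal factors can be sketched in Python: the Haversine great-circle distance between two sampling locations and the number of days between two sampling dates. The slide equations that turn these into similarities are not preserved, so the `to_similarity` rescaling below is a hypothetical linear choice, and the coordinates, dates and cut-off are made up for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def days_apart(x, y):
    """Absolute number of days between two sampling dates."""
    return abs((x - y).days)


def to_similarity(value, max_value):
    """Hypothetical rescaling: 1 at zero separation, 0 at or beyond max_value."""
    return max(0.0, 1.0 - value / max_value)


# Illustrative inputs only
d_km = haversine_km(49.70, -112.84, 51.05, -114.07)
d_days = days_apart(date(2008, 6, 1), date(2008, 7, 15))
print(d_km, d_days, to_similarity(d_days, 365))
```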
  12. Quantifying Source-Source Similarities
      • Identify all available sources
      • Identify core epidemiological attributes
      • Assess each source independently and completely for each attribute
      • Score the pairwise similarity between any two sources based on their shared epidemiological attributes
      Source similarity(i, j) = (number of matching attributes between i and j) / n
      where
      • i, j = two sources being compared
      • n = maximum possible score
  13. An example: ‘faecal cow’ vs. ‘retail chicken’
      [Attribute table with columns Faecal_Cow and Retail_Chicken; row labels as extracted: Animal Food Production Retail Domestic Wild Avian Ruminant Porcine OtherAnimal Human Retail_Foods Food Association Rural Urban Environmental Water Soil Farm? Food]
      Similarity = Σ Pairwise Matches / Maximum Possible Score = 12.5 / 19 = 0.658
      • Once source similarity is quantified, we can compute overall EpiSym
      • We can systematically compute EpiSym across large datasets → epi clusters → comparison to genomic clusters using cluster concordance metrics
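The ratio “Σ Pairwise Matches / Maximum Possible Score” can be illustrated with a short Python sketch. The scoring rule here is an assumption: each source is scored 0, 0.5 or 1 on every attribute, and an attribute contributes 1 minus the absolute difference of the two scores, which would allow half-point totals such as 12.5. The attribute names and scores below are hypothetical and are not the 19 attributes from the slide.

```python
def source_similarity(attrs_i, attrs_j):
    """Sum of per-attribute match scores divided by the maximum possible score.

    Each argument maps attribute name -> score in {0, 0.5, 1} (assumed scheme).
    The maximum possible score equals the number of attributes.
    """
    if attrs_i.keys() != attrs_j.keys():
        raise ValueError("both sources must be scored on the same attributes")
    if not attrs_i:
        raise ValueError("at least one attribute is required")
    matches = sum(1.0 - abs(attrs_i[k] - attrs_j[k]) for k in attrs_i)
    return matches / len(attrs_i)


# Hypothetical attribute scores for two sources
faecal_cow = {"animal": 1, "food": 0, "ruminant": 1, "avian": 0, "rural": 1}
retail_chicken = {"animal": 1, "food": 1, "ruminant": 0, "avian": 1, "rural": 0.5}
print(source_similarity(faecal_cow, retail_chicken))
```

Scoring every source on the same fixed attribute set is what lets the pairwise score be computed for any two sources without hand-curating each pair.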
  14. Results: epidemiological clustering of C. jejuni isolates
      [Figure: pairwise heatmap of isolates with color key and histogram (Value 0 to 0.8; Count 0 to 1000); source categories Clinical, Animal, Environmental; clusters labeled A to P; individual isolate labels omitted]
      • Major clusters based on source factor → subclusters further refined by spatial and temporal factors
  15. Results: epidemiological clustering of C. jejuni isolates
      [Figure: same heatmap as the previous slide (Value 0 to 0.8; Count 0 to 1000; source categories Clinical, Animal, Environmental; clusters A to P), with regions numbered 1 to 5 highlighted]
      • Clusters of secondary heat correspond to isolates with similar geography and temporal data, but different sources
  16. Calibrating WGS typing for epidemiologic investigations
      An advantage of WGS is the flexibility in thresholding that is possible
      [Figure: isolates ordered by “Genetic Similarity”; individual isolate labels omitted]
      • We can identify the clusters obtained at varying thresholds and compare them to epidemiological clusters to look for ‘best-fit’
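Generating clusters at varying thresholds can be sketched in Python with a small union-find routine. This is an illustration under an assumed linkage rule (single linkage: two isolates join a cluster if their pairwise distance is at or below the threshold); the slides do not say which linkage was used, and the distance matrix is toy data.

```python
def clusters_at_threshold(dist, threshold):
    """Single-linkage cluster labels for a symmetric distance matrix at one cut-off."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Walk up to the root, shortening the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Toy pairwise distances between four isolates (hypothetical)
dist = [
    [0.0, 0.1, 0.6, 0.9],
    [0.1, 0.0, 0.5, 0.9],
    [0.6, 0.5, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
for t in (0.0, 0.25, 0.5):
    print(t, clusters_at_threshold(dist, t))
```

Each threshold yields one partition of the isolates, which can then be compared against the epidemiological clusters with a concordance metric.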
  17. Calibrating WGS typing for epidemiologic investigations
      Genomic cluster homogeneity vs. Epidemiologic cluster homogeneity
      [Figure: Weighted Global Cluster Cohesion (y-axis, 0.25 to 1.00) vs. cgMLST Clustering Threshold in % (x-axis, 0 to 100); series WGEC_ns, WGEC_ws, WGGC_ns, WGGC_ws]
      • Calculate point of highest genomic-cohesion while maintaining
        • Multi-isolate clusters
        • High epidemiologic validity
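A cohesion score of the kind plotted here could be computed along the following lines. The definitions are assumptions made for illustration: cohesion of a cluster is taken as the mean pairwise similarity of its members, and the global score is a cluster-size-weighted average that skips singletons. The slides do not define the metric or the `_ns`/`_ws` series, so none of this should be read as the authors' formula.

```python
from collections import defaultdict
from itertools import combinations


def intra_cluster_cohesion(members, sim):
    """Mean pairwise similarity among cluster members; None for a singleton."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return None
    return sum(sim[i][j] for i, j in pairs) / len(pairs)


def weighted_global_cohesion(labels, sim):
    """Size-weighted mean cohesion over all multi-isolate clusters (assumed form)."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    weighted_sum = 0.0
    total_size = 0
    for members in groups.values():
        cohesion = intra_cluster_cohesion(members, sim)
        if cohesion is None:
            continue  # singletons carry no pairwise information
        weighted_sum += len(members) * cohesion
        total_size += len(members)
    return weighted_sum / total_size if total_size else None


# Toy similarity matrix for four isolates (hypothetical values)
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
]
print(weighted_global_cohesion([0, 0, 1, 1], sim))
```

Passing a genomic similarity matrix or an EpiSym matrix for `sim`, with the same cluster labels, gives two cohesion scores per threshold that can be plotted against each other.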
  18. Epi vs. Genomic clustering: examining the outliers
      • Strains with similar epidemiology aren’t necessarily similar genomically (and vice-versa!)
      • By overlaying the two methods, we can identify clusters that group together significantly more strongly via genomic or epidemiologic relationships
      [Figure: scale running from ← “Epi-Clustering” to “Genomic-Clustering” →]
  19. Epi vs. Genomic clustering: examining the outliers
      [Figure: histogram over the range −1.0 to 1.0 with two legend entries reading “= stronger similarity via”; frequency counts (0 to 30) by sequence type in the left tail (p = 0.05, “Generalist source”) and the right tail (p = 0.05, “Generalist genotype”)]
      Sequence types shown: ST-1244, ST-137, ST-19, ST-21, ST-2306, ST-2521, ST-262, ST-267, ST-3391, ST-3530, ST-42, ST-45, ST-459, ST-46, ST-48, ST-50, ST-5164, ST-52, ST-5619, ST-61, ST-679, ST-7694, ST-8, ST-922, ST-929, ST-982
      • ‘Generalist’ genotypes → persist across many combinations of source, temporal and spatial parameters
      • ‘Generalist’ reservoirs → support the persistence of a broad range of genotypes
  20. Summary
      • We have developed a model to help guide our analysis of Campylobacter WGS data for practical public health purposes
      • Systematic examination of the relationship between the genomic and epidemiological similarity of sets of isolates → optimization of clustering for epidemiologic relevance
      • Calculate point of highest genomic-cohesion while maintaining
        • High epidemiologic cohesion
        • Multi-isolate clusters
      • Interactive web application under development (Check it out!): https://hetmanb.shinyapps.io/EpiQuant/
  21. Acknowledgements
      People
      • Supervisors: Ed Taboada + Jim Thomas
      • Lab: Steven Mutschall (PHAC), Peter Kruczkiewicz (PHAC), Dillon Barker (PHAC/ULeth)
      Funding
      • University of Lethbridge
      • Public Health Agency of Canada A-base
      • Gov’t of Canada: Genomics Research and Development Initiative
