
# Fst, selection index

Genetic Epidemiology 2017 (same as previous version)

Published in: Healthcare

### Fst, selection index

1. FST & Some Selection Index. Genetic Epidemiology 2017. 김진섭 (GSPH, SNU). November 22, 2017.
2. Contents: (1) Fst: Wright's F-statistics, Cockerham's θ-statistics; (2) Selection Index: EHH, iHS, xp-EHH; (3) Practice.
3. Fst: Wright's F-statistics. Three types of heterozygosity [4] (individual, subpopulation, total population), each computed per locus:
    - $H_I = \frac{1}{n}\sum_{i=1}^{n} \hat{H}_i$
    - $H_S = \frac{1}{n}\sum_{i=1}^{n} 2 p_i q_i$
    - $H_T = 2\bar{p}\bar{q}$
    - Here $\hat{H}_i$ is the observed heterozygosity in the $i$th subpopulation, $2 p_i q_i$ is the expected heterozygosity in the $i$th subpopulation, and $2\bar{p}\bar{q}$ is the expected heterozygosity of the total population.
4. Fst: Wright's F-statistics. Wright's F-statistics [4]:
    - $F_{IS} = \frac{H_S - H_I}{H_S}$
    - $F_{ST} = \frac{H_T - H_S}{H_T}$
    - $F_{IT} = \frac{H_T - H_I}{H_T}$
    - Example: $F_{ST} = 0$ means the subpopulations have no effect (no differentiation); $F_{ST} = 1$ means the subpopulations differ strongly.
    - Simple relation: $1 - F_{IT} = (1 - F_{IS})(1 - F_{ST})$
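
As a minimal sketch of the definitions on slides 3 and 4, the following Python computes $H_I$, $H_S$, $H_T$ and the three F-statistics for one biallelic locus. The function name and the two-subpopulation input values are hypothetical, chosen only for illustration, and $\bar{p}$ is taken here as the unweighted mean of the subpopulation allele frequencies.

```python
from typing import Sequence, Tuple


def wright_f_statistics(
    p: Sequence[float], h_obs: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (F_IS, F_ST, F_IT) for one biallelic locus.

    p[i]     : allele frequency in subpopulation i
    h_obs[i] : observed heterozygosity in subpopulation i
    """
    n = len(p)
    h_i = sum(h_obs) / n                          # H_I: mean observed heterozygosity
    h_s = sum(2 * pi * (1 - pi) for pi in p) / n  # H_S: mean expected heterozygosity
    p_bar = sum(p) / n                            # unweighted mean allele frequency
    h_t = 2 * p_bar * (1 - p_bar)                 # H_T: total expected heterozygosity
    f_is = (h_s - h_i) / h_s
    f_st = (h_t - h_s) / h_t
    f_it = (h_t - h_i) / h_t
    return f_is, f_st, f_it


# Hypothetical data: two subpopulations with very different allele frequencies.
f_is, f_st, f_it = wright_f_statistics(p=[0.2, 0.8], h_obs=[0.30, 0.28])
print(f_is, f_st, f_it)
```

With these inputs $H_S = 0.32$ and $H_T = 0.5$, so $F_{ST} = 0.36$, and the relation $1 - F_{IT} = (1 - F_{IS})(1 - F_{ST})$ holds up to floating-point error.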
5. Fst: Wright's F-statistics. Figure from http://academic.reed.edu/biology/professors/srenn/pages/research/2011_students/sean/SM_thesis.html
6. Fst: Wright's F-statistics. Figure from http://www.johnderbyshire.com/Miscellaneous/Other/Fst.jpg
7. Fst: Wright's F-statistics. $F_{ST}$ inference [5]:
    - A convenient measure of genetic differentiation.
    - The most widely used descriptive statistic in population and evolutionary genetics.
    - Can point to natural selection in a particular subpopulation.
8. Fst: Wright's F-statistics. Problem in estimation, $H_T = 2\bar{p}\bar{q}$:
    - What if the sample size differs between subpopulations?
    - Example: SASIA 1000 people, Oceania 100 people.
    - Then $\bar{p}$ is not estimated properly.
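
The estimation problem on slide 8 can be illustrated with a small sketch. The sample sizes are the ones from the slide; the allele frequencies and the function name are hypothetical. It compares the plain mean of subpopulation frequencies with a sample-size-weighted mean, which is one possible alternative and is not claimed to be the estimator used in the lecture.

```python
from typing import Sequence


def mean_allele_frequency(
    p: Sequence[float], n: Sequence[int], weighted: bool
) -> float:
    """Mean allele frequency across subpopulations.

    weighted=False: plain average of subpopulation frequencies.
    weighted=True : average weighted by subpopulation sample size.
    """
    if weighted:
        return sum(pi * ni for pi, ni in zip(p, n)) / sum(n)
    return sum(p) / len(p)


p = [0.10, 0.60]   # hypothetical allele frequencies (SASIA, Oceania)
n = [1000, 100]    # sample sizes from the slide

p_unweighted = mean_allele_frequency(p, n, weighted=False)
p_weighted = mean_allele_frequency(p, n, weighted=True)

# H_T = 2 * p_bar * q_bar changes with the choice of p_bar.
h_t_unweighted = 2 * p_unweighted * (1 - p_unweighted)
h_t_weighted = 2 * p_weighted * (1 - p_weighted)
print(p_unweighted, p_weighted, h_t_unweighted, h_t_weighted)
```

The two versions of $\bar{p}$ differ substantially here (0.35 versus about 0.145), so $H_T$ and therefore $F_{ST}$ depend on how unequal sample sizes are handled.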
9. Fst: Cockerham's θ-statistics. ANOVA approach [1, 5]: $\theta = \sigma_P / \sigma_T$, where $\sigma_P$ is the variance due to population and $\sigma_T$ is the total variance.
10. Fst: Cockerham's θ-statistics. Wright's $F_{ST}$ = Cockerham's $\theta$. In practice, most calculations use $\theta$.
11. Fst: Cockerham's θ-statistics. $\theta$ inference:
    - More than two populations: a high value says that some population differs from the majority, but it does not say which one.
    - Pairwise $F_{ST}$: computed from two populations only; a relative comparison.
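
A hedged sketch of the pairwise idea: the code below computes $F_{ST}$ for every pair of populations at a single locus and reports the most differentiated pair. It uses Wright's $(H_T - H_S)/H_T$ from slide 4 rather than Cockerham's $\theta$ estimator, and the population labels and allele frequencies are hypothetical.

```python
from itertools import combinations


def pairwise_fst(p1: float, p2: float) -> float:
    """Wright's F_ST for two populations at one biallelic locus."""
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population H
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # total H
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical allele frequencies in three populations.
freqs = {"A": 0.50, "B": 0.55, "C": 0.90}

pairs = {
    (a, b): pairwise_fst(freqs[a], freqs[b])
    for a, b in combinations(freqs, 2)
}
most_different = max(pairs, key=pairs.get)

for pair, value in pairs.items():
    print(pair, round(value, 4))
print("most differentiated pair:", most_different)
```

Unlike a single global value, the pairwise values show which populations stand apart: here both pairs involving the hypothetical population C are far larger than the A versus B value.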
12. Fst: Cockerham's θ-statistics.
13. Fst: Cockerham's θ-statistics. Figure: $F_{ST}$ calculated for each SNP between Tibetan and Han populations [6].
14. Fst: Cockerham's θ-statistics. Figure: inter-population pairwise comparisons of $F_{ST}$ statistics. http://academic.reed.edu/biology/professors/srenn/pages/research/2011_students/sean/SM_thesis.html
15. Selection Index. Contents: (1) Fst: Wright's F-statistics, Cockerham's θ-statistics; (2) Selection Index: EHH, iHS, xp-EHH; (3) Practice.
16. Selection Index. Is a particular haplotype common in a particular population? Example: Erik Corona's slides (starting on the next slide).
17. Selection Index. Population Genetics (diagram labels: Lactase + H2O, Glucose). Haplotype frequencies: GATTACAGATTACA 22%, AATTACAGATTAAA 3%, GACTACAGATTACC 19%, GATTACCTATTAAC 24%, AACTACAGATTACC 16%, GATTACAGACTACA 7%, AATTACAGATTACA 9%.
19. Selection Index. Population Genetics (diagram labels: Lactase + H2O, Glucose). Haplotype frequencies: GATTACAGATTACA 22%, AATTACAGATTAAA 3%, GACTACAGATTACC 19%, GATTACCTATTAAC 24%, AACTACAGATTACC 16%, GATTACAGACTACA 7%, AATTACAGATTACA 9%, AATTGCAGATTACA <1%.
21. Selection Index. Population Genetics (diagram labels: Lactase + H2O, Glucose). Haplotype frequencies (change): GATTACAGATTACA 21% (-1%), AATTACAGATTAAA 3%, GACTACAGATTACC 19%, GATTACCTATTAAC 24%, AACTACAGATTACC 16%, GATTACAGACTACA 7%, AATTACAGATTACA 8% (-1%), AATTGCAGATTACA 2% (+2%).
22. Selection Index. Population Genetics (diagram labels: Lactase + H2O, Glucose). Haplotype frequencies (change): GATTACAGATTACA 21% (-1%), AATTACAGATTAAA 3%, GACTACAGATTACC 19%, GATTACCTATTAAC 23% (-1%), AACTACAGATTACC 15% (-1%), GATTACAGACTACA 7%, AATTACAGATTACA 7% (-2%), AATTGCAGATTACA 5% (+5%).
23. Selection Index. Population Genetics (diagram labels: Lactase + H2O, Glucose). Haplotype frequencies (change): GATTACAGATTACA 20% (-2%), AATTACAGATTAAA 3%, GACTACAGATTACC 19%, GATTACCTATTAAC 23% (-1%), AACTACAGATTACC 15% (-1%), GATTACAGACTACA 6% (-1%), AATTACAGATTACA 5% (-4%), AATTGCAGATTACA 9% (+9%).
24. Selection Index: EHH. EHH: Sabeti, Reich et al. (2002) [7]. Extended Haplotype Homozygosity:
    - What is the probability that two randomly drawn haplotypes are identical?
    - 0 → all haplotypes are different; 1 → all haplotypes are the same.
    - The haplotype of interest is called the core.
    - $EHH_t = \dfrac{\sum_{i=1}^{s} \binom{e_{ti}}{2}}{\binom{c_t}{2}}$
    - ($t$: core haplotype, $c$: the number of samples of a particular core haplotype, $e$: the number of samples of a particular extended haplotype, $s$: the number of unique extended haplotypes)
25. Selection Index: EHH. How can we detect positive selection? Two core haplotypes over 50 KB: AATTACAGATTACA (50 people have this) and GATTACAGATTACA (50 people have this).
26. Selection Index: EHH. How can we detect positive selection? 50 KB + 20 KB = 70 KB:
    - Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7)
    - Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26)
27. Selection Index: EHH. Extended Haplotype Homozygosity (EHH):
    - Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7)
    - Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26)
28. Selection Index: EHH. Extended Haplotype Homozygosity (EHH), setting up the calculation for core AATTACAGATTACA:
    - Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7)
    - Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26)
    - $\dfrac{\binom{10}{2}+\binom{8}{2}+\binom{7}{2}+\binom{5}{2}+\binom{3}{2}+\binom{6}{2}+\binom{4}{2}+\binom{7}{2}}{\binom{50}{2}}$
29. Selection Index: EHH. Extended Haplotype Homozygosity (EHH), result for core AATTACAGATTACA:
    - Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7)
    - Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26)
    - $\dfrac{\binom{10}{2}+\binom{8}{2}+\binom{7}{2}+\binom{5}{2}+\binom{3}{2}+\binom{6}{2}+\binom{4}{2}+\binom{7}{2}}{\binom{50}{2}} = 0.121$
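
The EHH calculation above can be checked with a short sketch. It applies the formula from slide 24 to the extended-haplotype counts shown on the slides; only the function and variable names are invented here.

```python
from math import comb
from typing import Sequence


def ehh(extended_counts: Sequence[int]) -> float:
    """EHH of one core haplotype.

    extended_counts[i] is the number of samples carrying the i-th unique
    extended haplotype; their sum is the number of samples with the core.
    """
    c = sum(extended_counts)
    if c < 2:
        raise ValueError("need at least two samples carrying the core haplotype")
    return sum(comb(e, 2) for e in extended_counts) / comb(c, 2)


# Counts from the slides, at 70 KB.
ehh_aatt = ehh([10, 8, 7, 5, 3, 6, 4, 7])  # core AATTACAGATTACA: 149 / 1225
ehh_gatt = ehh([24, 26])                   # core GATTACAGATTACA: 601 / 1225
print(ehh_aatt, ehh_gatt)
```

The exact values are 149/1225 ≈ 0.1216 and 601/1225 ≈ 0.4906; the slides show them as 0.121 and 0.490.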
30. Selection Index: EHH. EHH drops over genetic distance:
    - EHH drops off quickly over genetic distance.
    - It starts at 1 and ends at 0.
    - Every haplotype block will eventually be unique.
31. Selection Index: EHH. EHH: what it is and what it isn't.
    - Detects over-representation of a haplotype; this will raise p(two haplotypes are homozygous).
    - Does NOT detect whether a haplotype spread quickly: low recombination != spread quickly.
    - First example. Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7). Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26).
    - Second example. Core AATTACAGATTACA, extensions (count): AACACGC (22), ATGATAG (28). Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26).
32. Selection Index: EHH. Compare EHH scores (slide labels: Low Recombination, Over Represented):
    - Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7). EHH = 0.121.
    - Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26). EHH = $\left[\binom{24}{2}+\binom{26}{2}\right] / \binom{50}{2} = 0.490$.
33. Selection Index: EHH. Can EHH detect positive selection?
34. Selection Index: EHH. Relative EHH:
    - Detects over-representation of a haplotype.
    - Low recombination: this will raise p(two haplotypes are homozygous).
    - Does detect whether a haplotype spread quickly.
    - Other haplotype blocks are controls!
    - Recombination cold-spot / hot-spot agnostic.
    - Low score if both alleles are associated with high or low recombination.
    - Example. Core AATTACAGATTACA, extensions (count): AACACGC (22), ATGATAG (28). Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26).
35. Selection Index: EHH. Extended Haplotype Homozygosity (EHH) and relative EHH:
    - Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7). EHH = 0.121.
    - Core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26). EHH = 0.490.
    - $REHH = \dfrac{0.490}{0.121} = 4.05$
36. Selection Index: EHH. REHH: Problem #1. We get a different REHH value at different genetic distance cutoffs.
    - At 50 KB: AATTACAGATTACA (50), GATTACAGATTACA (50). REHH = 1.0.
    - At 70 KB: core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7); core GATTACAGATTACA, extensions (count): CACATAG (24), CACACAG (26). REHH = 4.05.
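
The dependence of REHH on the distance cutoff can be reproduced with the counts from the slides. This is a minimal sketch: REHH is taken as the simple ratio of the two cores' EHH values, as on slide 35, and the function names are hypothetical.

```python
from math import comb
from typing import Sequence


def ehh(extended_counts: Sequence[int]) -> float:
    """EHH of one core haplotype from its extended-haplotype counts."""
    c = sum(extended_counts)
    return sum(comb(e, 2) for e in extended_counts) / comb(c, 2)


def rehh(core_counts: Sequence[int], other_counts: Sequence[int]) -> float:
    """Relative EHH: EHH of one core divided by EHH of the other core."""
    return ehh(core_counts) / ehh(other_counts)


# 50 KB: neither core has split into distinct extended haplotypes yet.
rehh_50kb = rehh([50], [50])

# 70 KB: core GATTACAGATTACA relative to core AATTACAGATTACA.
rehh_70kb = rehh([24, 26], [10, 8, 7, 5, 3, 6, 4, 7])

print(rehh_50kb, rehh_70kb)
```

At 50 KB the ratio is exactly 1.0. At 70 KB the exact ratio is 601/149 ≈ 4.03; the slide's 4.05 comes from dividing the already shortened values 0.490 and 0.121.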
37. Selection Index: EHH. Which REHH value to use? Extend to the right. At 70 KB, REHH = 4.05. Haplotypes shown:
    - AGTTACAGATTACAAACACGC
    - AAATACAGATTACAATGATAG
    - AATTACAGATTACAAACCCAG
    - AATTTCAGATTACACTGACAG
    - AATTAAAGATTACACAGACAG
    - AATTACCGATTACAAACACAG
    - AATTACAAATTACACACACAG
    - AATTACAGGTTACACACCCAG
    - GATTACAGATTACACACATAG
    - GATTACAGATTACACACACAG
38. Selection Index: EHH. Which REHH value to use? Extend to the right. At 70 KB, REHH = 4.05. Haplotypes shown:
    - …ACAGATTACAGTTACAGATTACAAACACGC…
    - …ACAGATTACAAATACAGATTACAATGATAG…
    - …ACAGATTACAATTACAGATTACAAACCCAG…
    - …ACAGATTACAATTTCAGATTACACTGACAG…
    - …ACAGATTACAATTAAAGATTACACAGACAG…
    - …ACAGATTACAATTACCGATTACAAACACAG…
    - …ACAGATTACAATTACAAATTACACACACAG…
    - …ACAGATTACAATTACAGTTACACACCCAG…
    - …TACAGATTAGATTACAGATTACACACATAG
    - …TACAGATTAGATTACAGATTACACACACAG
43. Selection Index: EHH. Which REHH value to use? Extend to the left. At 70 KB, REHH = 4.05. Haplotypes shown:
    - …ACAGATTACAGTTACAGATTACAAACACGC…
    - …ACAGATTACAAATACAGATTACAATGATAG…
    - …ACAGATTACAATTACAGATTACAAACCCAG…
    - …ACAGATTACAATTTCAGATTACACTGACAG…
    - …ACAGATTACAATTAAAGATTACACAGACAG…
    - …ACAGATTACAATTACCGATTACAAACACAG…
    - …ACAGATTACAATTACAAATTACACACACAG…
    - …ACAGATTACAATTACAGTTACACACCCAG…
    - …TACAGATTAGATTACAGATTACACACATAG
    - …TACAGATTAGATTACAGATTACACACACAG
46. Selection Index: EHH. REHH: Problem #2. The REHH score is heavily biased by allele frequencies, so it must be normalized: P(REHH | allele frequency).
47. Selection Index: EHH. REHH: Problem #3. It is not possible to detect selection in high-frequency alleles. The solution requires a cross-population approach (discussed later).
48. Selection Index: EHH. REHH overview:
    - Leaves a lot to be desired.
    - Picking the maximum is arbitrary. Why not the mean REHH score?
    - Biased by allele frequency: ln(REHH | allele frequency) ~ normal distribution.
    - Still widely used and published with.
49. Selection Index: EHH. Site-specific EHH [9]: roughly the average of the two alleles' EHH values (weights: squared allele frequencies). It gives the approximate EHH level at the focal SNP.
50. Selection Index: iHS. iHS: Sabeti (2007) [8]. Integrate EHH over all positions, then compare.
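51. Selection Index: iHS. Integrated Haplotype Score (iHS). The unstandardized iHS is the log ratio of the two alleles' EHH integrals, $\ln\left(\int_{y}^{x} EHH_A \,\big/ \int_{y}^{x} EHH_D\right)$ in the usual ancestral-over-derived convention, where $y$ is the backward distance, $x$ is the forward distance, $EHH_D$ is the EHH of the derived allele, and $EHH_A$ is the EHH of the ancestral allele.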
52. Selection Index: iHS. Integrated Haplotype Score (iHS), worked example.
    - Haplotypes shown: …ACAGATTACAGTTACAGATTACAAACACGC…, …ACAGATTACAAATACAGATTACAATGATAG…, …ACAGATTACAATTACAGATTACAAACCCAG…, …ACAGATTACAATTTCAGATTACACTGACAG…, …ACAGATTACAATTAAAGATTACACAGACAG…, …ACAGATTACAATTACCGATTACAAACACAG…, …ACAGATTACAATTACAAATTACACACACAG…, …ACAGATTACAATTACAGTTACAACACCCAG…, …TACAGATTAGATTACAGATTACACACATAG, …TACAGATTAGATTACAGATTACACACACAG
    - Areas under the EHH curves on the two sides of the core: 0.7 + 0.5 = 1.2 and 4.0 + 4.4 = 8.4.
    - Unstandardized iHS as printed on the slide: ln(8.4/3.2) = 0.419. The printed numbers do not agree with each other: the first sum is 1.2, not 3.2, and ln(8.4/3.2) ≈ 0.965, whereas log10(8.4/3.2) ≈ 0.419.
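
The integration step behind the unstandardized iHS can be sketched as follows. The EHH curves and positions are hypothetical, the area is approximated with the trapezoidal rule, and the ancestral-over-derived order of the ratio is an assumption taken from the common convention, not something the slides spell out.

```python
import math
from typing import Sequence


def integrate_ehh(positions: Sequence[float], ehh: Sequence[float]) -> float:
    """Trapezoidal area under an EHH curve sampled at the given positions."""
    area = 0.0
    for k in range(1, len(positions)):
        width = positions[k] - positions[k - 1]
        area += 0.5 * (ehh[k] + ehh[k - 1]) * width
    return area


def unstandardized_ihs(
    positions: Sequence[float],
    ehh_ancestral: Sequence[float],
    ehh_derived: Sequence[float],
) -> float:
    """ln(area under ancestral EHH / area under derived EHH)."""
    return math.log(
        integrate_ehh(positions, ehh_ancestral)
        / integrate_ehh(positions, ehh_derived)
    )


# Hypothetical curves; the core SNP sits at position 0, where EHH = 1.
positions = [-2.0, -1.0, 0.0, 1.0, 2.0]
ehh_anc = [0.0, 0.2, 1.0, 0.2, 0.0]   # decays quickly
ehh_der = [0.2, 0.6, 1.0, 0.6, 0.2]   # decays slowly (long shared haplotype)

ihs = unstandardized_ihs(positions, ehh_anc, ehh_der)
ihs_equal = unstandardized_ihs(positions, ehh_der, ehh_der)
print(ihs, ihs_equal)
```

With these curves the areas are 1.4 and 2.4, so the score is negative, pointing at the allele in the denominator; identical curves give exactly zero.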
53. Selection Index: iHS. iHS characteristics:
    - When both alleles have the same AUC, iHS is zero.
    - Large negative values indicate selection of the allele in the denominator.
    - Large positive values indicate selection of the allele in the numerator.
    - Still heavily biased by allele frequency, hence Z-score normalization.
54. Selection Index: iHS. Integrated Haplotype Score (iHS), standardized:
    - $iHS = \dfrac{\text{unstandardized iHS} - E(\text{iHS} \mid \text{allele frequency})}{SD(\text{iHS} \mid \text{allele frequency})}$
    - $E(\text{iHS} \mid \text{allele frequency})$: estimated from the empirical distribution.
    - $SD(\text{iHS} \mid \text{allele frequency})$: estimated from the empirical distribution.
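
One way to read the standardization above is as a Z-score computed within allele-frequency bins. The sketch below assumes that reading; the bin width, the use of the population standard deviation, the function name, and the input values are all hypothetical choices.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def standardize_by_frequency(
    scores: Sequence[float], freqs: Sequence[float], bin_width: float = 0.05
) -> List[float]:
    """Z-score each unstandardized iHS within its allele-frequency bin."""
    bins: Dict[int, List[float]] = defaultdict(list)
    for s, f in zip(scores, freqs):
        bins[int(f / bin_width)].append(s)

    # Empirical mean and SD of the scores in each frequency bin.
    stats: Dict[int, Tuple[float, float]] = {
        b: (mean(v), pstdev(v)) for b, v in bins.items()
    }

    standardized = []
    for s, f in zip(scores, freqs):
        m, sd = stats[int(f / bin_width)]
        standardized.append((s - m) / sd if sd > 0 else float("nan"))
    return standardized


# Hypothetical unstandardized scores for six SNPs in two frequency bins.
scores = [1.0, 2.0, 3.0, -1.0, 0.0, 1.0]
freqs = [0.11, 0.12, 0.13, 0.52, 0.53, 0.54]

z = standardize_by_frequency(scores, freqs)
print(z)
```

After standardization the two bins are on the same scale even though their raw means differ (2.0 versus 0.0), which is the point of conditioning on allele frequency.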
55. Selection Index: iHS. iHS overview:
    - iHS and REHH are EHH-based methods to detect positive selection.
    - iHS outperforms REHH at specific allele frequencies.
    - Neither completely outperforms the other.
56. Selection Index: iHS. iHS: Problem #1.
    - Still can't detect selection in high-frequency (old) alleles.
    - Relatively high EHH values are not present in high-frequency (old) alleles!
    - Use a reference population: if positive selection didn't take place in the reference population, EHH is high.
57. Selection Index: xp-EHH. xp-EHH: Sabeti (2007) [8]. Compares the integrated EHH of the same allele between populations.
58. Selection Index: xp-EHH. Cross Population EHH (XP-EHH):
    - Population 1. Core AATTACAGATTACA, extensions (count): AACACGC (10), ATGATAG (8), AACCCAG (7), CTGACAG (5), CAGACAG (3), AACACAG (6), CACACAG (4), CACCCAG (7). Integrated EHH = 0.5.
    - Same allele but a different population. Core AATTACAGATTACA, extensions (count): CACATAG (20), CACACAG (30). Integrated EHH = 3.3.
    - XP-EHH = ln(3.3/0.5) = 1.89, followed by Z-score normalization.
    - Integrate EHH over distance from the allele.
    - Calculated for the forward and reverse sides independently.
    - Integrate until EHH = 0.04 in each population.
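
The XP-EHH arithmetic above can be sketched in a few lines. The integrated values 3.3 and 0.5 and the 0.04 cutoff come from the slide; the function names, the example EHH curve, and the simplification of stopping at the last grid point at or above the cutoff are assumptions made for illustration.

```python
import math
from typing import Sequence


def integrate_until(
    distances: Sequence[float], ehh: Sequence[float], cutoff: float = 0.04
) -> float:
    """Trapezoidal area under EHH, moving outward from the core.

    Integration stops at the last grid point where EHH is still at or
    above the cutoff (a simplification; no interpolation to the cutoff).
    """
    area = 0.0
    for k in range(1, len(distances)):
        if ehh[k] < cutoff:
            break
        area += 0.5 * (ehh[k] + ehh[k - 1]) * (distances[k] - distances[k - 1])
    return area


def xp_ehh_unstandardized(area_pop_a: float, area_pop_b: float) -> float:
    """Log ratio of the integrated EHH of the same allele in two populations."""
    return math.log(area_pop_a / area_pop_b)


# Values from the slide.
slide_value = xp_ehh_unstandardized(3.3, 0.5)

# Hypothetical one-sided EHH curve to show the cutoff in action.
one_side = integrate_until([0.0, 1.0, 2.0, 3.0], [1.0, 0.5, 0.03, 0.0])
print(round(slide_value, 2), one_side)
```

As on the slide, the forward and reverse sides would each be integrated this way and added before taking the log ratio; the Z-score normalization step is not shown.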
59. Selection Index: xp-EHH. Overview:
    - REHH and iHS are more or less complementary: each is better at detecting positive selection at different frequencies.
    - XP-EHH can detect positive selection in high-frequency alleles.
    - XP-EHH is susceptible to population variation in recombination rate.
60. Selection Index: xp-EHH. Final verdict: REHH vs iHS vs XP-EHH (figure with panels for REHH, the iHS test, and XP-EHH).
61. Selection Index: xp-EHH. Rsb [9]:
    - Another index for comparing populations; the comparison is by population only.
    - iES: at each locus, the average of the two alleles' integrated EHH.
    - Compares the approximate degree of selection at a locus between populations.
62. Practice. Contents: (1) Fst: Wright's F-statistics, Cockerham's θ-statistics; (2) Selection Index: EHH, iHS, xp-EHH; (3) Practice.
63. Practice.
    - $F_{ST}$: hierfstat [3]. PER3 gene in HGDP (Human Genome Diversity Panel): 289 SNPs and 7 populations.
    - EHH, iHS: rehh [2]. Uses the example data provided with the package.
64. Practice. References:
    - [1] Cockerham, C. C. (1969). Variance of gene frequencies. Evolution, pages 72–84.
    - [2] Gautier, M. and Vitalis, R. (2012). rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics, 28(8):1176–1177.
    - [3] Goudet, J. (2005). Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5(1):184–186.
    - [4] Hamilton, M. (2011). Population genetics. John Wiley & Sons.
    - [5] Holsinger, K. E. and Weir, B. S. (2009). Genetics in geographically structured populations: defining, estimating and interpreting FST. Nature Reviews Genetics, 10(9):639–650.
    - [6] Huerta-Sánchez, E., Jin, X., Bianba, Z., Peter, B. M., Vinckenbosch, N., Liang, Y., Yi, X., He, M., Somel, M., Ni, P., et al. (2014). Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature, 512(7513):194–197.
    - [7] Sabeti, P. C., Reich, D. E., Higgins, J. M., Levine, H. Z., Richter, D. J., Schaffner, S. F., Gabriel, S. B., Platko, J. V., Patterson, N. J., McDonald, G. J., et al. (2002). Detecting recent positive selection in the human genome from haplotype structure. Nature, 419(6909):832–837.
    - [8] Sabeti, P. C., Varilly, P., Fry, B., Lohmueller, J., Hostetter, E., Cotsapas, C., Xie, X., Byrne, E. H., McCarroll, S. A., Gaudet, R., et al. (2007). Genome-wide detection and characterization of positive selection in human populations. Nature, 449(7164):913–918.
    - [9] Tang, K., Thornton, K. R., and Stoneking, M. (2007). A new approach for using genome scans to detect recent positive selection in the human genome. PLoS Biology, 5(7):e171.
65. Practice. END. Email: secondmath85@gmail.com. Office: (02)880-2743. H.P: 010-9192-5385.