FST & Some Selection Index
Evolution, Population Genetics and Health 2014
김진섭
GSPH, SNU
October 29, 2014
김진섭 (GSPH, SNU) FST & Some Selection Index October 29, 2014 1 / 65
Fst
Contents
1 Fst
Wright’s F-statistics
Cockerham’s θ-statistics
2 Selection Index
EHH
iHS
xp-EHH
3 Practice
Fst Wright’s F-statistics
3 types of Heterozygosity[4]
Individual, Subpopulation, Total Population
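1 $H_I = \frac{1}{n}\sum_{i=1}^{n} \hat{H}_i$
2 $H_S = \frac{1}{n}\sum_{i=1}^{n} 2 p_i q_i$
3 $H_T = 2\bar{p}\bar{q}$
($\hat{H}_i$: observed heterozygosity in the $i$th subpopulation; $2 p_i q_i$: expected (Hardy-Weinberg) heterozygosity in the $i$th subpopulation; $2\bar{p}\bar{q}$: expected heterozygosity of the total population; $n$: number of subpopulations)
These values are computed separately for each locus.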
Wright’s F-statistics[4]
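1 $F_{IS} = \frac{H_S - H_I}{H_S}$
2 $F_{ST} = \frac{H_T - H_S}{H_T}$
3 $F_{IT} = \frac{H_T - H_I}{H_T}$
Example
$F_{ST} = 0$ → no subpopulation effect: the subpopulations do not differ.
$F_{ST} = 1$ → the subpopulations differ greatly.
Simple relation
$1 - F_{IT} = (1 - F_{IS})(1 - F_{ST})$, since $1 - F_{IS} = H_I/H_S$, $1 - F_{ST} = H_S/H_T$ and $1 - F_{IT} = H_I/H_T$.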
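The three heterozygosities and the F-statistics built from them can be sketched for a single biallelic locus as follows. This is a minimal illustration, not the method of any particular package: the function name `f_statistics` and the input numbers are hypothetical, and $\bar{p}$ is taken as the unweighted mean of the subpopulation frequencies (that is, equal sample sizes are assumed).

```python
def f_statistics(obs_het, allele_freqs):
    """Wright's F-statistics for one biallelic locus.

    obs_het:      observed heterozygosity in each subpopulation
    allele_freqs: frequency p_i of one allele in each subpopulation
    Assumption: every subpopulation gets equal weight.
    """
    n = len(allele_freqs)
    h_i = sum(obs_het) / n                                # H_I
    h_s = sum(2 * p * (1 - p) for p in allele_freqs) / n  # H_S
    p_bar = sum(allele_freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)                         # H_T
    return {
        "F_IS": (h_s - h_i) / h_s,
        "F_ST": (h_t - h_s) / h_t,
        "F_IT": (h_t - h_i) / h_t,
    }


# Hypothetical data: two subpopulations with very different frequencies.
stats = f_statistics(obs_het=[0.30, 0.40], allele_freqs=[0.2, 0.8])
print(stats)

# The simple relation 1 - F_IT = (1 - F_IS)(1 - F_ST) holds by construction.
lhs = 1 - stats["F_IT"]
rhs = (1 - stats["F_IS"]) * (1 - stats["F_ST"])
print(abs(lhs - rhs) < 1e-12)
```

With these made-up inputs $H_I = 0.35$, $H_S = 0.32$ and $H_T = 0.5$, so the negative $F_{IS}$ reflects a slight excess of heterozygotes within subpopulations.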
http://academic.reed.edu/biology/professors/srenn/pages/
research/2011_students/sean/SM_thesis.html
http://www.johnderbyshire.com/Miscellaneous/Other/Fst.jpg
FST inference[5]
A convenient measure of genetic differentiation.
One of the most widely used descriptive statistics in population and evolutionary genetics.
Can indicate natural selection acting in a particular subpopulation.
Problem in estimation
$H_T = 2\bar{p}\bar{q}$
1 What if the sample size differs between subpopulations?
2 Example: 1000 samples from SASIA, 100 from Oceania.
3 Then $\bar{p}$ is not estimated properly.
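How strongly unequal sample sizes can move $\bar{p}$, and with it $H_T$, is easy to see numerically. The sketch below uses the sample sizes from the example above; the allele frequencies and the function name `mean_allele_frequency` are hypothetical and only serve the illustration.

```python
def mean_allele_frequency(freqs, sample_sizes=None):
    """Mean allele frequency across subpopulations.

    With sample_sizes=None every subpopulation counts equally;
    otherwise each frequency is weighted by its sample size (pooling).
    """
    if sample_sizes is None:
        return sum(freqs) / len(freqs)
    total = sum(sample_sizes)
    return sum(p * n for p, n in zip(freqs, sample_sizes)) / total


freqs = [0.10, 0.60]   # hypothetical allele frequencies
sizes = [1000, 100]    # sample sizes from the example above

p_equal = mean_allele_frequency(freqs)          # each subpopulation counts once
p_pooled = mean_allele_frequency(freqs, sizes)  # dominated by the larger sample

h_t_equal = 2 * p_equal * (1 - p_equal)
h_t_pooled = 2 * p_pooled * (1 - p_pooled)
print(p_equal, p_pooled)
print(h_t_equal, h_t_pooled)
```

The pooled estimate is pulled toward the larger sample, so $H_T$, and therefore $F_{ST}$, depends on how many individuals happened to be sampled from each subpopulation.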
Fst Cockerham’s θ-statistics
ANOVA approach[1, 5]
$\theta = \frac{\sigma_P}{\sigma_T}$
($\sigma_P$: variance due to population, $\sigma_T$: total variance)
Wright’s FST = Cockerham’s θ
In practice, most calculations use θ.
θ inference
Population > 2
At least one population differs from the rest!
It does not say which population.
Pairwise FST
Computed from two populations only.
A relative comparison.
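The difference between a global value and pairwise values can be sketched as follows. For simplicity the example uses Wright's heterozygosity-based $F_{ST}$ with equal weights rather than Cockerham's θ; the population labels, allele frequencies and the function name `fst` are hypothetical.

```python
from itertools import combinations


def fst(freqs):
    """Heterozygosity-based F_ST for one biallelic locus (equal weights)."""
    n = len(freqs)
    h_s = sum(2 * p * (1 - p) for p in freqs) / n
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    # A locus fixed for the same allele everywhere carries no information.
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical allele frequencies: population C differs from A and B.
pops = {"A": 0.5, "B": 0.5, "C": 0.9}

global_fst = fst(list(pops.values()))
pairwise = {
    (a, b): fst([pops[a], pops[b]])
    for a, b in combinations(pops, 2)
}

print(global_fst)
for pair, value in pairwise.items():
    print(pair, value)
```

The global value only signals that some differentiation exists; the pairwise values show that every pair involving C is differentiated while A and B are not.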
Figure: FST calculated for each SNP between Tibetan and Han populations[6]
Figure: Inter-population pairwise comparisons of FST statistics
http://academic.reed.edu/biology/professors/srenn/pages/
research/2011_students/sean/SM_thesis.html
Selection Index
Is a particular haplotype over-represented in a particular population?
Example: Erik Corona's slides (the following slides)
Population Genetics
Glucose
HAPLOTYPES
GATTACAGATTACA 22%
AATTACAGATTAAA 3%
GACTACAGATTACC 19%
GATTACCTATTAAC 24%
AACTACAGATTACC 16%
GATTACAGACTACA 7%
AATTACAGATTACA 9%
Lactase + H2O
Population Genetics
Lactase + H2O
Glucose
HAPLOTYPES
GATTACAGATTACA 22%
AATTACAGATTAAA 3%
GACTACAGATTACC 19%
GATTACCTATTAAC 24%
AACTACAGATTACC 16%
GATTACAGACTACA 7%
AATTACAGATTACA 9%
AATTGCAGATTACA <1%
Population Genetics
Lactase + H2O
Glucose
HAPLOTYPES
GATTACAGATTACA 21% -1%
AATTACAGATTAAA 3%
GACTACAGATTACC 19%
GATTACCTATTAAC 24%
AACTACAGATTACC 16%
GATTACAGACTACA 7%
AATTACAGATTACA 8% -1%
AATTGCAGATTACA 2% +2%
Population Genetics
Lactase + H2O
Glucose
HAPLOTYPES
GATTACAGATTACA 21% -1%
AATTACAGATTAAA 3%
GACTACAGATTACC 19%
GATTACCTATTAAC 23% -1%
AACTACAGATTACC 15% -1%
GATTACAGACTACA 7%
AATTACAGATTACA 7% -2%
AATTGCAGATTACA 5% +5%
Population Genetics
Lactase + H2O
Glucose
HAPLOTYPES
GATTACAGATTACA 20% -2%
AATTACAGATTAAA 3%
GACTACAGATTACC 19%
GATTACCTATTAAC 23% -1%
AACTACAGATTACC 15% -1%
GATTACAGACTACA 6% -1%
AATTACAGATTACA 5% -4%
AATTGCAGATTACA 9% +9%
Selection Index EHH
EHH: Sabeti, Reich et al. (2002)[7]
Extended Haplotype Homozygosity
What is the probability that two randomly drawn haplotypes are identical?
0 → all haplotypes are different.
1 → all haplotypes are identical.
The haplotype of interest is called the core.
$EHH_t = \frac{\sum_{i=1}^{s} \binom{e_{ti}}{2}}{\binom{c_t}{2}}$
($t$: core haplotype, $c_t$: the number of samples carrying core haplotype $t$, $e_{ti}$: the number of samples carrying the $i$th extended haplotype of core $t$, $s$: the number of unique extended haplotypes)
How can we detect Pos. Sel.?
AATTACAGATTACA 50 people have this
GATTACAGATTACA 50 people have this
---- 50 KB ----
50 KB + 20 KB = 70 KB
AATTACAGATTACA AACACGC 10
AATTACAGATTACA ATGATAG 8
AATTACAGATTACA AACCCAG 7
AATTACAGATTACA CTGACAG 5
AATTACAGATTACA CAGACAG 3
AATTACAGATTACA AACACAG 6
AATTACAGATTACA CACACAG 4
AATTACAGATTACA CACCCAG 7
GATTACAGATTACA CACATAG 24
GATTACAGATTACA CACACAG 26
How can we detect Pos. Sel.?
Extended Haplotype Homozygosity (EHH)
Core AATTACAGATTACA: 50 samples, extended haplotype counts 10, 8, 7, 5, 3, 6, 4, 7 (as listed above)
$EHH = \frac{\binom{10}{2} + \binom{8}{2} + \binom{7}{2} + \binom{5}{2} + \binom{3}{2} + \binom{6}{2} + \binom{4}{2} + \binom{7}{2}}{\binom{50}{2}} = \frac{149}{1225} \approx 0.121$ (0.1216 truncated to three decimals)
EHH Drops Over Genetic Distance
EHH drops off quickly over genetic distance
Starts at 1
Ends at 0
Every haplotype block will eventually be unique
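The decay can be reproduced with a few lines of code. The sketch below is a toy illustration, assuming phased haplotypes given as equal-length strings that all carry the same core at their left end; the function name `ehh_decay` and the four haplotypes are hypothetical.

```python
from collections import Counter
from math import comb


def ehh_decay(haplotypes, core_end):
    """EHH as the haplotype is extended to the right of the core.

    haplotypes: equal-length strings that all carry the same core,
                which occupies positions [0, core_end)
    Returns a list whose k-th entry is EHH after extending k markers
    past the core (k = 0 is the core itself).
    """
    total_pairs = comb(len(haplotypes), 2)
    decay = []
    for end in range(core_end, len(haplotypes[0]) + 1):
        # Group samples by their haplotype up to the current marker.
        counts = Counter(h[:end] for h in haplotypes)
        same_pairs = sum(comb(c, 2) for c in counts.values())
        decay.append(same_pairs / total_pairs)
    return decay


# Hypothetical haplotypes; the core is the single allele "A" at position 0.
haps = ["AACA", "AACT", "AAGA", "ATGA"]
decay = ehh_decay(haps, core_end=1)
print(decay)
```

At the core every pair is identical (EHH = 1); each additional marker can only split groups, so EHH never increases, and it reaches 0 once all extended haplotypes are unique.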
EHH: What It Is & What It Isn't
Detects over-representation of a haplotype
This raises P(two randomly drawn haplotypes are identical)
Does NOT detect whether a haplotype spread quickly
Low recombination != spread quickly
AATTACAGATTACA AACACGC 22
AATTACAGATTACA ATGATAG 28
GATTACAGATTACA CACATAG 24
GATTACAGATTACA CACACAG 26
Compare EHH Scores
EHH of core AATTACAGATTACA (extended haplotype counts 10, 8, 7, 5, 3, 6, 4, 7): 0.121
EHH of core GATTACAGATTACA (extended haplotype counts 24, 26): $\frac{\binom{24}{2} + \binom{26}{2}}{\binom{50}{2}} = \frac{601}{1225} \approx 0.490$
Low Recombination
Over Represented
Can EHH Detect Pos. Sel.?
Relative EHH
Detects over‐representation of a haplotype
Low recombination
This will raise the p(two haps are homozygous)
Does detect if a haplotype spread quickly
Other haplotype blocks are controls!
Recombination cold‐spot / hot‐spot agnostic
Low score if both alleles are associated with high or low recombination
AATTACAGATTACA AACACGC 22
AATTACAGATTACA ATGATAG 28
GATTACAGATTACA CACATAG 24
GATTACAGATTACA CACACAG 26
Extended Haplotype Homozygosity (EHH)
EHH of core AATTACAGATTACA: 0.121
EHH of core GATTACAGATTACA: 0.490
$REHH = \frac{0.490}{0.121} = 4.05$
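The two EHH values and their ratio can be checked directly from the extended-haplotype counts shown on the slides. This is a minimal sketch of the EHH formula; the function name `ehh` is hypothetical.

```python
from math import comb


def ehh(extended_counts):
    """EHH of one core haplotype from its extended-haplotype counts.

    extended_counts: number of samples carrying each distinct extended
                     haplotype; their sum is the core haplotype count.
    """
    core_count = sum(extended_counts)
    same_pairs = sum(comb(e, 2) for e in extended_counts)
    return same_pairs / comb(core_count, 2)


# Counts taken from the slides' example (50 samples per core).
ehh_a = ehh([10, 8, 7, 5, 3, 6, 4, 7])  # core AATTACAGATTACA, 149 / 1225
ehh_g = ehh([24, 26])                   # core GATTACAGATTACA, 601 / 1225
rehh = ehh_g / ehh_a                    # 601 / 149

print(ehh_a, ehh_g, rehh)
```

Computed without intermediate rounding the ratio is 601/149, about 4.03; the value 4.05 above comes from dividing the three-decimal values 0.490 and 0.121.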
REHH: Problem #1
We get a different REHH value at different genetic distance cutoffs
AATTACAGATTACA 50
GATTACAGATTACA 50
---- 50 KB ----
REHH = 1.0
AATTACAGATTACA AACACGC 10
AATTACAGATTACA ATGATAG 8
AATTACAGATTACA AACCCAG 7
AATTACAGATTACA CTGACAG 5
AATTACAGATTACA CAGACAG 3
AATTACAGATTACA AACACAG 6
AATTACAGATTACA CACACAG 4
AATTACAGATTACA CACCCAG 7
GATTACAGATTACA CACATAG 24
GATTACAGATTACA CACACAG 26
---------- 70 KB ---------
REHH = 4.05
Which REHH value to use?
Extend to the right
AGTTACAGATTACAAACACGC
AAATACAGATTACAATGATAG
AATTACAGATTACAAACCCAG
AATTTCAGATTACACTGACAG
AATTAAAGATTACACAGACAG
AATTACCGATTACAAACACAG
AATTACAAATTACACACACAG
AATTACAGGTTACACACCCAG
GATTACAGATTACACACATAG
GATTACAGATTACACACACAG
---------- 70 KB ---------
REHH = 4.05
…ACAGATTACAGTTACAGATTACAAACACGC…
…ACAGATTACAAATACAGATTACAATGATAG…
…ACAGATTACAATTACAGATTACAAACCCAG…
…ACAGATTACAATTTCAGATTACACTGACAG…
…ACAGATTACAATTAAAGATTACACAGACAG…
…ACAGATTACAATTACCGATTACAAACACAG…
…ACAGATTACAATTACAAATTACACACACAG…
…ACAGATTACAATTACAGTTACACACCCAG…
…TACAGATTAGATTACAGATTACACACATAG
…TACAGATTAGATTACAGATTACACACACAG
---------- 70 KB ---------
REHH = 4.05
Which REHH value to use?
Extend to the right
Which REHH value to use?
Extend to the left
…ACAGATTACAGTTACAGATTACAAACACGC…
…ACAGATTACAAATACAGATTACAATGATAG…
…ACAGATTACAATTACAGATTACAAACCCAG…
…ACAGATTACAATTTCAGATTACACTGACAG…
…ACAGATTACAATTAAAGATTACACAGACAG…
…ACAGATTACAATTACCGATTACAAACACAG…
…ACAGATTACAATTACAAATTACACACACAG…
…ACAGATTACAATTACAGTTACACACCCAG…
…TACAGATTAGATTACAGATTACACACATAG
…TACAGATTAGATTACAGATTACACACACAG
---------- 70 KB ---------
REHH = 4.05
REHH: Problem #2
REHH score is heavily biased by allele frequencies
Must normalize
P(REHH | Allele Freq.)
REHH: Problem #3
Not possible to detect selection in high-frequency alleles
Solution requires a cross-population approach (discussed later)
REHH Overview
Leaves a lot to be desired
Picking the maximum is arbitrary
Why not the mean REHH score?
Biased by allele frequency
ln(REHH | allele freq) ~ normal distribution
Still widely used and published with
Site-specific EHH[9]
Roughly the average of the two alleles' EHH values (weights: squared allele frequencies)
Approximate magnitude of EHH at the focal SNP
Selection Index iHS
iHS: Sabeti et al. (2007)[8]
Integrate EHH over all positions, then compare!
Integrated Haplotype Score (iHS)
Unstandardized iHS = $\ln$ of the ratio between $\int_{y}^{x} EHH_D$ and $\int_{y}^{x} EHH_A$
y = bwd distance
x = fwd distance
$EHH_D$ = EHH of the derived allele
$EHH_A$ = EHH of the ancestral allele
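The integration step can be sketched numerically with the trapezoidal rule. The function names, the marker distances and the EHH values below are all hypothetical; which allele goes into the numerator is left to the caller, because the orientation of the ratio is a convention.

```python
from math import log


def integrated_ehh(distances, ehh_values):
    """Trapezoidal area under an EHH curve.

    distances:  marker positions relative to the core, in increasing
                order (negative = backward, positive = forward)
    ehh_values: EHH of one allele at each of those positions
    """
    area = 0.0
    for k in range(1, len(distances)):
        width = distances[k] - distances[k - 1]
        area += width * (ehh_values[k] + ehh_values[k - 1]) / 2
    return area


def unstandardized_ihs(ihh_numerator, ihh_denominator):
    """Log ratio of the two alleles' integrated EHH."""
    return log(ihh_numerator / ihh_denominator)


# Hypothetical EHH curves around a core at distance 0.
distances = [-2, -1, 0, 1, 2]
ehh_fast_decay = [0.0, 0.5, 1.0, 0.5, 0.0]
ehh_slow_decay = [0.5, 0.75, 1.0, 0.75, 0.5]

area_fast = integrated_ehh(distances, ehh_fast_decay)
area_slow = integrated_ehh(distances, ehh_slow_decay)
score = unstandardized_ihs(area_slow, area_fast)
print(area_fast, area_slow, score)
```

An allele whose haplotypes stay identical over a longer distance has the larger area; identical curves give a score of exactly 0.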
…ACAGATTACAGTTACAGATTACAAACACGC…
…ACAGATTACAAATACAGATTACAATGATAG…
…ACAGATTACAATTACAGATTACAAACCCAG…
…ACAGATTACAATTTCAGATTACACTGACAG…
…ACAGATTACAATTAAAGATTACACAGACAG…
…ACAGATTACAATTACCGATTACAAACACAG…
…ACAGATTACAATTACAAATTACACACACAG…
…ACAGATTACAATTACAGTTACAACACCCAG…
…TACAGATTAGATTACAGATTACACACATAG
…TACAGATTAGATTACAGATTACACACACAG
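Integrated Haplotype Score (iHS)
Area under the EHH curve (backward + forward):
0.7 + 0.5 = 1.2
4.0 + 4.4 = 8.4
Unstandardized iHS
ln(8.4/3.2) = 0.419
These printed numbers are not mutually consistent: the first area sums to 1.2, not 3.2, and 0.419 is $\log_{10}(8.4/3.2)$; $\ln(8.4/3.2) \approx 0.965$ and $\ln(8.4/1.2) \approx 1.946$.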
iHS Characteristics
If both alleles have the same AUC, iHS is zero
Large negative values indicate selection of the allele in the denominator
Large positive values indicate selection of the allele in the numerator
Still heavily biased by allele frequency!
Z-score normalization
Integrated Haplotype Score (iHS)
$iHS = \frac{\text{Unstandardized iHS} - E(\text{iHS} \mid \text{Allele Freq.})}{SD(\text{iHS} \mid \text{Allele Freq.})}$
E(iHS | Allele Freq.): estimated from the empirical distribution
SD(iHS | Allele Freq.): estimated from the empirical distribution
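One common way to obtain the empirical mean and SD is to group SNPs into allele-frequency bins and standardize within each bin. The sketch below assumes that approach; the function name, the bin width, the use of the population SD and the example scores are all illustrative choices, not taken from the slides.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_frequency(raw_scores, allele_freqs, bin_width=0.1):
    """Z-score each raw score against SNPs of similar allele frequency.

    raw_scores:   unstandardized iHS per SNP
    allele_freqs: allele frequency per SNP, same order
    bin_width:    width of the frequency bins (an assumption)
    """
    bins = defaultdict(list)
    for idx, freq in enumerate(allele_freqs):
        bins[int(freq / bin_width)].append(idx)

    z = [0.0] * len(raw_scores)
    for members in bins.values():
        values = [raw_scores[i] for i in members]
        mu = mean(values)
        sd = pstdev(values)
        for i in members:
            # A bin with no spread carries no information; leave z at 0.
            z[i] = (raw_scores[i] - mu) / sd if sd > 0 else 0.0
    return z


# Hypothetical scores: rare-allele SNPs and common-allele SNPs differ in scale.
raw = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
freqs = [0.11, 0.12, 0.13, 0.51, 0.52, 0.53]
z_scores = standardize_by_frequency(raw, freqs)
print(z_scores)
```

After standardization the two frequency classes are on the same scale, so a score is judged only against SNPs of comparable allele frequency.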
iHS Overview
iHS and REHH are EHH-based methods to detect positive selection
iHS outperforms REHH at specific allele frequencies
Neither method completely outperforms the other
iHS: Problem #1
Still can't detect selection in high-frequency (old) alleles
Relatively high EHH values are not present in high-frequency (old) alleles!
Use a reference population
If positive selection did not take place in the reference population, EHH is high relative to it
Selection Index xp-EHH
XP-EHH: Sabeti et al. (2007)[8]
Compare the integrated EHH of the same allele between populations!
Cross Population EHH (XP‐EHH)
AATTACAGATTACA AACACGC 10
AATTACAGATTACA ATGATAG 8
AATTACAGATTACA AACCCAG 7
AATTACAGATTACA CTGACAG 5
AATTACAGATTACA CAGACAG 3
AATTACAGATTACA AACACAG 6
AATTACAGATTACA CACACAG 4
AATTACAGATTACA CACCCAG 7
Same allele but different population
AATTACAGATTACA CACATAG 20
AATTACAGATTACA CACACAG 30
Integrated EHH in the two populations: 3.3 and 0.5
XP-EHH = ln(3.3/0.5) = 1.89, followed by Z-score normalization
Integrate EHH over distance from the allele
Calculated for fwd/rev sides independently
Integrate until EHH = 0.04 in each population
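The truncated integration and the log ratio can be sketched as follows. This is a simplified illustration: it walks outward on one side of the core and stops at the first marker whose EHH is below the cutoff, whereas real implementations handle the boundary in more detail. The function name and the EHH curves are hypothetical.

```python
from math import log


def integrate_until_cutoff(distances, ehh_values, cutoff=0.04):
    """Trapezoidal area under EHH on one side of the core.

    distances:  distances from the core, ordered outward
    ehh_values: EHH at each of those distances
    Integration stops at the first marker whose EHH is below the cutoff.
    """
    area = 0.0
    for k in range(1, len(distances)):
        if ehh_values[k] < cutoff:
            break
        width = abs(distances[k] - distances[k - 1])
        area += width * (ehh_values[k] + ehh_values[k - 1]) / 2
    return area


# Hypothetical one-sided EHH curves for the same allele in two populations.
distances = [0, 1, 2, 3]
area_pop1 = integrate_until_cutoff(distances, [1.0, 0.5, 0.1, 0.02])
area_pop2 = integrate_until_cutoff(distances, [1.0, 0.2, 0.03, 0.0])
xp_ehh_raw = log(area_pop1 / area_pop2)
print(area_pop1, area_pop2, xp_ehh_raw)

# The slide's own numbers: integrated EHH of 3.3 versus 0.5.
slide_value = log(3.3 / 0.5)
print(round(slide_value, 2))
```

As with iHS, the raw log ratios would then be standardized across the genome to obtain the final score.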
Overview
REHH and iHS are more or less complementary
Each is better at detecting positive selection at different allele frequencies
XP-EHH
Can detect positive selection in high-frequency alleles
Susceptible to variation in recombination rate between populations
Final Verdict: REHH vs iHS vs XP‐EHH
REHH
iHS test
XP‐EHH
Rsb[9]
Another index for comparing populations.
Compares populations only (not alleles).
iES: per locus, the average of the two alleles' integrated EHH
Compares the approximate degree of selection at a locus between populations.
Practice
FST
hierfstat[3]
PER3 gene in HGDP (Human Genome Diversity Panel): 289 SNPs & 7 populations
EHH, iHS
rehh[2]
Examples bundled with the package
References
[1] Cockerham, C. C. (1969). Variance of gene frequencies. Evolution, pages 72–84.
[2] Gautier, M. and Vitalis, R. (2012). rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics, 28(8):1176–1177.
[3] Goudet, J. (2005). Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5(1):184–186.
[4] Hamilton, M. (2011). Population Genetics. John Wiley & Sons.
[5] Holsinger, K. E. and Weir, B. S. (2009). Genetics in geographically structured populations: defining, estimating and interpreting FST. Nature Reviews Genetics, 10(9):639–650.
[6] Huerta-Sánchez, E., Jin, X., Bianba, Z., Peter, B. M., Vinckenbosch, N., Liang, Y., Yi, X., He, M., Somel, M., Ni, P., et al. (2014). Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA. Nature, 512(7513):194–197.
[7] Sabeti, P. C., Reich, D. E., Higgins, J. M., Levine, H. Z., Richter, D. J., Schaffner, S. F., Gabriel, S. B., Platko, J. V., Patterson, N. J., McDonald, G. J., et al. (2002). Detecting recent positive selection in the human genome from haplotype structure. Nature, 419(6909):832–837.
[8] Sabeti, P. C., Varilly, P., Fry, B., Lohmueller, J., Hostetter, E., Cotsapas, C., Xie, X., Byrne, E. H., McCarroll, S. A., Gaudet, R., et al. (2007). Genome-wide detection and characterization of positive selection in human populations. Nature, 449(7164):913–918.
[9] Tang, K., Thornton, K. R., and Stoneking, M. (2007). A new approach for using genome scans to detect recent positive selection in the human genome. PLoS Biology, 5(7):e171.
END
Email : secondmath85@gmail.com
Office: (02)880-2743
H.P: 010-9192-5385