COMPREHENSIVE CATALOG OF
STATISTICAL FORMULAE,
ALGORITHMS AND SOFTWARE – A
STEP TOWARDS GOOD STATISTICS
PRACTICE IN FORENSIC GENETICS
Nikita N. Khromov-Borisov,
Andrew G. Smolyanitsky
Forensic Medicine Bureau of Leningrad District
Saint Petersburg, Russia
Nikita.KhromovBorisov@gmail.com
Andrew.Smolyanitsky@yandex.ru
Quotations of the day
If your experiment requires statistics,
then you ought to have done a better experiment
Ernest Rutherford
Statistical thinking will one day be as necessary
for efficient citizenship as the ability to read and write
Herbert George Wells
Those who ignore Statistics are condemned to reinvent it
Bradley Efron
If Experimentation is the Queen of the Sciences,
then Statistical Methods must be regarded
as Guardians of the Royal Virtue
Myron Tribus
GSP – Good Statistics
Practice is what we need
Obviously, statistical methods must in their turn be blameless and
perfect.
So there is an urgent need for a comprehensive catalog of
carefully checked and approved formulae,
as well as the corresponding algorithms and software.
Unfortunately, some formulae were initially published with errors, which are
then reproduced in subsequent sources.
Example of corrections:
Clayton T. M., Foreman L. A., Carracedo A.
FORENSIC SCIENCE INTERNATIONAL
Vol. 125: No. 2-3 p. 284-284, 2002
Motherless case in paternity testing by Lee et al.
Elements of the Statistical Design of
Experiment
Some formulae are in rare use or forgotten.
Example: Chakraborty’s formula for the sample size required
for a representative (reliable, “saturated”)
reference population sample. Human Biology 64 (1992)
141-159:
\[ N_{\min} = \frac{\ln\left[1 - (1 - \alpha)^{1/r}\right]}{4 \ln(1 - P_{\min})} \]
N_min - minimum number of independent individuals to be analyzed,
α - probability of error,
r - number of alleles revealed by the system,
P_min - minimum allele frequency.
Minimum sample sizes required
for the reference population data
Minimum allele frequency, P_min | No. of alleles, r | Error, α | Sample size (no. of individuals), N_min
0.01 | 2 - 25 | 0.001 - 0.0001 | 190 - 310
0.005 | 2 - 25 | 0.001 - 0.0001 | 380 - 620
0.001 | 2 - 25 | 0.001 - 0.0001 | 1900 - 3100
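The sample-size formula can be checked numerically against the table. The following is a minimal sketch, not the authors' software; the function name `min_sample_size` and its argument names are chosen here for illustration. It evaluates the formula exactly as written on the slide, with the factor 4 in the denominator.

```python
import math


def min_sample_size(p_min, r, alpha):
    """Evaluate N_min = ln[1 - (1 - alpha)^(1/r)] / (4 ln(1 - p_min)).

    p_min : minimum allele frequency of interest
    r     : number of alleles revealed by the system
    alpha : probability of error
    """
    # 1 - (1 - alpha)^(1/r), computed with expm1/log1p to stay accurate
    # when alpha is very small.
    miss = -math.expm1(math.log1p(-alpha) / r)
    return math.log(miss) / (4.0 * math.log1p(-p_min))


# Corners of the table: fewest alleles with the larger error,
# most alleles with the smaller error.
for p_min in (0.01, 0.005, 0.001):
    low = min_sample_size(p_min, r=2, alpha=0.001)
    high = min_sample_size(p_min, r=25, alpha=0.0001)
    print(p_min, round(low, 1), round(high, 1))
```

The computed values agree with the tabulated ranges up to the rounding used on the slide.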
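Template for paternity testing
Mother | Child | Tested man | PI
JK | JK | JJ, JK | 1/(p_J + p_K)
JK | JK | JW | 0.5/(p_J + p_K)
JJ | JJ | JJ | 1/p_J
JK | JJ | JJ | 1/p_J
KK | JK | JJ | 1/p_J
KW | JK | JJ | 1/p_J
JJ, JK | JJ | JK, JW, JZ | 0.5/p_J
KK, KW | JK | JK, JW, JZ | 0.5/p_J
Obligate paternal alleles are shown in red on the slide.
Genotyping errors are assumed to be excluded.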
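The entries of such a template can be derived from Mendelian transmission probabilities. The sketch below is an illustration under stated assumptions, not the authors' method: it assumes a mother-child-tested-man trio, no mutation, no genotyping error and Hardy-Weinberg proportions for the random man. Genotypes are two-letter strings, and the function names and the allele frequencies in the usage lines are hypothetical.

```python
def transmit(genotype, allele):
    """Probability that a parent with this genotype passes on `allele`."""
    return genotype.count(allele) / 2


def paternity_index(mother, child, man, freq):
    """PI = P(child | mother, tested man is father) / P(child | mother, random man)."""
    a, b = child
    if a == b:
        x = transmit(mother, a) * transmit(man, a)
        y = transmit(mother, a) * freq[a]
    else:
        # The child's two alleles can come from the parents in either order.
        x = transmit(mother, a) * transmit(man, b) + transmit(mother, b) * transmit(man, a)
        y = transmit(mother, a) * freq[b] + transmit(mother, b) * freq[a]
    if y == 0:
        raise ValueError("child genotype is incompatible with the mother")
    return x / y


# Hypothetical allele frequencies, for illustration only.
freq = {"J": 0.1, "K": 0.2, "W": 0.3, "Z": 0.4}
print(paternity_index("JK", "JK", "JJ", freq))  # compare with 1/(pJ + pK)
print(paternity_index("JK", "JK", "JW", freq))  # compare with 0.5/(pJ + pK)
print(paternity_index("KW", "JK", "JZ", freq))  # compare with 0.5/pJ
```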
C. C. Li and A. Chakravarti
alternatives
Paternity probability based on Nonexclusion
\[ W = \frac{P_0}{P_0 + (1 - P_0)(1 - E_1)\cdots(1 - E_t)} \]
E_i – probability of exclusion for the i-th test (i = 1, …, t)
P0 can be estimated from long-term records
Am. J. Hum. Genet. 43 (1) 197-205 (1988)
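The nonexclusion formula is straightforward to evaluate. A minimal sketch follows; the function name `nonexclusion_probability` and the numbers in the usage line are illustrative assumptions, not values from the cited paper.

```python
import math


def nonexclusion_probability(p0, exclusion_probs):
    """W = P0 / (P0 + (1 - P0) * product of (1 - E_i)).

    p0              : prior probability of paternity (e.g. from long-term records)
    exclusion_probs : exclusion probabilities E_1..E_t of the tests
                      in which the man was not excluded
    """
    not_excluded = math.prod(1.0 - e for e in exclusion_probs)
    return p0 / (p0 + (1.0 - p0) * not_excluded)


# Hypothetical example: prior 0.5 and two tests, each excluding half of non-fathers.
print(nonexclusion_probability(0.5, [0.5, 0.5]))
```

With no tests at all the product is empty and W reduces to the prior P0, which is a quick sanity check on the implementation.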
Coincidental DNA matches
Match probability as a property of a locus:
\[ M_0 = 2\left(\sum_i p_i^2\right)^2 - \sum_i p_i^4 \]
First principles: no prior knowledge is required
Li C.C. Hum. Biol. 68 (1996) 167-184
Won the Gabriel W. Lasker Award as the best paper
of the year in Human Biology
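The locus match probability equals the sum of squared genotype frequencies under Hardy-Weinberg proportions, which gives an independent check of the closed form. The sketch below computes it both ways; the function names and the example frequencies are illustrative assumptions.

```python
from itertools import combinations


def match_probability(freqs):
    """Closed form: M0 = 2 * (sum p_i^2)^2 - sum p_i^4."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 2.0 * s2 ** 2 - s4


def match_probability_by_genotypes(freqs):
    """Direct form: sum of squared Hardy-Weinberg genotype frequencies."""
    homozygotes = sum((p * p) ** 2 for p in freqs)
    heterozygotes = sum((2.0 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


# Hypothetical allele frequencies at one locus (they sum to 1).
freqs = [0.5, 0.3, 0.2]
print(match_probability(freqs), match_probability_by_genotypes(freqs))
```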
Rare allele frequency
estimation
Commonly, from a reference sample: p_i = n_i/N
When, however, a stain and a suspect are independent homozygotes A_iA_i, then
p_i = (n_i + 4)/(N + 4)
If they are independent heterozygotes A_iA_j, then
p_i = (n_i + 2)/(N + 2) and p_j = (n_j + 2)/(N + 2)
Let the size of a reference sample be N = 1000 and the count of a rare
allele A_i be n_i = 1, so p_i = n_i/N = 0.001 and p_ii = p_i^2 = 0.000001
If the stain and the suspect are homozygotes A_iA_i, then p_i becomes
p_i = (1 + 4)/(1000 + 4) ≈ 0.005 and p_ii ≈ 0.000025
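The arithmetic of this example can be reproduced in a few lines. This is a sketch only; the function name `adjusted_allele_frequency` and its `added` parameter (the number of allele copies contributed by the stain and the suspect) are naming choices made here.

```python
def adjusted_allele_frequency(n_i, n_total, added=0):
    """Allele frequency after adding `added` observed copies to the reference sample."""
    return (n_i + added) / (n_total + added)


n_i, n_total = 1, 1000

p_plain = adjusted_allele_frequency(n_i, n_total)          # plain estimate n_i / N
p_hom = adjusted_allele_frequency(n_i, n_total, added=4)   # both profiles homozygous A_iA_i
p_het = adjusted_allele_frequency(n_i, n_total, added=2)   # both profiles heterozygous A_iA_j

print(p_plain, p_plain ** 2)
print(p_hom, p_hom ** 2)
print(p_het)
```

In this example the homozygote frequency rises from about 1 in a million to about 25 in a million, which shows how strongly the adjustment matters for rare alleles.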
Forensic genetics software
Allelix http://www.allelix.net/
BDgen dbgen@yahoo.com.ar
DNAdacto, mDNAbase gavriley@krinc.ru
DNAmix, EasyDNA: EasyPA, EasyPAnt, EasyIN, EasyMISS/EasyKIN
http://www.hku.hk/statistics/staff/wingfung/countdown/dnamix.html
DNAmix2
ftp://statgen.ncsu.edu/pub/storey/DNAMIXv2/dos/dnamix2.exe
DNA-view http://dna-view.com/
EasyPat, Patern
http://www.uni-kiel.de/medinfo/mitarbeiter/krawczak/download/index.html
Familias
http://www.math.chalmers.se/~mostad/familias/familias.zip
FCalc bolon@caltech.edu www.its.caltech.edu/~bolon
GRAPE serge@star.net.
Identity, NewPat5 DadShare
http://www.zoo.cam.ac.uk/zoostaff/amos/
PARENTE http://www2.ujf-grenoble.fr/leca/download/PARENTE/PARENTE.zip
PATER2 spena@dcc.ufmg.br
PATRI
http://www.bscb.cornell.edu/Homepages/Rasmus_Nielsen/mdiv/mdiv.exe
PedCheck
http://watson.hgen.pitt.edu/register/docs/pedcheck.html
PowerMarker http://152.14.14.57/
PowerStats http://www.promega.com/geneticidtools/
ProbMax http://www.uoguelph.ca/~rdanzman/software/
Relative hhg2@columbia.edu
SPUR nina.fukshansky@gmx.de
STRLab http://strlab.co.za/
Population genetics software
Arlequin http://anthro/unige.ch/arlequin
CERVUS http://helios.bto.ed.ac.uk/evolgen
Con~Struct andy.overall@ed.ac.uk
FSTAT http://www.unil.ch/izea/softwares/fstat.html
FSTMET, HWMET http://www.reading.ac.uk/~snsbalng/
GDA http://lewis.eeb.uconn.edu/lewishome/software.html
GEN lazzeroni@stanford.edu
GENEPOP ftp://ftp.cefe.cnrs-mop.fr/genepop
GENEPOP on Web
http://wbiomed.curtin.edu.au/genepop/index.html
GENETIX http://www.univ-montp2.fr/~genetix/genetix.htm
GeneKonv http://www.rrz.uni-hamburg.de/OekoGenetik/software.htm
HWE
http://www.biology.ualberta.ca/old_site/jbrzusto/hwenj.html
PopGen32 Http://www.ualberta.ca/~fyeh
Population http://www.cnrs-gif.fr/pge/bioinfo/populations/index.php?lang=fr
PowerMarker http://www.powermarker.net
PowerStats http://www.promega.com/geneticidtools/
TFPGA http://bioweb.usu.edu/mpmbio/tfpga.htm
Software online
Allelix http://www.allelix.net/
GENEPOP on Web
http://wbiomed.curtin.edu.au/genepop/index.html
ProfilerPlus Random Match Probability Calculator
http://www.csfs.ca/pplus/profiler.htm
Different tests (even exact) can lead to
different conclusions
Locus: vWA, Russian Caucasians, Hardy-Weinberg equilibrium test
Test | P-value or CL | Software
χ2 | 0.106 | ChiHW, GDA, PowerMarker, etc.
Corrected χ2 | 0.092 | GEN
Fisher’s probability, Guo-Thompson alg. | 0.026 | Arlequin, GDA, GENEPOP, HWE, TFPGA, etc.
G2 asympt. | 0.163 | POPGENE
Fis | 0.141 | FSTAT, GENETIX
Fis, 95% cred. lim. | -0.044, -0.015 | HWMET
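How two closely related tests can disagree at the 5% level is easy to demonstrate on a toy case. The sketch below is a simplification under stated assumptions: it treats a locus with only two alleles (vWA itself is multi-allelic), uses hypothetical genotype counts rather than the Russian data, and takes a Yates-type continuity correction as one possible form of "corrected" χ2, which may differ from the correction implemented in GEN.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb, continuity=False):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions, two alleles.

    Both alleles must be present in the sample. Returns (statistic, p-value);
    the test has one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    shift = 0.5 if continuity else 0.0
    chi2 = sum(max(abs(o - e) - shift, 0.0) ** 2 / e
               for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical genotype counts AA, AB, BB.
print(hwe_chi_square(30, 40, 30))
print(hwe_chi_square(30, 40, 30, continuity=True))
```

For these counts the uncorrected test falls just below 0.05 and the corrected one just above it, so the verdict depends on the choice of test.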
Algorithms
Modern software implements exact nonparametric
approaches and modern Bayesian ideology and
methodology.
Their implementation requires sophisticated
computational algorithms and facilities.
This raises some new problems, e.g.
the problem of convergence for procedures
based on Markov chain Monte Carlo (MCMC)
algorithms.
familiarize yourself with the method,
including convergence diagnostics
K. L. Ayres, D. J. Balding
P-values produced by MCMC procedure
depend on the number of
randomization steps:
10^4 steps: P = 0.7815 ± 0.0008
10^5 steps: P = 0.2681 ± 0.0005
10^6 steps: P = 0.373 ± 0.012
10^7 steps: P = 0.424 ± 0.006
10^8 steps: P = 0.460 ± 0.003
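The dependence of a simulated P-value on the number of randomization steps can be explored with a much simpler procedure than MCMC. The sketch below is a plain Monte Carlo permutation test, not the Guo-Thompson Markov chain algorithm and not the program that produced the figures above: alleles are shuffled among individuals and the number of homozygotes serves as a one-sided test statistic. The genotype counts, function names and seed are hypothetical, and the standard error formula assumes independent replicates, which does not hold for successive states of a Markov chain.

```python
import math
import random


def count_homozygotes(genotypes):
    return sum(1 for a, b in genotypes if a == b)


def permutation_p_value(genotypes, n_steps, seed=0):
    """One-sided Monte Carlo P-value for homozygote excess, with its standard error."""
    rng = random.Random(seed)
    alleles = [allele for genotype in genotypes for allele in genotype]
    observed = count_homozygotes(genotypes)
    hits = 0
    for _ in range(n_steps):
        rng.shuffle(alleles)  # random re-pairing of alleles into genotypes
        shuffled = zip(alleles[0::2], alleles[1::2])
        if count_homozygotes(shuffled) >= observed:
            hits += 1
    p = (hits + 1) / (n_steps + 1)  # add-one estimate, never exactly zero
    se = math.sqrt(p * (1.0 - p) / n_steps)
    return p, se


# Hypothetical sample: 27 AA, 46 AB, 27 BB.
sample = [("A", "A")] * 27 + [("A", "B")] * 46 + [("B", "B")] * 27
for n_steps in (100, 1000, 10000):
    print(n_steps, permutation_p_value(sample, n_steps, seed=1))
```

Repeating the run with several seeds and step counts, and checking that the estimates agree within their standard errors, is one simple consistency check; for MCMC procedures proper convergence diagnostics are still needed.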
Conclusion
First principle of GSP
It should be good statistics practice
to analyze the data with different
statistical methods and investigate
their consistency.
Acknowledgements
We thank Drs. Karen L. Ayres and David J. Balding,
Laura C. Lazzeroni and Kenneth Lange, and John
Brzustowski
for kindly supplying the executables of their
programs (HWMET, GEN and HWE, respectively).
Many thanks to them and Drs. Angel Carracedo,
Laurent Excoffier, Jerome Goudet, Kejun Liu,
Tristan Marshall, Mark P. Miller, Eleanor Morgan,
Michel Raymond, Francois Rousset, Hans-Georg
Scheil, Bruce S. Weir and Dmitri Zaykin,
the authors of other programs and papers used in
this study, for helpful and fruitful discussion.
Sincere thanks
Drs.
Carsten Hohoff
Edwin Ehrlich
Kurt Trübner
for the invitation, help and financial
support

More Related Content

Similar to Catalog of formulae for forensic genetics ppt

2013.03.26 Bayesian Methods for Modern Statistical Analysis
2013.03.26 Bayesian Methods for Modern Statistical Analysis2013.03.26 Bayesian Methods for Modern Statistical Analysis
2013.03.26 Bayesian Methods for Modern Statistical Analysis
NUI Galway
 
Team 5 imputing_medical_missing_data_ga approach_preseatation
Team 5 imputing_medical_missing_data_ga approach_preseatationTeam 5 imputing_medical_missing_data_ga approach_preseatation
Team 5 imputing_medical_missing_data_ga approach_preseatation
Nafiz Ishtiaque Ahmed
 
A Comparison Of Fitness Scallng Methods In Evolutionary Algorithms
A Comparison Of Fitness Scallng Methods In Evolutionary AlgorithmsA Comparison Of Fitness Scallng Methods In Evolutionary Algorithms
A Comparison Of Fitness Scallng Methods In Evolutionary Algorithms
Tracy Hill
 
Computational Pool-Testing with Retesting Strategy
Computational Pool-Testing with Retesting StrategyComputational Pool-Testing with Retesting Strategy
Computational Pool-Testing with Retesting Strategy
Waqas Tariq
 
Exponential software reliability using SPRT: MLE
Exponential software reliability using SPRT: MLEExponential software reliability using SPRT: MLE
Exponential software reliability using SPRT: MLE
IOSR Journals
 
Basen Network
Basen NetworkBasen Network
Basen Network
guestf7d226
 
Quantitative Studies Group - Item Response Theory Spring 2014.pdf
Quantitative Studies Group - Item Response Theory Spring 2014.pdfQuantitative Studies Group - Item Response Theory Spring 2014.pdf
Quantitative Studies Group - Item Response Theory Spring 2014.pdf
Quinn Lathrop
 
Data analytics to support exposome research course slides
Data analytics to support exposome research course slidesData analytics to support exposome research course slides
Data analytics to support exposome research course slides
Chirag Patel
 
Bayesian Nonparametrics, Applications to biology, ecology, and marketing
Bayesian Nonparametrics, Applications to biology, ecology, and marketingBayesian Nonparametrics, Applications to biology, ecology, and marketing
Bayesian Nonparametrics, Applications to biology, ecology, and marketing
Julyan Arbel
 
50120130405032
5012013040503250120130405032
50120130405032
IAEME Publication
 
London 2008
London 2008London 2008
London 2008
Jonas Ranstam PhD
 
Novelties in social science statistics
Novelties in social science statisticsNovelties in social science statistics
Novelties in social science statistics
Jiri Haviger
 
PSB2016 Computational Microbiology Workshop
PSB2016 Computational Microbiology WorkshopPSB2016 Computational Microbiology Workshop
PSB2016 Computational Microbiology Workshop
Casey Greene
 
Es credit scoring_2020
Es credit scoring_2020Es credit scoring_2020
Es credit scoring_2020
Eero Siljander
 
Sampling methods
Sampling  methodsSampling  methods
Sampling methods
Ashok Kulkarni
 
ppgardner-lecture06-homologysearch.pdf
ppgardner-lecture06-homologysearch.pdfppgardner-lecture06-homologysearch.pdf
ppgardner-lecture06-homologysearch.pdf
Paul Gardner
 
20081206 Biostatistics
20081206 Biostatistics20081206 Biostatistics
20081206 Biostatistics
Chung-Han Yang
 
Heart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining TechniquesHeart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining Techniques
IJRES Journal
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management
inscit2006
 
advanced_statistics.pdf
advanced_statistics.pdfadvanced_statistics.pdf
advanced_statistics.pdf
GerryMakilan2
 

Similar to Catalog of formulae for forensic genetics ppt (20)

2013.03.26 Bayesian Methods for Modern Statistical Analysis
2013.03.26 Bayesian Methods for Modern Statistical Analysis2013.03.26 Bayesian Methods for Modern Statistical Analysis
2013.03.26 Bayesian Methods for Modern Statistical Analysis
 
Team 5 imputing_medical_missing_data_ga approach_preseatation
Team 5 imputing_medical_missing_data_ga approach_preseatationTeam 5 imputing_medical_missing_data_ga approach_preseatation
Team 5 imputing_medical_missing_data_ga approach_preseatation
 
A Comparison Of Fitness Scallng Methods In Evolutionary Algorithms
A Comparison Of Fitness Scallng Methods In Evolutionary AlgorithmsA Comparison Of Fitness Scallng Methods In Evolutionary Algorithms
A Comparison Of Fitness Scallng Methods In Evolutionary Algorithms
 
Computational Pool-Testing with Retesting Strategy
Computational Pool-Testing with Retesting StrategyComputational Pool-Testing with Retesting Strategy
Computational Pool-Testing with Retesting Strategy
 
Exponential software reliability using SPRT: MLE
Exponential software reliability using SPRT: MLEExponential software reliability using SPRT: MLE
Exponential software reliability using SPRT: MLE
 
Basen Network
Basen NetworkBasen Network
Basen Network
 
Quantitative Studies Group - Item Response Theory Spring 2014.pdf
Quantitative Studies Group - Item Response Theory Spring 2014.pdfQuantitative Studies Group - Item Response Theory Spring 2014.pdf
Quantitative Studies Group - Item Response Theory Spring 2014.pdf
 
Data analytics to support exposome research course slides
Data analytics to support exposome research course slidesData analytics to support exposome research course slides
Data analytics to support exposome research course slides
 
Bayesian Nonparametrics, Applications to biology, ecology, and marketing
Bayesian Nonparametrics, Applications to biology, ecology, and marketingBayesian Nonparametrics, Applications to biology, ecology, and marketing
Bayesian Nonparametrics, Applications to biology, ecology, and marketing
 
50120130405032
5012013040503250120130405032
50120130405032
 
London 2008
London 2008London 2008
London 2008
 
Novelties in social science statistics
Novelties in social science statisticsNovelties in social science statistics
Novelties in social science statistics
 
PSB2016 Computational Microbiology Workshop
PSB2016 Computational Microbiology WorkshopPSB2016 Computational Microbiology Workshop
PSB2016 Computational Microbiology Workshop
 
Es credit scoring_2020
Es credit scoring_2020Es credit scoring_2020
Es credit scoring_2020
 
Sampling methods
Sampling  methodsSampling  methods
Sampling methods
 
ppgardner-lecture06-homologysearch.pdf
ppgardner-lecture06-homologysearch.pdfppgardner-lecture06-homologysearch.pdf
ppgardner-lecture06-homologysearch.pdf
 
20081206 Biostatistics
20081206 Biostatistics20081206 Biostatistics
20081206 Biostatistics
 
Heart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining TechniquesHeart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining Techniques
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management
 
advanced_statistics.pdf
advanced_statistics.pdfadvanced_statistics.pdf
advanced_statistics.pdf
 

More from Nikita Khromov-Borisov

парадоксы спортгеномики 2015
парадоксы спортгеномики 2015парадоксы спортгеномики 2015
парадоксы спортгеномики 2015
Nikita Khromov-Borisov
 
химия днк для генетиков 2015
химия днк для генетиков 2015химия днк для генетиков 2015
химия днк для генетиков 2015
Nikita Khromov-Borisov
 
парадоксы геномной медицины 2015
парадоксы геномной медицины 2015парадоксы геномной медицины 2015
парадоксы геномной медицины 2015
Nikita Khromov-Borisov
 
Harmonizing statistical evidences and predictions
Harmonizing statistical evidences and predictionsHarmonizing statistical evidences and predictions
Harmonizing statistical evidences and predictions
Nikita Khromov-Borisov
 
Evolutionary arguments in medical genomics
Evolutionary arguments in medical genomicsEvolutionary arguments in medical genomics
Evolutionary arguments in medical genomics
Nikita Khromov-Borisov
 
кризис воспроизводимости в биомедицине Rus 2014
кризис воспроизводимости в биомедицине Rus 2014кризис воспроизводимости в биомедицине Rus 2014
кризис воспроизводимости в биомедицине Rus 2014
Nikita Khromov-Borisov
 
Prematurity of genetic testing of predispositions rus 2014
Prematurity of genetic testing of predispositions rus 2014Prematurity of genetic testing of predispositions rus 2014
Prematurity of genetic testing of predispositions rus 2014
Nikita Khromov-Borisov
 
Syndrome of statistical leniency ppt
Syndrome of statistical leniency pptSyndrome of statistical leniency ppt
Syndrome of statistical leniency ppt
Nikita Khromov-Borisov
 
Reproducibility and predictivity in the genetics of predispositions ppt 2013
Reproducibility and predictivity in the genetics of predispositions ppt 2013Reproducibility and predictivity in the genetics of predispositions ppt 2013
Reproducibility and predictivity in the genetics of predispositions ppt 2013
Nikita Khromov-Borisov
 
Population thinking in studies of genetic predispositions ppt
Population thinking in studies of genetic predispositions pptPopulation thinking in studies of genetic predispositions ppt
Population thinking in studies of genetic predispositions ppt
Nikita Khromov-Borisov
 
Modern free biostatistical software ppt
Modern free biostatistical software pptModern free biostatistical software ppt
Modern free biostatistical software ppt
Nikita Khromov-Borisov
 
Half a century with the central dogma of molecular biology ppt
Half a century with the central dogma of molecular biology pptHalf a century with the central dogma of molecular biology ppt
Half a century with the central dogma of molecular biology ppt
Nikita Khromov-Borisov
 
Genetics of predispositions ppt
Genetics of predispositions pptGenetics of predispositions ppt
Genetics of predispositions ppt
Nikita Khromov-Borisov
 
Format for the population data in forensic genetics ppt
Format for the population data in forensic genetics pptFormat for the population data in forensic genetics ppt
Format for the population data in forensic genetics ppt
Nikita Khromov-Borisov
 
Evolutionary medical genomics ppt 2013
Evolutionary medical genomics ppt 2013Evolutionary medical genomics ppt 2013
Evolutionary medical genomics ppt 2013
Nikita Khromov-Borisov
 
Biometrical problems in population studies ppt 2004
Biometrical problems in population studies ppt 2004Biometrical problems in population studies ppt 2004
Biometrical problems in population studies ppt 2004Nikita Khromov-Borisov
 
Joshua lederberg ppt
Joshua lederberg pptJoshua lederberg ppt
Joshua lederberg ppt
Nikita Khromov-Borisov
 
Reproducibility of results in the genetics of predisposition eng 2014
Reproducibility of results in the genetics of predisposition eng 2014Reproducibility of results in the genetics of predisposition eng 2014
Reproducibility of results in the genetics of predisposition eng 2014
Nikita Khromov-Borisov
 

More from Nikita Khromov-Borisov (18)

парадоксы спортгеномики 2015
парадоксы спортгеномики 2015парадоксы спортгеномики 2015
парадоксы спортгеномики 2015
 
химия днк для генетиков 2015
химия днк для генетиков 2015химия днк для генетиков 2015
химия днк для генетиков 2015
 
парадоксы геномной медицины 2015
парадоксы геномной медицины 2015парадоксы геномной медицины 2015
парадоксы геномной медицины 2015
 
Harmonizing statistical evidences and predictions
Harmonizing statistical evidences and predictionsHarmonizing statistical evidences and predictions
Harmonizing statistical evidences and predictions
 
Evolutionary arguments in medical genomics
Evolutionary arguments in medical genomicsEvolutionary arguments in medical genomics
Evolutionary arguments in medical genomics
 
кризис воспроизводимости в биомедицине Rus 2014
кризис воспроизводимости в биомедицине Rus 2014кризис воспроизводимости в биомедицине Rus 2014
кризис воспроизводимости в биомедицине Rus 2014
 
Prematurity of genetic testing of predispositions rus 2014
Prematurity of genetic testing of predispositions rus 2014Prematurity of genetic testing of predispositions rus 2014
Prematurity of genetic testing of predispositions rus 2014
 
Syndrome of statistical leniency ppt
Syndrome of statistical leniency pptSyndrome of statistical leniency ppt
Syndrome of statistical leniency ppt
 
Reproducibility and predictivity in the genetics of predispositions ppt 2013
Reproducibility and predictivity in the genetics of predispositions ppt 2013Reproducibility and predictivity in the genetics of predispositions ppt 2013
Reproducibility and predictivity in the genetics of predispositions ppt 2013
 
Population thinking in studies of genetic predispositions ppt
Population thinking in studies of genetic predispositions pptPopulation thinking in studies of genetic predispositions ppt
Population thinking in studies of genetic predispositions ppt
 
Modern free biostatistical software ppt
Modern free biostatistical software pptModern free biostatistical software ppt
Modern free biostatistical software ppt
 
Half a century with the central dogma of molecular biology ppt
Half a century with the central dogma of molecular biology pptHalf a century with the central dogma of molecular biology ppt
Half a century with the central dogma of molecular biology ppt
 
Genetics of predispositions ppt
Genetics of predispositions pptGenetics of predispositions ppt
Genetics of predispositions ppt
 
Format for the population data in forensic genetics ppt
Format for the population data in forensic genetics pptFormat for the population data in forensic genetics ppt
Format for the population data in forensic genetics ppt
 
Evolutionary medical genomics ppt 2013
Evolutionary medical genomics ppt 2013Evolutionary medical genomics ppt 2013
Evolutionary medical genomics ppt 2013
 
Biometrical problems in population studies ppt 2004
Biometrical problems in population studies ppt 2004Biometrical problems in population studies ppt 2004
Biometrical problems in population studies ppt 2004
 
Joshua lederberg ppt
Joshua lederberg pptJoshua lederberg ppt
Joshua lederberg ppt
 
Reproducibility of results in the genetics of predisposition eng 2014
Reproducibility of results in the genetics of predisposition eng 2014Reproducibility of results in the genetics of predisposition eng 2014
Reproducibility of results in the genetics of predisposition eng 2014
 

Recently uploaded

aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
MaheshaNanjegowda
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
hozt8xgk
 
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
PsychoTech Services
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
Shashank Shekhar Pandey
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
Scintica Instrumentation
 
Farming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptxFarming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptx
Frédéric Baudron
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
Pests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdfPests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdf
PirithiRaju
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
Advanced-Concepts-Team
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
Aditi Bajpai
 
Modelo de slide quimica para powerpoint
Modelo  de slide quimica para powerpointModelo  de slide quimica para powerpoint
Modelo de slide quimica para powerpoint
Karen593256
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
Sciences of Europe
 

Recently uploaded (20)

aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
 
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc...
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
 
Farming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptxFarming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptx
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
Pests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdfPests of Storage_Identification_Dr.UPR.pdf
Pests of Storage_Identification_Dr.UPR.pdf
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 
Modelo de slide quimica para powerpoint
Modelo  de slide quimica para powerpointModelo  de slide quimica para powerpoint
Modelo de slide quimica para powerpoint
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
 

Catalog of formulae for forensic genetics ppt

  • 1. COMPREHENSIVE CATALOG OF STATISTICAL FORMULAE, ALGORITHMS AND SOFTWARE – A STEP TOWARDS GOOD STATISTICS PRACTICE IN FORENSIC GENETICS Nikita N. Khromov-Borisov,Nikita N. Khromov-Borisov, Andrew G. Smolyanitsky Forensic Medicine Bureau of Leningrad District Saint Petersburg, Russia Nikita.KhromovBorisov@gmail.com Andrew.Smolyanitsky@yandex.ru
  • 2. Quotations of the day
      "If your experiment requires statistics, then you ought to have done a better experiment." (Ernest Rutherford)
      "Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." (Herbert George Wells)
      "Those who ignore Statistics are condemned to reinvent it." (Bradley Efron)
      "If Experimentation is the Queen of the Sciences, then Statistical Methods must be regarded as Guardians of the Royal Virtue." (Myron Tribus)
  • 3. GSP – Good Statistics Practice is what we need. Obviously, in their turn, statistical methods must be blameless and perfect. So there is an urgent need for a comprehensive catalog of carefully checked and approved formulae, as well as the corresponding algorithms and software. Unfortunately, some formulae are initially published with errors, which are then reproduced in subsequent sources. Example of corrections: Clayton T. M., Foreman L. A., Carracedo A. Motherless case in paternity testing by Lee et al. Forensic Science International, Vol. 125, No. 2-3, p. 284, 2002.
  • 4. Elements of the Statistical Design of Experiment. Some formulae are rarely used or forgotten. Example: Chakraborty’s formula for the sample size required to reach representativeness (reliability, “saturation”) of reference population samples, Human Biology 64 (1992) 141-159:
      $N_{\min} = \dfrac{\ln\left[1 - (1 - \alpha)^{1/r}\right]}{4 \ln(1 - P_{\min})}$
      where $N_{\min}$ is the minimum number of independent individuals to be analyzed, $\alpha$ is the probability of error, $r$ is the number of alleles revealed by the system, and $P_{\min}$ is the minimum allele frequency.
  • 5. Minimum sample sizes required for the reference population data
      Minimum allele frequency (Pmin)   No. of alleles (r)   Error (α)         Sample size, no. of individuals (Nmin)
      0.01                              2 - 25               0.001 - 0.0001    190 - 310
      0.005                             2 - 25               0.001 - 0.0001    380 - 620
      0.001                             2 - 25               0.001 - 0.0001    1900 - 3100
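The sample-size formula from slide 4, N_min = ln[1 - (1 - α)^(1/r)] / (4 ln(1 - P_min)), can be evaluated directly to check the ranges tabulated on slide 5. The following is a minimal sketch; the function name `chakraborty_n_min` is hypothetical, and the factor 4 in the denominator is taken from the slide as written rather than verified against the 1992 paper.

```python
import math


def chakraborty_n_min(p_min: float, r: int, alpha: float) -> float:
    """Unrounded N_min = ln[1 - (1 - alpha)**(1/r)] / (4 * ln(1 - p_min)).

    p_min: minimum allele frequency, r: number of alleles,
    alpha: probability of error (symbols as defined on the slide).
    """
    if not (0.0 < p_min < 1.0 and 0.0 < alpha < 1.0 and r >= 1):
        raise ValueError("need 0 < p_min < 1, 0 < alpha < 1 and r >= 1")
    # 1 - (1 - alpha)**(1/r), computed with log1p/expm1 to limit rounding error
    one_minus_root = -math.expm1(math.log1p(-alpha) / r)
    return math.log(one_minus_root) / (4.0 * math.log1p(-p_min))


if __name__ == "__main__":
    for p_min in (0.01, 0.005, 0.001):
        low = chakraborty_n_min(p_min, r=2, alpha=0.001)
        high = chakraborty_n_min(p_min, r=25, alpha=0.0001)
        print(f"Pmin={p_min}: N_min from {low:.1f} to {high:.1f}")
```

With r = 2, α = 0.001 at one end and r = 25, α = 0.0001 at the other, the unrounded values fall just below the rounded ranges quoted in the table, which suggests the table was rounded up.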
  • 6. Template for paternity testing
      Mother JK, child JK; tested man JJ or JK: PI = 1/(pJ + pK); tested man JW: PI = 0.5/(pJ + pK)
      JJ JJ JJ JWJJ JK KK KW JJ JJ JK JK JJ JJ JJ JJ JK JK JK JW JW JW JW JZ; PI: 1/pJ, 0.5/pJ
      Obligate paternal alleles are shown in red. False genotyping is excluded.
  • 7. C. C. Li and A. Chakravarti alternatives. Paternity probability based on nonexclusion:
      $W = \dfrac{P_0}{P_0 + (1 - P_0)(1 - E_1) \cdots (1 - E_t)}$
      $E_i$ is the probability of exclusion for the $i$-th test. $P_0$ can be estimated from long-term records. Am. J. Hum. Genet. 43 (1) 197-205 (1988)
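As an illustration of the nonexclusion formula on this slide, W = P0 / [P0 + (1 - P0)(1 - E1)…(1 - Et)], here is a small sketch. The function name and the input values in the usage lines are hypothetical and chosen only to show the arithmetic.

```python
from math import prod
from typing import Sequence


def paternity_probability_nonexclusion(p0: float, exclusion_probs: Sequence[float]) -> float:
    """W = P0 / (P0 + (1 - P0) * product over tests of (1 - E_i)).

    p0: prior probability of paternity (P0 on the slide),
    exclusion_probs: probability of exclusion E_i for each test.
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    if any(not 0.0 <= e <= 1.0 for e in exclusion_probs):
        raise ValueError("each exclusion probability must lie in [0, 1]")
    # Probability that a non-father escapes exclusion on every test
    nonexclusion = prod(1.0 - e for e in exclusion_probs)
    return p0 / (p0 + (1.0 - p0) * nonexclusion)


if __name__ == "__main__":
    # Hypothetical inputs: prior 0.5 and two tests with exclusion probability 0.5 each
    print(paternity_probability_nonexclusion(0.5, [0.5, 0.5]))
```

With no tests at all the function returns the prior P0 unchanged, and each additional nonexcluding test moves W towards 1.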
  • 8. Coincidental DNA matches. Match probability as a property of a locus: $M_0 = 2\left(\sum_i p_i^2\right)^2 - \sum_i p_i^4$. First principles: no prior knowledge is required. Li C.C. Hum. Biol. 68 (1996) 167-184. Won the Gabriel W. Lasker Award as the best paper of the year in Human Biology.
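The locus match probability M0 = 2(Σ p_i²)² − Σ p_i⁴ can be cross-checked by summing squared genotype frequencies. The sketch below assumes Hardy-Weinberg proportions and two independent individuals, which is the usual reading of this formula but is not stated on the slide. Function names and the allele frequencies in the usage lines are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import Sequence


def match_probability_closed_form(freqs: Sequence[float]) -> float:
    """M0 = 2 * (sum p_i^2)^2 - sum p_i^4."""
    s2 = sum(p * p for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 2.0 * s2 * s2 - s4


def match_probability_by_enumeration(freqs: Sequence[float]) -> float:
    """Sum of squared genotype frequencies under Hardy-Weinberg proportions."""
    total = 0.0
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        # Homozygote frequency p_i^2, heterozygote frequency 2 * p_i * p_j
        genotype = freqs[i] ** 2 if i == j else 2.0 * freqs[i] * freqs[j]
        total += genotype * genotype
    return total


if __name__ == "__main__":
    freqs = [0.2, 0.3, 0.5]  # hypothetical allele frequencies summing to 1
    print(match_probability_closed_form(freqs))
    print(match_probability_by_enumeration(freqs))
```

The two functions should agree up to floating-point rounding, since the closed form is an algebraic rearrangement of the genotype sum.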
  • 9. Rare allele frequency estimation. Commonly, from a reference sample: $p_i = n_i/N$. When, however, a stain and a suspect are independent homozygotes $A_iA_i$, then $p_i = (n_i + 4)/(N + 4)$. If they are independent heterozygotes $A_iA_j$, then $p_i = (n_i + 2)/(N + 2)$ and $p_j = (n_j + 2)/(N + 2)$. Example: let the size of the reference sample be $N = 1000$ and the count of a rare allele $A_i$ be $n_i = 1$, so $p_i = n_i/N = 0.001$ and $p_{ii} = p_i^2 = 0.000001$. If the stain and the suspect are homozygotes $A_iA_i$, then $p_i = (1 + 4)/(1000 + 4) \approx 0.005$ and $p_{ii} \approx 0.000025$.
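The worked example on this slide can be reproduced with a few lines. This is a sketch only. The helper name `adjusted_allele_frequency` is hypothetical, and its `added` argument is the number of allele copies contributed by the stain and the suspect (4 for two homozygotes, 2 per allele for two heterozygotes), as on the slide.

```python
def adjusted_allele_frequency(n_i: int, n_total: int, added: int) -> float:
    """p_i = (n_i + added) / (n_total + added).

    added = 4 when stain and suspect are both homozygous A_i A_i,
    added = 2 for each allele when both are heterozygous A_i A_j.
    """
    if n_total <= 0 or n_i < 0 or n_i > n_total or added < 0:
        raise ValueError("invalid counts")
    return (n_i + added) / (n_total + added)


if __name__ == "__main__":
    n_i, n_total = 1, 1000               # values from the slide's example
    p_plain = n_i / n_total              # unadjusted estimate
    p_hom = adjusted_allele_frequency(n_i, n_total, added=4)
    print(p_plain, p_plain ** 2)         # unadjusted p_i and p_ii
    print(p_hom, p_hom ** 2)             # adjusted p_i and p_ii
```

In this example the adjusted homozygote frequency is about 25 times the unadjusted one, which is why the correction matters for rare alleles.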
  • 10. Forensic genetics software
      Allelix: http://www.allelix.net/
      BDgen: dbgen@yahoo.com.ar
      DNAdacto, mDNAbase: gavriley@krinc.ru
      DNAmix, EasyDNA (EasyPA, EasyPAnt, EasyIN, EasyMISS/EasyKIN): http://www.hku.hk/statistics/staff/wingfung/countdown/dnamix.html
      DNAmix2: ftp://statgen.ncsu.edu/pub/storey/DNAMIXv2/dos/dnamix2.exe
      DNA-view: http://dna-view.com/
      EasyPat, Patern: http://www.uni-kiel.de/medinfo/mitarbeiter/krawczak/download/index.html
      Familias: http://www.math.chalmers.se/~mostad/familias/familias.zip
      FCalc: bolon@caltech.edu, www.its.caltech.edu/~bolon
      GRAPE: serge@star.net.
  • 11. Identity, NewPat5, DadShare: http://www.zoo.cam.ac.uk/zoostaff/amos/
      PARENTE: http://www2.ujf-grenoble.fr/leca/download/PARENTE/PARENTE.zip
      PATER2: spena@dcc.ufmg.br
      PATRI: http://www.bscb.cornell.edu/Homepages/Rasmus_Nielsen/mdiv/mdiv.exe
      PedCheck: http://watson.hgen.pitt.edu/register/docs/pedcheck.html
      PowerMarker: http://152.14.14.57/
      PowerStats: http://www.promega.com/geneticidtools/
      ProbMax: http://www.uoguelph.ca/~rdanzman/software/
      Relative: hhg2@columbia.edu
      SPUR: nina.fukshansky@gmx.de
      STRLab: http://strlab.co.za/
  • 12. Population genetics software
      Arlequin: http://anthro/unige.ch/arlequin
      CERVUS: http://helios.bto.ed.ac.uk/evolgen
      Con~Struct: andy.overall@ed.ac.uk
      FSTAT: http://www.unil.ch/izea/softwares/fstat.html
      FSTMET, HWMET: http://www.reading.ac.uk/~snsbalng/
      GDA: http://lewis.eeb.uconn.edu/lewishome/software.html
      GEN: lazzeroni@stanford.edu
      GENEPOP: ftp://ftp.cefe.cnrs-mop.fr/genepop
      GENEPOP on Web: http://wbiomed.curtin.edu.au/genepop/index.html
      GENETIX: http://www.univ-montp2.fr/~genetix/genetix.htm
      GeneKonv: http://www.rrz.uni-hamburg.de/OekoGenetik/software.htm
      HWE: http://www.biology.ualberta.ca/old_site/jbrzusto/hwenj.html
      PopGen32: Http://www.ualberta.ca/~fyeh
      Population: http://www.cnrs-gif.fr/pge/bioinfo/populations/index.php?lang=fr
      PowerMarker: http://www.powermarker.net
      PowerStats: http://www.promega.com/geneticidtools/
      TFPGA: http://bioweb.usu.edu/mpmbio/tfpga.htm
  • 13. Software online
      Allelix: http://www.allelix.net/
      GENEPOP on Web: http://wbiomed.curtin.edu.au/genepop/index.html
      ProfilerPlus Random Match Probability Calculator: http://www.csfs.ca/pplus/profiler.htm
  • 14. Different tests (even exact) can lead to different conclusions. Locus: vWA, Russian Caucasians, Hardy-Weinberg equilibrium test
      Test                                       P-value or CL     Software
      χ2                                         0.106             ChiHW, GDA, PowerMarker, etc.
      Corrected χ2                               0.092             GEN
      Fisher’s probability, Guo-Thompson alg.    0.026             Arlequin, GDA, GENEPOP, HWE, TFPGA, etc.
      G2 asympt.                                 0.163             POPGENE
      Fis                                        0.141             FSTAT, GENETIX
      Fis, 95% cred. lim.                        -0.044, -0.015    HWMET
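To make the contrast between an asymptotic and an exact Hardy-Weinberg test concrete, the sketch below implements both for the simplest case of a biallelic locus. This is a simplifying assumption: vWA is multiallelic, and the programs in the table use more general algorithms such as Guo-Thompson. The genotype counts in the usage lines are hypothetical and are not the slide's data.

```python
import math


def hwe_chi_square_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Asymptotic chi-square P-value (1 degree of freedom) for a biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    stat = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))


def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Exact P-value: total probability of heterozygote counts that are
    no more likely than the observed one, conditional on the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n_bb + n_ab

    def log_prob(h: int) -> float:
        aa, bb = (n_a - h) // 2, (n_b - h) // 2
        return (math.lgamma(n + 1) - math.lgamma(aa + 1) - math.lgamma(h + 1)
                - math.lgamma(bb + 1) + h * math.log(2.0)
                + math.lgamma(n_a + 1) + math.lgamma(n_b + 1)
                - math.lgamma(2 * n + 1))

    # Heterozygote counts must have the same parity as the allele counts
    probs = {h: math.exp(log_prob(h)) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    observed = probs[n_ab]
    return sum(pr for pr in probs.values() if pr <= observed * (1.0 + 1e-9))


if __name__ == "__main__":
    counts = (12, 20, 28)  # hypothetical genotype counts AA, AB, BB
    print("chi-square P:", hwe_chi_square_p(*counts))
    print("exact P:     ", hwe_exact_p(*counts))
```

The two P-values generally differ, most visibly for small samples or rare alleles, which is the situation the table on this slide illustrates with real software.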
  • 15. Algorithms. Modern software implements exact nonparametric approaches and modern Bayesian ideas and methodology. Implementing them requires sophisticated computational algorithms and facilities. This raises new problems, e.g. the convergence of procedures based on Markov chain Monte Carlo (MCMC) algorithms.
  • 16. "familiarize yourself with the method, including convergence diagnostics" (K. L. Ayres, D. J. Balding). P-values produced by an MCMC procedure depend on the number of randomization steps:
      $10^4$ steps: P = 0.7815 ± 0.0008
      $10^5$ steps: P = 0.2681 ± 0.0005
      $10^6$ steps: P = 0.373 ± 0.012
      $10^7$ steps: P = 0.424 ± 0.006
      $10^8$ steps: P = 0.460 ± 0.003
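One simple way to see the convergence problem in these numbers is to compare successive estimates against their reported standard errors. The sketch below uses the values quoted on this slide, reading the first entry as 10^4 steps. The z-score comparison and the threshold of 3 are illustrative assumptions, not a diagnostic recommended by the authors or by Ayres and Balding.

```python
import math

# (number of steps, estimated P-value, reported standard error), from the slide
RUNS = [
    (10 ** 4, 0.7815, 0.0008),
    (10 ** 5, 0.2681, 0.0005),
    (10 ** 6, 0.373, 0.012),
    (10 ** 7, 0.424, 0.006),
    (10 ** 8, 0.460, 0.003),
]


def successive_z_scores(runs):
    """For each pair of consecutive runs, return (steps_1, steps_2, z), where
    z = |P2 - P1| / sqrt(se1^2 + se2^2)."""
    scores = []
    for (n1, p1, se1), (n2, p2, se2) in zip(runs, runs[1:]):
        z = abs(p2 - p1) / math.hypot(se1, se2)
        scores.append((n1, n2, z))
    return scores


if __name__ == "__main__":
    threshold = 3.0  # illustrative cut-off, an assumption of this sketch
    for n1, n2, z in successive_z_scores(RUNS):
        flag = "inconsistent" if z > threshold else "consistent"
        print(f"{n1:>9} vs {n2:>9} steps: z = {z:.1f} ({flag})")
```

If the chain had converged and the reported errors were realistic, consecutive estimates would differ by only a few standard errors. Here every consecutive pair differs by more than that, so the reported ± values understate the true uncertainty.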
  • 17. Conclusion. First principle of GSP: it should be good statistics practice to analyze the data with different statistical methods and investigate their consistency.
  • 18. Acknowledgements. We thank Drs. Karen L. Ayres and David J. Balding, Laura C. Lazzeroni and Kenneth Lange, and John Brzustowski for kindly supplying the executables of their programs (HWMET, GEN and HWE, respectively). Many thanks to them and to Drs. Angel Carracedo, Laurent Excoffier, Jerome Goudet, Kejun Liu, Tristan Marshall, Mark P. Miller, Eleanor Morgan, Michel Raymond, Francois Rousset, Hans-Georg Scheil, Bruce S. Weir and Dmitri Zaykin, the authors of other programs and papers used in this study, for helpful and fruitful discussion.
  • 19. Sincere thanks to Drs. Carsten Hohoff, Edwin Ehrlich and Kurt Trübner for the invitation, help and financial support.