FORENSIC EPIGENETICS FOR BODILY FLUID TYPING, SUSPECT AGE, AND PHENOTYPING
1. Forensic epigenetics and its applications in trace body
fluid identification and phenotyping.
Bruce McCord
Department of Chemistry and Biochemistry
Florida International University
Miami, FL
mccordb@fiu.edu
https://faculty.fiu.edu/~mccordb/Bruce.html
2. Scenario (Body fluid analysis)
In the early 1990s, a woman is found murdered.
Trace DNA is found under her fingernails.
Ex-husband (custody dispute) could not be excluded
from intimate mixture (presumptive for blood).
Suspect argues it is not blood: the DNA match is from
secondary skin cells transferred from mother to
child and then to father.
Critical to the determination is the question: is this
blood? Is it human or animal? A presumptive test
indicating blood might also be positive for rust…
3. The Problem:
Current procedures for body fluid
identification date from the 1940s.
Chemical and enzymatic tests are
not specific and lack sensitivity
when compared to PCR.
Forensic DNA typing detects subnanogram
levels of DNA, but cannot tell you
the source of the material.
Many labs no longer do sperm
searches, yet Y-chromosome qPCR-based
methods cannot distinguish
between touch DNA and semen.
4. New methods for body fluid analysis
exploit the process of transcription to
develop selective new markers
Proteins perform the functions
of the cell
RNA templates are
translated to form the
scaffolding for proteins.
DNA templates are
transcribed to produce
RNA
Genome
Transcriptome
Proteome
Epigenome
5. Issues with the above methods
Traditional markers – low sensitivity, degradation,
hook effects, environmental contamination, specificity
Proteomics – human specificity, relating protein
sample to DNA extract, cost of equipment, technology
RNA – nonspecific quantification methods (RNA+DNA),
variations in gene expression, potential for
degradation by RNase
We are exploring Epigenetic DNA methylation
6. Epigenetics
It is obvious to anyone that the human body has
many different kinds of cells: skin, hair, teeth,
blood, etc.
Yet our DNA is all the same. How then does our
body differentiate cells? Why do twins have
different fingerprints?
DNA methylation patterns in
young and older twins.
Even identical twins begin to
appear different with age.
7. Epigenetics
The answer is that there are heritable
differences in our DNA that are not related to
base pairing.
Instead these differences are controlled by
patterns of cytosine methylation and by
post-translational modifications of histones.
Epigenetics is the study of heritable changes in
gene expression unrelated to DNA base pairing.
While DNA methylation affects gene regulation,
it can itself be affected by environmental factors and
DNA repair mechanisms.
8. Methylation
Methyl residues are covalently bound to the carbon-5
position of the cytosine pyrimidine ring by DNA
methyltransferases (Dnmt), forming 5-methylcytosines
Observed at CpG dinucleotides (70% of CpGs are
methylated in vertebrates, but distinct patterns are seen)
“CpG islands” – areas of high CpG density usually
mapping to promoter regions
Methylation → gene silencing
http://www.hgu.mrc.ac.uk/people/r.meehan_researchb.html
9. Global differences exist between
methylation levels of different tissues
Note differences occurring between sperm, keratinocytes (skin cells),
and lymphocytes (white blood cells) (Eckhardt et al. 2006).
10. But PCR erases methylation differences,
so use bisulfite-modified PCR to lock them in place
11. Laboratory based pyrosequencing
The availability of small laboratory based sequencers
permits the rapid and specific determination of DNA
methylation for a PCR product.
The system utilizes next generation techniques and
automation but is small and flexible with high read
depth for single products.
Perfect for precise measure of % methylation
Pyromark 48 (Qiagen)
12. Workflow for forensic DNA methylation analysis
using pyrosequencing
Exploring markers & primer design: Identify potential CpG loci based on
large-scale epigenetic arrays; design and test primers by microfluidic CE
DNA extraction & quantification: Perform standard methods using robotic
extraction (same process as STR typing) and qPCR
Bisulfite conversion: A portion of the sample is removed and treated for
bisulfite conversion; unmethylated C’s become U’s
PCR: Standard PCR amplification takes place; the U’s from unmethylated
C’s are amplified as T’s
Assay: Pyrosequencing, SNaPshot (CE), real-time PCR, MPS
Data analysis: The % methylation of each CpG site is scored and compared
to knowns; body fluid is identified based on PCA
Kuppareddi Balamurugan
Tania Madi
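The conversion and scoring steps of this workflow can be illustrated with a minimal Python sketch. It is not the laboratory software: the template sequence, the methylated position, and the function names are hypothetical, and the percent methylation is computed simply as the C signal divided by the C plus T signal at a CpG, which is the usual way a bisulfite readout is summarized.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Return the sequence read after bisulfite treatment and PCR.

    Unmethylated C is deaminated to U and amplified as T.
    Methylated C (5-methylcytosine) is protected and remains C.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def percent_methylation(c_signal, t_signal):
    """Percent methylation at one CpG from its C and T signals."""
    total = c_signal + t_signal
    if total == 0:
        raise ValueError("no signal at this CpG site")
    return 100.0 * c_signal / total


# Hypothetical template: the C at index 1 is methylated, the others are not.
template = "ACGTTCGACCGA"
print(bisulfite_convert(template, {1}))    # methylated CpG keeps its C
print(bisulfite_convert(template, set()))  # fully unmethylated template
print(percent_methylation(95, 5))          # a hypermethylated site
```

After conversion, a methylation difference has become a C/T sequence difference, which is why an ordinary sequencing assay can read it.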
13. Ideal body fluid identification markers will exhibit a large
difference in percent methylation in one body fluid when
compared to other body fluids.
Madi et al., Electrophoresis (2012)
Body fluid specific loci
Tania Madi
14. In ZC3H12D Blood is hypermethylated while
Semen is hypomethylated
95%C
2%C
Note that at this locus
we identify 4 additional
CpG loci in addition to
the chip defined CpG.
NC Wyeth
Buried treasure
15. Madi et al., Electrophoresis, 2012, 33, 1736–1745
Antunes et al., Electrophoresis, 2016, 37, 2751–2758
Markers for blood, saliva, sperm and vaginal epithelia
were developed requiring separate amplifications
Joana Antunes
PFN3A Vaginal
epithelia
ZC3H12D Sperm
Tania Madi
Kuppareddi
Balamurugan
16. Validation
Human Specificity Sensitivity
Age Mixtures
George
Duncan
Deborah
Silva
Silva, Antunes, Balamurugan, Duncan, Alho, McCord.
Developmental validation studies of
epigenetic DNA methylation markers ...
Forensic Sci Int Genet. 2016 Jul;23:55-63.
18. Sensitivity Study
Results show sensitivity and reproducibility
from 20 ng to 100 pg
F-tests compared the 20 ng input to each
smaller input. Differences in variance are
statistically significant below 1 ng
A single CpG is not partially methylated!
At 100 pg of DNA (~15 cells) the
reproducibility suffers due to the small number of targets
10 methylated cells and 5 unmethylated cells
≈ 67% methylation
11 methylated cells and 4 unmethylated cells
≈ 73% methylation
[Figure: mean % methylation (0-100) at CpG1-CpG4, sample VE_8]
PCR bias and % bisulfite conversion also
affect reproducibility at the low end
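The arithmetic behind this slide can be sketched in a few lines of Python. The sketch assumes about 6.6 pg of DNA per diploid human cell (a commonly quoted approximation, not a value from the slides) and follows the slide in counting one target per cell. The simulation models only random sampling of cells; it ignores PCR bias and incomplete bisulfite conversion, and the function names are hypothetical.

```python
import random
import statistics

PG_PER_DIPLOID_CELL = 6.6  # assumed approximate DNA mass per human cell, in pg


def cells_from_mass(picograms):
    """Approximate number of cells represented by a DNA input."""
    return round(picograms / PG_PER_DIPLOID_CELL)


def percent_methylation(methylated_cells, total_cells):
    """Observed percent methylation when each cell is either methylated or not."""
    return 100.0 * methylated_cells / total_cells


def simulate_replicates(true_fraction, n_cells, n_replicates, seed=0):
    """Draw n_cells at random per replicate and report percent methylation."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_replicates):
        methylated = sum(rng.random() < true_fraction for _ in range(n_cells))
        results.append(percent_methylation(methylated, n_cells))
    return results


def replicate_spread(true_fraction, picograms, n_replicates=200):
    """Standard deviation of percent methylation across simulated replicates."""
    n_cells = cells_from_mass(picograms)
    return statistics.pstdev(simulate_replicates(true_fraction, n_cells, n_replicates))


print(cells_from_mass(100))                     # cells in a 100 pg input
print(round(percent_methylation(10, 15), 1))    # 10 of 15 cells methylated
print(round(percent_methylation(11, 15), 1))    # 11 of 15 cells methylated
print(replicate_spread(0.70, 100), replicate_spread(0.70, 20000))
```

With only about 15 cells, one cell more or less shifts the result by several percentage points, so replicate-to-replicate spread is far larger at 100 pg than at 20 ng.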
21. Large primate samples amplify
but not all show
pyrosequencing results
[Figure: gel lanes (ladder, chimpanzee, orangutan, gorilla) and pyrograms for BCAS4 cg0679435 and ZC3H12D; a pyrogram error message reads "ERROR: Surrounding reference sequence was not recognized"]
George Duncan
24. Our Current Multiplex for Body Fluid ID
simultaneous amplification of 4 loci to detect body fluid type
Sohee Cho
Quentin Gauthier
Gauthier, Cho, Carmel, McCord, Electrophoresis, 2019, 40, 2565
25. Cluster Analysis of 5 CpGs
The testing set resulted in each and every sample
being placed in the correct body fluid cluster
27. Previous work on the detection of oral fluids – Saliva vs Buccal…
Marker name | Collection method (saliva) | Selection method | Selection: total # of saliva samples | Validation method | Validation: total # of saliva samples | Mean percent methylation | References
BCAS4 | Buccal swab | Methylation WGA | n = 10 (saliva), n = 42 (total) | Pyroseq | n = 38 | 0.68 | Madi et al. 2012; Gauthier et al., 2019
SOX2OT, SLC12A8 | 10 mL spit | Human Methylation 450K bead array | n = 4 (saliva), n = 12 (total) | Pyroseq | n = 20 | 0.52, 0.53 | Park et al., 2014
FAM43A | 200 µL spit | Human Methylation 450K bead array | n = 12 (saliva), n = 70 (total) | SNaPshot | n = 52 | 0.49 | Lee et al. 2015
28. Microscopy demonstrates that the types of
cells and their proportions are different in
buccal and saliva samples
[Figure panels: percent epithelial cells; percent segmented cells, lymphocytes and other mononuclear cells]
Scientific Reports volume 8, Article number: 6944 (2018)
30. Effects of collection site for oral
fluids by locus
[Figure: mean percent methylation (0-100) observed in four markers (BCAS4, SOX2OT, SLC12A8, FAM43A) on a variety of body fluids: buccal (n=10), lip (n=10), tongue (n=10), spit (n=10), chewing gum (n=10), nasal (n=10), semen (n=5), blood (n=5), vaginal epithelia (n=5), menstrual blood (n=5)]
Primers were designed based on sequence data from the following references
BCAS4 Chr20:49410865 4 CpGs Madi et al. 2012
SOX2OT Chr3:181421427 2 CpGs Park et al. 2014
SLC12A8 Chr3:124860686 6 CpGs Park et al. 2014
FAM43A Chr3:194408845 9 CpGs Lee et al. 2015
32. Buccal/Lip vs Spit/Gum – BCAS4
FAM43A: oral fluids above 10%, non-oral fluids below 10%
BCAS4: buccal & lip above 50%, spit & gum below 50%
Once a sample has been determined to be
an oral fluid, BCAS4 can distinguish buccal
and lip from spit and gum
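The two-step rule on this slide can be written as a small decision function. This is a sketch of one reading of the flattened diagram: the 10% and 50% cutoffs come from the slide, but which side of each cutoff maps to which class, the handling of values exactly at a cutoff, and the function and constant names are assumptions.

```python
FAM43A_ORAL_CUTOFF = 10.0    # percent methylation, from the slide
BCAS4_BUCCAL_CUTOFF = 50.0   # percent methylation, from the slide


def classify_oral_sample(fam43a_pct, bcas4_pct):
    """Classify a sample from percent methylation at FAM43A and BCAS4.

    Step 1: FAM43A separates oral fluids (above 10%) from non-oral fluids.
    Step 2: for oral fluids only, BCAS4 separates buccal/lip (above 50%)
            from spit/gum.
    Values exactly at a cutoff are treated as "below" (an assumption).
    """
    if fam43a_pct <= FAM43A_ORAL_CUTOFF:
        return "non-oral fluid"
    if bcas4_pct > BCAS4_BUCCAL_CUTOFF:
        return "buccal or lip"
    return "spit or gum"


# Hypothetical measurements, for illustration only.
print(classify_oral_sample(fam43a_pct=35.0, bcas4_pct=68.0))
print(classify_oral_sample(fam43a_pct=35.0, bcas4_pct=30.0))
print(classify_oral_sample(fam43a_pct=4.0, bcas4_pct=68.0))
```

BCAS4 is consulted only after FAM43A has called the sample an oral fluid, mirroring the order stated on the slide.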
33. Nasal Secretions
FAM43A: separates oral fluids from non-oral fluids (below 10%)
BCAS4: buccal & lip vs spit & gum
SOX2OT: nasal above 27%
Once oral fluid is determined
by SOX2OT, nasal can be
distinguished by FAM43A,
SOX2OT and SLC12A8
34. An oral fluid
classification scheme
[Figure A: mean % methylation at FAM43A_CpG 5, oral fluid vs non-oral fluids]
[Figure B: mean % methylation at BCAS4_CpG1, buccal & lip vs spit, tongue and gum]
[Figure C: mean % methylation at SOX2OT_CpG 1 and FAM43A_CpG 5, oral fluid vs nasal vs other body fluids]
35. Epigenetic Phenotyping – What else can we do
with this?
• Tissue type
• Age
• Which Twin?
• Is this person a smoker?
• Does he/she drink?
• Exposure to illegal drugs?
• Body Mass Index?
• Diet?
• Physiology?
• Biogeographics
• Stress?
We see a fascinating future for
epigenetics and forensic science,
particularly as an investigative tool.
Vidaki, Kayser, Genome Biol, 2017 Dec 21;18(1):238
36. The importance of age
determination in forensics
DNA based facial reconstruction
must be artificially aged
http://www.nytimes.com/2015/02/24/science/building-face-and-a-case-on-dna.html
Melanie McCord
37. Determination of Age:
• Certain CpG loci gradually change methylation status with age,
likely the result of stochastic DNA damage/repair
• High CpG density promoters, and in particular those mapping to
developmental genes, seem to increase in methylation with age
• CpGs located outside these regions tend to lose methylation with
age
• Age targets are probably not involved in gene expression
during lifespan but are likely a stochastic effect of repair
38. So how to determine age with
epigenetics and pyrosequencing?
We examined several genetic loci, including
GRIA2, NPTX2, KLF14, and SCGN, previously identified in
whole-methylome studies, and designed primers
to explore these regions.
We analyzed saliva and blood samples from volunteers
with ages ranging from 5 to 72 years
Deborah Silva
D. Silva, J. Antunes, K. Balamurugan, G. Duncan, C.
S. Alho, B. McCord, Electrophoresis, 2015, 36,
1775-1780.
Alghanim H, Antunes J, Silva DSBS, Alho CS,
Balamurugan K, McCord B. Forensic Sci Int Genet.
2017 Nov;31:81-88
39. KLF14-Saliva
[Figure: percent methylation (0-40) at CpG1-CpG7 by age group: 5-17, 18-29, 30-39, 40-49, 50-59, 60-69, 70-73]
3 CpGs found which are strongly influenced by age in saliva
Methylation chip arrays are used to find likely sites
for biological age
Then PCR primers are designed to target these sites
40. Top age/saliva Specific Loci.
Age for these loci can be determined in one or two amplifications
Alghanim H, Antunes J, Silva DSBS, Alho CS, Balamurugan K, McCord B. Detection and evaluation of DNA methylation markers found at SCGN and KLF14 loci to estimate human age. FSI Genet. 2017 Nov;31:81-88
Zapico, S.; Gauthier, Q.; Antevska, A.; McCord, B.R. Identifying Methylation Patterns in Dental Pulp Aging: Application to Age-at-Death Estimation in Forensic Anthropology. Int. J. Mol. Sci. 2021, 22, 3717.
KLF14: single amplification! KLF14+SCGN
41. Age Prediction Model for Saliva
• Saliva samples (n = 91, ages 5 to 73 years) were divided into a training set
(n = 52) and a validation set (n = 39).
• Multivariate linear regression analysis gave a median average deviation
of 7.1 years.
42. Example individual, actual age = 51
1- Based on the single-locus model (KLF14): Estimated age (in years) =
-24.884 + (1.703 × CpG1 from KLF14) + (1.963 × CpG2 from KLF14)
= 56
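The single-locus model on this slide is a plain linear equation, so it is easy to sketch in Python. The intercept and coefficients are the ones shown on the slide; the assumption that the CpG inputs are percent methylation values (0-100) and the function names are mine, and the slide does not give the individual's CpG values, so the inputs below are hypothetical.

```python
INTERCEPT = -24.884   # from the slide
COEF_CPG1 = 1.703     # coefficient for KLF14 CpG1, from the slide
COEF_CPG2 = 1.963     # coefficient for KLF14 CpG2, from the slide


def estimate_age(cpg1_pct, cpg2_pct):
    """Estimated age in years from KLF14 CpG1 and CpG2 methylation.

    Inputs are assumed to be percent methylation (0-100).
    """
    return INTERCEPT + COEF_CPG1 * cpg1_pct + COEF_CPG2 * cpg2_pct


def absolute_error(predicted_age, actual_age):
    """Absolute difference between predicted and chronological age."""
    return abs(predicted_age - actual_age)


# Hypothetical methylation values, for illustration only.
print(estimate_age(cpg1_pct=10.0, cpg2_pct=10.0))

# The slide's example: predicted 56 years for an individual aged 51.
print(absolute_error(56, 51))
```

For the example individual, the prediction of 56 against an actual age of 51 is an error of 5 years, inside the 7.1-year deviation reported for the model.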
43. Predicted (epigenetic) vs Chronological Age (KLF14)
Chronological age versus predicted age of the entire data set of the 91 saliva
samples using the single-locus prediction model (CpG1 and CpG2 from KLF14)
[Inset figure: FBI arrests per 100,000]
Most crime occurs among individuals under 40
and model accuracy is better for lower ages
44. Anthropological Application:
Dental Pulp
Sara C Zapico, Quentin Gauthier, Aleksandra Antevska, Bruce R McCord, Identifying Methylation Patterns in Dental
Pulp Aging: Application to Age-at-Death Estimation in Forensic Anthropology, Int J Mol Sci, 2021 Apr 2;22(7):3717
Mean average error
45. DNA methylation markers for tobacco
smoking
Much has been made of the potential of massively parallel sequencing
for determination of eye, hair, and facial features based on SNPs.
1- But far away, in the dark, what can you tell about this man?
2- Is this person blond or dark haired? Are his eyes blue? Is he tall or
short? What country is he from? Very hard to tell…
But we know he is holding a cigarette and is a smoker….
46. DNA methylation markers for tobacco
smoking
Goals:
1- Identify CpG sites that show
association with tobacco
smoking
2- Develop a model to predict
the smoking status
47. Discovering Novel CpG sites
Rank | Blood: locus | Blood: chromosome location (GRCh37) / (Illumina ID) | Saliva: locus | Saliva: chromosome location (GRCh37) / (Illumina ID)
1 | AHRR | Chr5:373,490 | AHRR | Chr5:373,476
2 | 2q37 | Chr2:233,284,675 | AHRR | Chr5:373,494
3 | AHRR | Chr5:373,423 | AHRR | Chr5:373,423
4 | AHRR | Chr5:373,476 | AHRR | Chr5:373,490
5 | AHRR | Chr5:373,378 (cg05575921) | AHRR | Chr5:373,398
6 | AHRR | Chr5:373,494 | AHRR | Chr5:395,488
7 | AHRR | Chr5:373,315 | AHRR | Chr5:374,018
8 | AHRR | Chr5:373,299 (cg23576855) | AHRR | Chr5:373,250
9 | AHRR | Chr5:373,651 | AHRR | Chr5:373,147
10 | AHRR | Chr5:373,398 | AHRR | Chr5:373,989
11 | AHRR | Chr5:373,653 | |
12 | AHRR | Chr5:373,555 | |
13 | GFI1 | Chr1:92,947,588 (cg09935388) | |
14 | 2q37 | Chr2:233,284,662 (cg21566642) | |
15 | F2RL3 | Chr19:17,000,553 | |
Six different genes (AHRR, 2q37, 6p21.33, GFI1, F2RL3, and MYO1G ).
No. of sites: 88 CpG sites (known sites) (chosen sites)
Type of body fluid: blood and saliva
48. Best Identified Sites
• The best CpGs identified were found around cg05575921 in the AHRR gene
• AHRR encodes a protein that has a role in the aryl hydrocarbon receptor pathway,
which responds to xenobiotics like polyaromatic hydrocarbons and dioxins
AHRR gene: Aryl Hydrocarbon Receptor Repressor
The AHRR gene mediates the metabolism of
xenobiotic particles like toxic cigarette smoke
components
50. Multinomial logistic regression (MLR)
analysis using Leave One Out approach
for the 4-CpG assay
Accuracy of prediction, combined MLR model (4 CpGs):
Body fluid | Current smoker | Former smoker | Never smoker | Total
Blood | 90.0% | 66.7% | 84.9% | 82.7%
Saliva | 86.9% | 54.5% | 77.8% | 71.4%
Singleplex amplification of four CpGs
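The leave-one-out procedure behind these accuracies can be sketched generically in Python. To stay dependency-free, the sketch swaps in a nearest-centroid classifier as a stand-in; it is not multinomial logistic regression and not the model used in this work. The nine samples are synthetic, made up only to exercise the code, and all function names are hypothetical.

```python
from collections import defaultdict


def fit_centroids(features, labels):
    """Stand-in classifier (NOT MLR): mean feature vector per class."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + x for s, x in zip(sums[label], row)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict_centroid(model, row):
    """Assign the class whose centroid is closest to the sample."""
    def squared_distance(centroid):
        return sum((x - c) ** 2 for x, c in zip(row, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def leave_one_out(features, labels, fit, predict):
    """Hold out each sample once, train on the rest, tally accuracy."""
    correct = defaultdict(int)
    seen = defaultdict(int)
    for i in range(len(features)):
        model = fit(features[:i] + features[i + 1:], labels[:i] + labels[i + 1:])
        seen[labels[i]] += 1
        if predict(model, features[i]) == labels[i]:
            correct[labels[i]] += 1
    per_class = {label: correct[label] / seen[label] for label in seen}
    total = sum(correct.values()) / len(features)
    return per_class, total


# Synthetic percent methylation at four CpGs (illustration only).
features = [
    [40, 45, 42, 50], [38, 44, 40, 48], [42, 47, 45, 52],  # current
    [60, 62, 61, 65], [58, 63, 60, 66], [62, 64, 63, 67],  # former
    [80, 82, 81, 85], [78, 83, 80, 86], [82, 84, 83, 87],  # never
]
labels = ["current"] * 3 + ["former"] * 3 + ["never"] * 3

per_class, total = leave_one_out(features, labels, fit_centroids, predict_centroid)
print(per_class, total)
```

Because every sample is predicted by a model that never saw it, the per-class and total figures play the same role as the columns of the accuracy table, and any classifier with a fit and predict function can be dropped in.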
51. Applications -
Epigenetics
A. Molecular Sperm search:
As laboratories begin to use Y screening with qPCR, instead of microscopy, a
sperm specific marker can be valuable as it distinguishes semen from general
male DNA – Simply test your DNA sample for sperm by epigenetics.
B. Child abuse:
The DNA recovered from victim or suspect can be identified as blood, semen,
vaginal or saliva vs (nasal / skin touch)
C. Unknown suspect - phenotyping:
Age, smoking status and other information (diet, weight, alcohol) may be
determined to aid identity
- KEY FACTORS FOR IMPROVING GENETIC GENEALOGY TARGETING…
52. Conclusions
Epigenetic markers are a fascinating new way to develop
forensic body fluid identification and phenotyping
An Epigenetic Body Fluid Identification Multiplex via
Pyrosequencing has been created. Methods using CE and
qPCR have also been developed.
We have developed a specific marker for detecting
smoking status of unknown suspects.
Stochastic methylation markers have been developed to
determine age and may be useful in genealogical studies
and suspect appearance.
Key to all these methods is primer design, bioinformatics
and statistical analysis.
53. Acknowledgements
Major support for this work was provided by:
The National Institute of Justice, Qiagen and the
Dubai Police
Points of view are those of the authors and do not necessarily represent
the official view of the U.S. Department of Justice
George Duncan
Tania Madi
Kuppareddi Balamurugan
Deborah Silva
Clarice Alho
Joana Antunes
Sohee Cho
Quentin Gauthier
Hussain Alghanim
Wensong Wu
Justin Carmel
Sara Zapico
Mirna Ghemrawi
Nicole Fernandez Tejero
Mark Guilliano
Keith Elliott
McCord Research Group
Florida International University (USA)
University of Southern Mississippi (USA)
General Headquarters, Dubai Police Dubai (UAE)
Catholic University of Rio Grande do Sul (Brazil)
Broward Sheriff’s Office Ft Lauderdale, FL (USA)
San Francisco Police Department (USA)
Institute of Forensic Science, Seoul National University
Streck, Agilent, DNA Polymerase Technology