Developmental Validation of a Method for Quantitative High-Throughput Forensic Microsatellite (STR) Sequencing
Melissa Scheible, Sarah Bailey, Deborah Silva, Marina Hoggan and Seth A. Faith
NC State University, Forensic Sciences Institute

Introduction
Forensic science is poised to adopt new methods in DNA analysis utilizing next-generation sequencing (NGS) to obtain finer resolution and higher bandwidth in genetic analysis. To date, NGS workflows for forensic short tandem repeat (STR) sequencing do not afford a strict quantitative analysis (e.g., input ≅ output), which would be beneficial for mixture and low copy number analysis. Forensic samples in NGS workflows are routinely normalized for library input quantities and molar library concentrations prior to sequencing. Here we present developmental validation of a quantitative approach to STR sequencing with NGS.

Sensitivity
Input gDNA quantities (15-500 pg) yield consistent results with low variability at NGS checkpoints. The lower right panel “Data Analysis” demonstrates the range of sensitivity for the entire workflow.

Method

PCR Amplification
• PowerSeq™ Auto/Y/SE33 (Promega Corp.)
• 30 cycles
• NIST SRM 2391c Components A-D and Control DNA 2800M
• 15 pg to 500 pg gDNA input range

Post-PCR cleanup
• 1.8x AMPure XP cleanup
• Automated on Eppendorf epMotion 5075tc liquid handling workstation
• Aliquot collected for Qubit measurement

Library construction (Automation)
• KAPA Hyper Prep Kit – PCR Free (KAPA Biosystems)
• Automated on epMotion 5075tc (Eppendorf)
• Total purified amp product used as input
• Dual-indexed adapters (Integrated DNA Technologies)
• Aliquot collected for qPCR (KAPA Library Quantification Kit)

Sequencing
• 48 samples pooled (equal volumes, non-normalized)
• Illumina MiSeq v2 300 cycle sequencing kit
• 15% PhiX spike

Data analysis
• Custom Python tool (Altius)
• Implemented in Amazon Web Services (AWS)
• Reports core repeat size and sequence
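
As a rough illustration of why the equal-volume, non-normalized pooling step in the method matters for a quantitative readout: when every library contributes the same volume, its molar share of the pool is proportional to its concentration, whereas normalizing the libraries flattens those differences. The sketch below uses hypothetical concentrations and a hypothetical helper name (`pool_fractions`); it is not part of the workflow described here.

```python
def pool_fractions(conc_nM, volumes_uL):
    """Molar share of each library in a pool: (conc * volume) / total amount."""
    amounts = [c * v for c, v in zip(conc_nM, volumes_uL)]
    total = sum(amounts)
    return [a / total for a in amounts]


# Hypothetical library concentrations (nM); not measured values.
conc = [8.0, 4.0, 2.0, 2.0]

# Equal-volume pooling: shares track the library concentrations.
equal_volume = pool_fractions(conc, [5.0] * len(conc))

# Normalized pooling: volume chosen inversely to concentration,
# so every library ends up with the same share.
normalized = pool_fractions(conc, [1.0 / c for c in conc])

print("equal volume:", equal_volume)
print("normalized:  ", normalized)
```

Under equal-volume pooling the relative library yields carry through to the sequencer, which is the property a quantitative (input ≅ output) analysis depends on.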

Precision
Linear regression analysis demonstrates high correlation of input to output over multiple steps of the workflow. The lower panel “Whole Process” shows the high correlation of input gDNA to the final output of the sequencer.
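
A minimal sketch of an input-versus-output regression of the kind described above, assuming ordinary least squares. Only the 15 pg and 500 pg endpoints come from the poster; the intermediate inputs, the read counts, and the function name `linear_fit` are hypothetical and chosen for illustration.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: gDNA input (pg) and total reads from the sequencer.
input_pg = [15, 31, 62, 125, 250, 500]
total_reads = [1200, 2600, 5100, 9800, 20500, 39900]

slope, intercept, r2 = linear_fit(input_pg, total_reads)
print(f"reads = {slope:.1f} * pg + {intercept:.1f}, R^2 = {r2:.4f}")
```

The same fit can be repeated at each checkpoint (for example Qubit, qPCR, and read counts) to see where, if anywhere, proportionality to input is lost.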

Conclusions
• Validation of NGS workflows will require demonstration of sensitivity, precision and accuracy
• Methods that utilize normalization(s) and multiple PCR reactions will limit the ability to conduct validations and use NGS for applications such as mixture interpretation and low copy number analysis
• The non-normalized, automated method here shows high sensitivity, precision, and accuracy across a range of input gDNA
• A high correlation of input DNA ≅ NGS output was observed

This work was supported by National Institute of Justice FY15 R&D in Forensic Science Award 2015-DN-BX-K062.
For more information: Seth A. Faith, safaith@ncsu.edu, www.genomicidlab.com

Mixtures
Verification of Mass Ratio in NIST SRM 2391c Component D (3:1 female:male). Total reads matching known alleles in STR loci having no overlap or stutter interference (D1S1656, D8S1179, D19S433, Penta E, TPOX) were quantified and reported as a ratio of major/minor.

Accuracy
File  Sample   Input (pg)  Expected alleles  1xSD DI  1xSD DO  2xSD DI  2xSD DO  3xSD DI  3xSD DO
49    2391c A  500         42                6        0        0        0        0        1
50    2391c A  500         42                3        0        0        0        0        2
51    2391c B  500         68                7        0        1        0        0        2
57    2391c A  250         42                0        0        0        1        0        5
58    2391c A  250         42                0        0        0        0        0        5
59    2391c B  250         68                1        0        0        1        0        18
60    2391c B  250         68                1        0        0        3        0        18

Per-locus threshold  1xSD   2xSD   3xSD
Percent error        4.84   1.08   13.71
Percent accuracy     95.16  98.92  86.29

Total ‘reads’ per locus/allele sequence were quantified by the Altius tool. One hundred percent of expected alleles were observed in 500 and 250 pg gDNA samples, but sequencer errors resulted in some noise. To assist interpretation, the standard deviation (SD) of ‘reads’ was calculated for each locus, and thresholds of 1x, 2x, and 3x SD were used to call alleles.
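
The read counts feed two calculations described on this poster: SD-based allele calling and the major/minor read ratio for mixtures. The sketch below shows one way these could look. It is not the Altius implementation and rests on several assumptions: a sequence is called when its read count is at least k times the sample SD of all read counts at that locus, DI and DO are taken to mean drop-in and drop-out, and the mixture ratio is computed from reads summed across loci. All function names and read counts are hypothetical.

```python
import statistics


def call_alleles(read_counts, k):
    """Return sequences at one locus with read count >= k * SD.

    Assumption: SD is the sample standard deviation of every read count
    observed at the locus; the poster does not give the exact formula.
    """
    sd = statistics.stdev(list(read_counts.values()))
    return {seq for seq, n in read_counts.items() if n >= k * sd}


def drop_in_drop_out(called, expected):
    """Count called-but-unexpected (DI) and expected-but-uncalled (DO) alleles."""
    return len(called - expected), len(expected - called)


def major_minor_ratio(major_reads, minor_reads):
    """Ratio of summed reads for major versus minor contributor alleles."""
    return sum(major_reads) / sum(minor_reads)


# Hypothetical read counts at one locus: two true alleles plus noise sequences.
locus_reads = {"12": 1000, "15": 950, "11": 60, "14": 30,
               "12v": 10, "15v": 8, "13": 5, "16": 3}
expected = {"12", "15"}

for k in (1, 2, 3):
    called = call_alleles(locus_reads, k)
    di, do = drop_in_drop_out(called, expected)
    print(f"{k}xSD: called={sorted(called)} DI={di} DO={do}")

# Hypothetical per-locus read totals for a two-person mixture.
ratio = major_minor_ratio([900, 600, 750], [300, 200, 250])
print(f"major/minor = {ratio:.2f}")
```

A stricter multiplier lowers the chance of calling noise as an allele but raises the risk of missing true alleles, which is the trade-off the 1x, 2x, and 3x thresholds are meant to expose.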