"Metrology for Identity and Other Nominal Properties": presentation by David Duewer, PhD (NIST), at the Standards for Pathogen Identification via NGS (SPIN) workshop, hosted by the National Institute of Standards and Technology in October 2014.
Metrology for Identity and Other Nominal Properties
1. Metrology
for Identity and Other Nominal Properties
David Lee Duewer
Chemical Sciences Division
Materials Measurement Laboratory
National Institute of Standards and Technology
Standards for Pathogen Identification via Next-Generation Sequencing Workshop NIST, 20-Oct-2014
2. When I Say “We”…
Marc Salit
PhD 1985, Analytical chemist
5 y Perkin-Elmer – Instrument Design/Development
24 y NIST “Innovator”
Leader, Genome Scale Measurements Group
Co-Director, NIST/Stanford U. Joint Initiative on Measurements in Biology
Dave Duewer
PhD 1976, Analytical chemist
11 y Monsanto – process & biodiscovery
23 y NIST “Data Jock”
And we take ourselves very seriously…
3. Metrology (Measurement Science)
• Metrology is the stuff needed so data can
support informed decision making.
• in a good world, decisions are informed with data
• which are the results of measurements!
• Calculus of Confidence
• we posit that metrology is the ‘formal’ system
that tells us how well we trust those data
4. Calculus of Confidence
• The tools of metrology:
• Traceability
• Uncertainty
• Validation
• enable this calculus of confidence by
which decisions are informed by
measurement results with established
confidence.
5. Craft
• Metrology is more a craft than a technology
• this doesn’t mean that 7 year apprenticeships are
required!
• it does mean that two different skilled metrologists
might take very different approaches to the same
problem
• but they should both come to largely equivalent
solutions!
• matter of style
• must be defensible
6. The “How Much” Worldview
as seen by chemists/biochemists
7. Tools of the Trade
Workshop on DNA Methods for Quality Control of Botanical Products USP, 23-Oct-2014
“VIM”: www.bipm.org/en/publications/guides/#vim
“GUM”: www.bipm.org/en/publications/guides/#gum
NIST SP 811: www.nist.gov/pml/pubs/sp811/
8. Metrological Traceability
enables comparisons to be made over time and place
[Diagram: traceability chain from the SI unit (amount of substance) down to the Result]
Methods: purity analysis, primary methods, reference methods, routine methods
Materials: high purity primary RM, primary calibration CRM, secondary calibration RM, routine sample
9. Validation
ensures measurement processes are well-understood
• “checks the measurement model”
• tests completeness
• tests assumptions
• helps establish an uncertainty budget
• identifies relevant parameters to keep
under control
• tests scope
10. Metrological Uncertainty
enables meaningful comparison of results
• “how much” results are only useful when
compared
• different results in different places or
measured at different times…
• “comparability over space-and-time”
• Are these results the same?
• is there significant bias?
• Is measurement precision fit-for-purpose?
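One common way to ask "are these results the same?" is the normalized error, which divides the difference between two results by their combined expanded uncertainty. The sketch below is an illustration only: the function names and the example values are hypothetical, and it assumes both expanded uncertainties use the same coverage factor and that the two results are independent.

```python
import math


def normalized_error(x1, U1, x2, U2):
    """Normalized error E_n = |x1 - x2| / sqrt(U1^2 + U2^2).

    x1, x2: the two measurement results (same unit).
    U1, U2: their expanded uncertainties (assumed same coverage factor).
    """
    return abs(x1 - x2) / math.sqrt(U1 ** 2 + U2 ** 2)


def results_agree(x1, U1, x2, U2):
    """Conventional reading: |E_n| <= 1 means no significant difference."""
    return normalized_error(x1, U1, x2, U2) <= 1.0


if __name__ == "__main__":
    # Hypothetical results from two laboratories, in arbitrary units.
    print(normalized_error(10.0, 0.3, 10.2, 0.4))
    print(results_agree(10.0, 0.3, 10.2, 0.4))
    print(results_agree(10.0, 0.3, 11.0, 0.4))
```

Without the uncertainties, the difference between 10.0 and 10.2 cannot be judged at all, which is the point of the slide.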
11. Perhaps NIST’s Best Uncertainty Statement
Dr. C.H. Meyers, on his measurements of the heat capacity of ammonia (circa 1920):
“We think our reported value is good to 1 part in 10,000: we are
willing to bet our own money at even odds that it is correct to 2
parts in 10,000. Furthermore, if by any chance our value is shown
to be in error by more than 1 part in 1000, we are prepared to eat
the apparatus and drink the ammonia.”
Quote from: Doiron T and Stoup J, Uncertainty and Dimensional Calibrations, J Res NIST 1997;102:647-676
http://dx.doi.org/10.6028/jres.102.044
13. Several Different “What”s
• Identification
• “Pure substance” Certified Reference Material (CRM)
• Use/develop convincingly specific methods
• Inclusion
• exclusion
• Define and certify unambiguous “barcode”
• CRMs are expensive
• Verification
• Secondary reference materials (RMs) and controls
• Check “barcode” against CRM
• Can be commercial or home-brew
• Recognition
• Component of a mixture
• Check “barcode” against library
15. Metrological Traceability
enables comparisons to be made over time and place
[Diagram: traceability chain from the Authority down to the Result]
Authority: chemical structure {CAS, IUPAC}, biological nomenclature {ICZN, ICN}
Methods: identification methods, verification methods, recognition methods
Materials: “pure” primary RM, QC and secondary RMs, routine samples
16. Taxonomic Hierarchy
Ginkgo biloba L.
Kingdom Plantae – plantes, Planta, Vegetal, plants
Subkingdom Viridaeplantae – green plants
Infrakingdom Streptophyta – land plants
Division Tracheophyta – vascular plants, tracheophytes
Subdivision Spermatophytina – spermatophytes, seed plants, phanérogames
Infradivision Gymnospermae – gymnosperms, gymnospermes, gimnosperma
Class Ginkgoopsida – ginkgo
Order Ginkgoales
Family Ginkgoaceae
Genus Ginkgo L. – ginkgo
Species Ginkgo biloba L. – maidenhair tree, common ginkgo
en.wikipedia.org/wiki/Ginkgo_biloba
http://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=183269
17. Validation
ensures measurement processes are well-understood
• “checks the measurement model”
• tests if identification criteria fit-for-purpose
• includes everything wanted
• excludes everything else
• (Ideally, this can be done in silico)
• tests if measurements consistent with
identification criteria
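An in silico check of "includes everything wanted, excludes everything else" can be sketched as a criterion applied to two panels of sequences. Everything named here is hypothetical: the motif-based criterion, the panel names, and the toy sequences are placeholders chosen only to show the shape of the test, not a real barcode or a real method.

```python
def validate_criterion(is_target, inclusivity_panel, exclusivity_panel):
    """Apply an identification criterion to two reference panels.

    is_target: function taking a sequence and returning True/False.
    inclusivity_panel: {name: sequence} that the criterion should accept.
    exclusivity_panel: {name: sequence} that the criterion should reject.
    Returns (missed, false_hits): names the criterion got wrong.
    """
    missed = [name for name, seq in inclusivity_panel.items()
              if not is_target(seq)]
    false_hits = [name for name, seq in exclusivity_panel.items()
                  if is_target(seq)]
    return missed, false_hits


# Hypothetical criterion: the sequence must contain a made-up marker motif.
def contains_marker(seq, motif="ACGTTG"):
    return motif in seq.upper()


if __name__ == "__main__":
    wanted = {"target_1": "TTACGTTGAA", "target_2": "ggacgttgcc"}
    unwanted = {"relative_1": "TTACGATGAA", "relative_2": "CCCCCCCCCC"}
    missed, false_hits = validate_criterion(contains_marker, wanted, unwanted)
    print("missed:", missed)
    print("false hits:", false_hits)
```

A criterion is fit-for-purpose on these panels only when both returned lists are empty.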
18. Specificity Validation Design
Chloroplast DNA sequences from authenticated Ginkgo biloba samples are used to establish inclusivity
Chloroplast DNA sequences from close relatives are used to establish exclusivity
LaBudde, R.; Harnly, J.M.; Probability of Identification (POI): A Statistical Model for the Validation of Qualitative Botanical Identification Methods.
Journal of AOAC International, Vol. 95, pp. 273–285 (2012).
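The probability of identification is estimated as the fraction of replicate tests that return a positive identification. The sketch below pairs that estimate with a Wilson score interval, which is one common choice for a binomial proportion; the choice of interval, the function names, and the replicate counts are assumptions for illustration, not details taken from the cited paper.

```python
import math


def poi(identified, replicates):
    """Point estimate: fraction of replicates identified as the target."""
    if replicates <= 0:
        raise ValueError("replicates must be positive")
    return identified / replicates


def wilson_interval(identified, replicates, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95 %)."""
    p = poi(identified, replicates)
    n = replicates
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Hypothetical design: 12 replicates each of a target and a relative.
    for label, hits in (("inclusivity (target)", 12),
                        ("exclusivity (relative)", 0)):
        low, high = wilson_interval(hits, 12)
        print(label, poi(hits, 12), round(low, 3), round(high, 3))
```

Even a perfect 12 out of 12 leaves a wide interval, which shows how the number of replicates limits the confidence that can be claimed.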
https://www-s.nist.gov/srmors/view_cert.cfm?srm=3246
20. Metrological Confidence
enables meaningful interpretation of results
• “what” results are only useful when
• The same “things” can be compared
• “measurand” is the metrology-speak term
• Are these barcodes the same?
• how confident are you in the result?
• essential part of being able to compare!
21. Defining “Confidence”
“Where uncertainty is assessed qualitatively, it is characterised by
providing a relative sense of the amount and quality of evidence
(that is, information from theory, observations or models indicating
whether a belief or proposition is true or valid) and the degree of
agreement… This approach is used by WG III through a series of
self-explanatory terms such as: high agreement, much evidence;
high agreement, medium evidence; medium agreement, medium
evidence; etc.”
Climate Change 2007: Synthesis Report
www.ipcc.ch/publications_and_data/ar4/syr/en/contents.html
22. “Confidence”: NIST’s Initial Definitions
DNA Sequence
via Sanger sequencing
23. On Further Thought…
• Highest confidence
• sufficient evidence
• no ambiguities or contradictions
• Very confident
• sufficient evidence
• all ambiguities unambiguously resolved
• Confident
• sufficient evidence
• all ambiguities “understood”
• but insufficient evidence to prove it
• Insufficient evidence to Certify
[Flowchart: Acquire Evidence, then the decision nodes Sufficient?, Unambiguous?, Resolved?, Understood? (each with Yes and No branches), leading to the confidence outcomes Highest, Very, Confident, Maybe, No]
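The confidence ladder on this slide can be read as a short chain of yes/no questions. The sketch below encodes the four bulleted levels; the function name and argument names are hypothetical, and one branch is an assumption: the bullets do not say what happens when evidence is sufficient but the ambiguities are not even understood, so that case is mapped here to "Insufficient evidence to certify".

```python
from enum import Enum


class Confidence(Enum):
    HIGHEST = "Highest confidence"
    VERY = "Very confident"
    CONFIDENT = "Confident"
    INSUFFICIENT = "Insufficient evidence to certify"


def grade(sufficient, ambiguous, resolved=False, understood=False):
    """Walk the decision questions in the order given on the slide.

    sufficient: is there sufficient evidence?
    ambiguous:  are there any ambiguities or contradictions?
    resolved:   have all ambiguities been unambiguously resolved?
    understood: are all ambiguities at least "understood"?
    """
    if not sufficient:
        return Confidence.INSUFFICIENT
    if not ambiguous:
        return Confidence.HIGHEST
    if resolved:
        return Confidence.VERY
    if understood:
        return Confidence.CONFIDENT
    # Assumption: not covered by the slide's bullets.
    return Confidence.INSUFFICIENT


if __name__ == "__main__":
    print(grade(True, False).value)
    print(grade(True, True, resolved=True).value)
    print(grade(True, True, understood=True).value)
    print(grade(False, False).value)
```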
24. Who Defines “Sufficient”?
You!
and the rest of the experts within your community
25. Criteria for Identification of Seized Drugs
SWGDRUG Recommendations:
If one technique from A, then one other (A, B, or C).
If no techniques from A, then three others (two from B).
Category A                              | Category B                 | Category C
Infrared Spectroscopy                   | Capillary Electrophoresis  | Color Tests
Mass Spectrometry                       | Gas Chromatography         | Fluorescence Spectroscopy
Nuclear Magnetic Resonance Spectroscopy | Ion Mobility Spectrometry  | Immunoassay
Raman Spectroscopy                      | Liquid Chromatography      | Melting Point
X-ray Diffractometry                    | Microcrystalline Tests     | Ultraviolet Spectroscopy
                                        | Pharmaceutical Identifiers |
                                        | Thin Layer Chromatography  |
http://www.swgdrug.org/approved.htm
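The two rules on this slide are simple enough to state as a check on a list of techniques. The sketch below encodes only the slide's two-line summary and category table; the function name is hypothetical, and any further conditions in the full SWGDRUG recommendations are deliberately not modelled.

```python
CATEGORY = {
    "Infrared Spectroscopy": "A",
    "Mass Spectrometry": "A",
    "Nuclear Magnetic Resonance Spectroscopy": "A",
    "Raman Spectroscopy": "A",
    "X-ray Diffractometry": "A",
    "Capillary Electrophoresis": "B",
    "Gas Chromatography": "B",
    "Ion Mobility Spectrometry": "B",
    "Liquid Chromatography": "B",
    "Microcrystalline Tests": "B",
    "Pharmaceutical Identifiers": "B",
    "Thin Layer Chromatography": "B",
    "Color Tests": "C",
    "Fluorescence Spectroscopy": "C",
    "Immunoassay": "C",
    "Melting Point": "C",
    "Ultraviolet Spectroscopy": "C",
}


def meets_minimum(techniques):
    """Check a set of techniques against the slide's two rules.

    Rule 1: one technique from A plus at least one other (A, B, or C).
    Rule 2: no technique from A: at least three, two of them from B.
    Raises KeyError for a technique that is not in the table.
    """
    distinct = set(techniques)
    categories = [CATEGORY[name] for name in distinct]
    if categories.count("A") >= 1:
        return len(distinct) >= 2
    return len(distinct) >= 3 and categories.count("B") >= 2


if __name__ == "__main__":
    print(meets_minimum(["Mass Spectrometry", "Color Tests"]))
    print(meets_minimum(["Gas Chromatography", "Color Tests",
                         "Melting Point"]))
```

Repeating the same technique does not count twice, which is why the check works on the set of distinct names.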
26. Barcode of Life: Standards and Guidelines
www.barcodeoflife.org/content/resources/standards-and-guidelines
2.D.ii In November 2009, CBOL approved rbcL and matK as the
barcode regions for vascular plants. They are defined
relative to the Arabidopsis thaliana chloroplast NC_000932
sequence annotation as follows: the rbcL barcode region is
at the 5' end of the rbcL gene between bp1-599 (27-579
excluding primer sequences); the matK barcode region is
between bp205-1046 (227-1019 excluding primer
sequences).
4.C In deciding whether a record will be repeatable and reliable
for species identification, submitters should select as
potential BARCODE records only those for which the contig
was based on bi-directional coverage with non-N base calls
at no less than 40% of the reported sequence. As described
below (5D), CBOL can direct GenBank (or another INSDC
member) to remove the BARCODE designation from
records which have all required elements (1A-I) but have
been shown to be unreliable for species identification due
to low sequence quality and coverage.
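Criterion 4.C, as quoted on this slide, can be sketched as a fraction of contig positions with a non-N call on both strands. The data representation is an assumption made for illustration: each read is given as a string aligned to the contig, with "N" for an ambiguous call and "-" for no coverage, and the function names are hypothetical. The 40% threshold is the value quoted on the slide.

```python
NOT_CALLED = {"N", "-"}  # assumed symbols for ambiguous call / no coverage


def bidirectional_fraction(forward, reverse):
    """Fraction of positions with a non-N call in both directions.

    forward, reverse: equal-length strings aligned to the contig.
    """
    if len(forward) != len(reverse):
        raise ValueError("reads must be aligned to the same length")
    if not forward:
        return 0.0
    both = sum(1 for f, r in zip(forward.upper(), reverse.upper())
               if f not in NOT_CALLED and r not in NOT_CALLED)
    return both / len(forward)


def meets_4c(forward, reverse, threshold=0.40):
    """True when bi-directional non-N coverage reaches the threshold."""
    return bidirectional_fraction(forward, reverse) >= threshold


if __name__ == "__main__":
    # Toy 10-base contig: 4 positions are called on both strands.
    fwd = "ACGTNNNN--"
    rev = "ACGTACGTAC"
    print(bidirectional_fraction(fwd, rev))
    print(meets_4c(fwd, rev))
```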
27. Recent Work in “What” Metrology
Chemical Identification and its Quality Assurance
Boris L. Milman
D.I. Mendeleyev Institute for Metrology, St. Petersburg, Russia
January 12, 2011 Springer, 281 pages, English
“Unlike analytical techniques for qualitative and quantitative
determinations, well-presented in books and reviews,
theoretical principles of identification and general
experimental approaches to its implementation have not
received comprehensive treatment in the literature.”
28. Thank you for your attention!