U.S. Food and Drug Administration 
Institute for Genome Sciences 
Development of FDA MicroDB: 
A Regulatory-Grade 
Microbial Reference Database 
Heike Sichtig, Ph.D. 
Division of Microbiology Devices 
OIR/CDRH/FDA/HHS 
Heike.Sichtig@fda.hhs.gov 
Luke Tallon 
Genomics Resource Center 
Institute for Genome Sciences, UMSOM 
ljtallon@som.umaryland.edu 
NIST Workshop to Identify Standards Needed to Support Pathogen Identification 
via Next-Generation Sequencing, NIST, MD 
October 21-22, 2014
Microbial NGS-Based Diagnostic Devices 
• OIR/DMD working on a fast-tracked Draft Guidance 
• Held a public workshop on April 1, 2014: 
“Advancing Regulatory Science for High Throughput Sequencing 
Devices for Microbial Identification and Detection of Antimicrobial 
Resistance Markers” [FR Doc No: 2014-04940] 
• Workshop agenda, discussion paper and webcast online: 
http://www.fda.gov/MedicalDevices/NewsEvents/WorkshopsConferences/ucm386967.htm 
Objectives: 
1. Streamline/shorten clinical trials for microbial diagnosis/identification 
2. Establish a new comparator algorithm for assays developed using this 
new technology 
3. Develop regulatory science standards for microbial genome sequencing 
4. Investigate the regulatory science required for antimicrobial resistance 
determination through microbial genome sequence information.
Inter-Agency Working Group on Feasibility 
Approach: 
• Formed a diverse working group: FDA, NIH-NCBI, NIAID, DTRA, 
LLNL, and CDC 
• Conducting a small pilot study to generate information for evaluating the 
quality of existing sequences in the public domain (in progress) 
• Identify the pre-existing high-quality deposits, and build from 
there 
• Will use this information to set the quality bar for sequence outputs from 
our ongoing sequencing efforts 
• Use existing standards (if available) for technical and isolate 
metadata; no need to reinvent them 
• Attention given to connecting antimicrobial resistance 
phenotype to genomic deposits – clinical collection site
Looking ahead: Predictions for Reference Databases 
– Multiple levels of Reference DBs likely 
• “High quality” genomes only 
– For validation and clinical use 
• “High quality” + other available genomes 
– For testing and development 
• Requires definition of “high quality” that must include 
some draft genomes 
– Extensive screening required 
• Human and other hosts; chimeras 
• Artificial constructs 
– Separate bacterial, viral, fungal reference DBs 
– Publicly available (NCBI/EMBL/DDBJ) 
Courtesy of Tom Slezak
Current Need 
Robust, Standardized, and High Quality Microbial 
Sequence Database in the Public Sector 
Cover illustration 
(Copyright © 2009, American Society 
for Microbiology. All Rights Reserved.) 
• Representative Samples 
• Metadata 
• High quality raw sequences 
• Assemblies 
• Annotation 
• Public Domain
Latest NCBI GenBank Report on Bacterial Genome Growth 
[Figure: "Bacterial Genomes Report". Count (0 to 25,000) plotted against date (Jul-98 to Dec-14), with two series: #Genomes and #Real Species.] 
Courtesy of NCBI
Microbial Reference Database (MicroDB) ($1.67M) 
(Funding awarded by FDA/OCET) 
• Identify “gaps” and target sequencing efforts 
• All raw reads, assemblies, annotations, metadata sent to NCBI and 
accessible to the PUBLIC 
• Traceable results that could be reevaluated as necessary 
>600 Clinically Relevant and MCM Microorganisms 
Highly Controlled and Documented Approach 
Collaborations with Clinical Labs and Repositories 
• Children’s National Hospital 
• DoD Critical Reagents Program (CRP, USAMRIID) 
• FDA-CFSAN, FDA-CBER, FDA-CDER 
• DHS National Biodefense Analysis and 
Countermeasures Center (NBACC) 
• The Rockefeller University 
• Culture Collections: ATCC, DSMZ 
Sequencing Center (UMD IGS) 
• Hybrid Approach (PacBio and Illumina) 
• Deposit of Raw Reads at NCBI (SRA) 
• Deposit of Assemblies at NCBI 
• Deposit of Annotations at NCBI 
• FDA Interface to Access Data
MicroDB Requirements 
A. Extracted Genomic DNA (gDNA) 
– Extracted gDNA should be of high quality and purity, and at sufficient concentration to 
achieve a suitable yield to assure adequate depth and breadth of genomic coverage for 
the type of sequencing method employed. 
B. BioSample Metadata 
– A minimal description of the isolate source material is necessary for traceability. We are 
using 14 descriptors as outlined below. (Note: Minimal metadata is modeled in part after 
NCBI’s minimal pathogen template) 
– Unique ID, organism, strain/isolate, sample site, specimen type, host disease, collection 
date, collected by, patient age, gender, geographic location, AST method*, AST method 
manufacturer*, Antimicrobial Susceptibilities* 
C. Sequencing Data 
– The minimum requirement for sequencing data is that the generated raw reads should be 
deposited in NCBI’s Sequence Read Archive (SRA) and assemblies should be deposited 
at NCBI’s Assembly division. The availability of raw reads and assemblies will provide a 
pathway to re-analyze the data as newer technologies emerge. Furthermore, annotation 
data should be deposited when available. 
– Raw reads, assemblies, annotations* 
*not used as a criterion for exclusion
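The descriptor list in item B can be read as a simple completeness check. The following is a minimal sketch, assuming that a record is excluded only when a non-starred descriptor is missing; the key names, the function names, and the example record are hypothetical and are not part of any FDA or NCBI tool.

```python
# Sketch only: key names are illustrative spellings of the 14 descriptors in item B.
REQUIRED = [
    "unique_id", "organism", "strain_isolate", "sample_site", "specimen_type",
    "host_disease", "collection_date", "collected_by", "patient_age", "gender",
    "geographic_location",
]
# Starred descriptors: recorded when available, not used as exclusion criteria.
OPTIONAL = ["ast_method", "ast_method_manufacturer", "antimicrobial_susceptibilities"]


def check_biosample(record):
    """Return (missing_required, missing_optional) for one metadata record."""
    def is_missing(key):
        value = record.get(key)
        return value is None or str(value).strip() == ""

    missing_required = [k for k in REQUIRED if is_missing(k)]
    missing_optional = [k for k in OPTIONAL if is_missing(k)]
    return missing_required, missing_optional


def is_excluded(record):
    """Assumption: exclude a record only if a non-starred descriptor is missing."""
    missing_required, _ = check_biosample(record)
    return bool(missing_required)


# Hypothetical record with placeholder values for every non-starred descriptor.
example = {key: "placeholder" for key in REQUIRED}
print(is_excluded(example))          # AST fields absent, record still accepted
print(check_biosample(example)[1])   # lists the three starred descriptors
```

Keeping the starred descriptors in a separate list makes it easy to report them as gaps without rejecting the isolate.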
MicroDB Requirements 
D. Sequencing Metadata 
– A minimal description of the sequencing process is necessary for traceability. We are 
using 7 descriptors as outlined below including bioinformatics tool information for assembly 
and annotation, and genomic coverage information. 
– Library, platform, submitted by, fold coverage, pipeline, assembler, annotation tool* 
E. Suggested phenotypic metadata* 
– A description of the phenotypic information is suggested to create a link between the 
phenotypic traits of particular organisms and their genomic sequence. We are 
recommending 5 descriptors as outlined below (1-4 are also included in sections B and C). 
– Annotation, AST method, AST method manufacturer, antimicrobial susceptibilities, 
additional phenotypic data 
*not used as a criterion for exclusion
NCBI Submission Cases 
1. Children’s National Medical Center 
– Submit all data when available 
– Register sample metadata via BioSample 
– Submit raw reads and assemblies generated by IGS when available 
2. FDA/CFSAN 
– Collaborative agreement: Wait for genome announcements 
– Follow same procedures as for 1 and put a ‘6 month hold’ to 
release data, lift hold when genome announcements are out 
3. Rockefeller University 
– Collaborative agreement: Wait for publication 
– Follow same procedures as for 1 and put a ‘6 month hold’ to 
release data, lift hold when publication is out 
Similar agreements in place with other collaborators depending 
on their needs 
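The three submission cases differ only in when the data become public. The following is a minimal sketch of that rule, assuming the ‘6 month hold’ expires on its own if no genome announcement or publication appears first; the slide does not state what happens at expiry, and the function names are hypothetical.

```python
import calendar
from datetime import date

HOLD_MONTHS = 6  # length of the hold named on the slide


def add_months(start, months):
    """Shift a date forward by whole months, clamping to the month's last day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def release_date(submitted, published=None, hold=True):
    """Return the date on which a submission becomes public.

    hold=False models case 1 (release at submission).
    hold=True models cases 2 and 3: the hold is lifted at publication,
    or (assumption) expires six months after submission.
    """
    if not hold:
        return submitted
    hold_expiry = add_months(submitted, HOLD_MONTHS)
    if published is not None and published < hold_expiry:
        return max(published, submitted)
    return hold_expiry
```

For example, under these assumptions a submission dated 2014-10-01 with a publication on 2014-12-15 would be released on 2014-12-15, and one with no publication would be released on 2015-04-01.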
Project Approach 
• Sequencing in large batches 
– Illumina HiSeq paired-end sequencing: >200x 
– PacBio long-insert SMRT P4-C2 sequencing: >80-100x 
• Assembly 
– PacBio only (HGAP, PBcR CA) 
– Illumina only (CA, MaSuRCA) 
– PacBio/Illumina hybrid (CA) 
– Minimal manual QA/QC & curation 
• Automated Annotation 
• Base modification detection 
• Raw reads -> NCBI SRA 
• Assembled & annotated genomes -> GenBank 
– NCBI BIOPROJECT ID: PRJNA231221 
• FDA Web interface to aggregate data
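The coverage targets above follow the usual definition of fold coverage: total sequenced bases divided by genome size. The following is a minimal sketch of that arithmetic; the 100 bp read length is an assumption chosen for illustration (the slides do not give read lengths), and the 2.8 Mbp genome size is the Staphylococcus aureus figure from the Batch 1 slide.

```python
import math


def fold_coverage(total_bases, genome_size_bp):
    """Fold coverage = total sequenced bases / genome size."""
    return total_bases / genome_size_bp


def reads_needed(target_coverage, genome_size_bp, read_length_bp):
    """Minimum number of reads of a given length to reach a coverage target."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


GENOME_SIZE_BP = 2_800_000   # S. aureus, from the Batch 1 slide
READ_LENGTH_BP = 100         # assumed read length, not from the slides

# Reads required to reach the 200x Illumina target under these assumptions.
print(reads_needed(200, GENOME_SIZE_BP, READ_LENGTH_BP))
```

For paired-end data each pair contributes two reads, so the number of pairs is half the read count returned here.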
Progress - Batch 1 
Rockefeller (50) 
• Uniform sample set 
– Staphylococcus aureus 
– 2.8 Mbp genome size 
– 32.8 %GC 
– Significant metadata 
CNH/CFSAN (41) 
• Diverse sample set 
– 18 genera represented 
– 2 – 8 Mbp genome size range 
– 38 – 67 %GC range 
Images: Wikimedia Commons 
NCBI BioProject: PRJNA231221
Rockefeller Samples 
• Sequencing 
– Avg Illumina cvg: 578x 
– Avg PacBio cvg: 185x 
– 1 or 2 SMRT cells each 
• Assembly: 
– 32 of 50 in single contig chromosome 
– Average contig count = 5 
– “Best” assembly: 
• HGAP = 29 
• CA hybrid = 21 
• Most differences subtle 
• Annotation complete 
• Final QC & data submissions underway
CNH/CFSAN Samples 
• Sequencing 
– Avg Illumina cvg: 315x 
– Avg PacBio cvg: 167x 
• 2 SMRT cells each 
• Assembly 
– 12 of 41 in single contig chromosome 
• 29 in <= 5 contigs 
– Avg contig count = 4.5 
– Median contig count = 3 
– “Best” assembly (of 41): 
• HGAP = 24 
• PBcR CA = 14 
• CA hybrid = 3 
• Annotation underway
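The assembly figures for the two Batch 1 sample sets can be put side by side with a few lines of arithmetic. The following is a minimal sketch that uses only the counts reported on the Rockefeller and CNH/CFSAN slides; the dictionary layout and the `summarize` helper are hypothetical.

```python
# Counts copied from the Rockefeller and CNH/CFSAN sample slides.
batches = {
    "Rockefeller": {
        "samples": 50,
        "single_contig": 32,
        "best": {"HGAP": 29, "CA hybrid": 21},
    },
    "CNH/CFSAN": {
        "samples": 41,
        "single_contig": 12,
        "best": {"HGAP": 24, "PBcR CA": 14, "CA hybrid": 3},
    },
}


def summarize(batch):
    """Derive the single-contig rate and check the 'best assembly' tally."""
    n = batch["samples"]
    return {
        "single_contig_pct": round(100 * batch["single_contig"] / n, 1),
        "best_tally_matches": sum(batch["best"].values()) == n,
        "top_assembler": max(batch["best"], key=batch["best"].get),
    }


for name, batch in batches.items():
    print(name, summarize(batch))
```

The tally check confirms that the “best” assembly counts on each slide add up to the number of samples in that set.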
Assembly QC & Curation 
[Figure: two dot plots of ROCK_290 assemblies against reference gi|374362062|gb|CP003033.1|. Axes show position (0 to 2,500,000 bp); the scale shows %similarity (0 to 100).] 
ROCK_290 Celera8 ctg vs. ref (CA8 – Ill/PB hybrid; query ctg7180000000002) 
– Largest Ctg Len: 2,759,091 bp 
– Total asm Ctg Len: 2,770,822 bp 
ROCK_290 HGAP2 ctg vs. ref (HGAP2; several query contigs with "|quiver" labels that overlap in the source) 
– Largest Ctg Len: 2,128,476 bp 
– Total asm Ctg Len: 2,802,621 bp
Assembly QC & Curation 
[Figure: dot plot of scf7180000000002|quiver against reference gi|595636499|gb|CP007454.1|. Axes show position (0 to 2,500,000 bp); the scale shows %similarity (0 to 100).] 
HGAP2 
– Largest Ctg Len: 2,764,709 bp 
– Total asm Ctg Len: 2,764,709 bp 
4bp overlap? 
– 1X coverage TAAC 
– 1X coverage TAGC
Challenges & Opportunities 
• Sample acquisition & quality 
• Efficiency/throughput vs. accuracy/quality 
– Sequencing strategy 
– Assembly QA/QC & curation 
• Ever longer reads! 
– Reduced coverage -> higher efficiency sequencing 
– More “closed” genomes! 
• Small plasmids 
– SageELF & Illumina
Thank You 
FDA Micro Team 
Peyton Hobson, Brittany Goldberg, Kevin Snyder, Tamara Feldblyum, Uwe Scherf, Sally Hojvat 
Collaborators 
LLNL 
Tom Slezak 
NIH-NCBI 
Bill Klimke, Martin Shumway, David Lipman 
NIH-NIAID 
Vivien Dugan, Maria Giovani 
DTRA 
Matt Tobelmann, Chris Detter, Eric 
VanGieson, Nels Olsen 
CDC 
Duncan MacCannell 
FDA-CFSAN 
Maria Hoffmann, Cary Pirone, Andrea 
Ottessen, Marc Allard, Eric Brown 
NMRC 
Kim Bishop-Lilly, Ken Frey 
IGS@UMD 
Lisa Sadzewicz, Luke Tallon, Naomi 
Sengamalay, Al Godinez, Sandy 
Ott, Sushma Nagaraj, Claire Fraser 
Rockefeller University 
Bryan Utter, Douglas Deutsch 
Children’s National Medical Center 
Brittany Goldberg, Joseph Campos 
DOD-CRP 
Shanmuga Sozhamannan, Mike Smith 
DOD-USAMRIID 
Tim Minogue 
NBACC 
Adam Phillippy, Nick Bergman 
ATCC 
Liz Kerrigan 
DSMZ 
Cathrin Sproer

Zoology 5th semester notes( Sumit_yadav).pdf
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspects
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Stages in the normal growth curve
Stages in the normal growth curveStages in the normal growth curve
Stages in the normal growth curve
 
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 

Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database

  • 1. U.S. Food and Drug Administration / Institute for Genome Sciences. Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database. Heike Sichtig, Ph.D., Division of Microbiology Devices, OIR/CDRH/FDA/HHS, Heike.Sichtig@fda.hhs.gov. Luke Tallon, Genomics Resource Center, Institute for Genome Sciences, UMSOM, ljtallon@som.umaryland.edu. October 21-22, 2014. NIST Workshop to Identify Standards Needed to Support Pathogen Identification via Next-Generation Sequencing, NIST, MD
  • 2. Microbial NGS-Based Diagnostic Devices • OIR/DMD is working on a fast-tracked Draft Guidance • Held a Public Workshop on April 1, 2014: “Advancing Regulatory Science for High Throughput Sequencing Devices for Microbial Identification and Detection of Antimicrobial Resistance Markers” [FR Doc No: 2014-04940] • Workshop agenda, discussion paper and webcast online: http://www.fda.gov/MedicalDevices/NewsEvents/WorkshopsConferences/ucm386967.htm Objectives: 1. Streamline/shorten clinical trials for microbial diagnosis/identification 2. Establish a new comparator algorithm for assays developed using this new technology 3. Develop regulatory science standards for microbial genome sequencing 4. Investigate the regulatory science required for antimicrobial resistance determination through microbial genome sequence information.
  • 3. Inter-Agency Working Group on Feasibility Approach: • Formed a diverse working group: FDA, NIH-NCBI, NIAID, DTRA, LLNL, and CDC • Conducted a small pilot study to generate information for evaluating the quality of existing sequences in the public domain (In Progress) • Identify the pre-existing high-quality deposits, and build from there • Will use this information to set the quality bar for sequence outputs from our ongoing sequencing efforts • Utilized existing standards (if available) for technical and isolate metadata – no need to re-invent • Attention given to connecting antimicrobial resistance phenotype to genomic deposits – clinical collection site
  • 4. Looking ahead: Predictions for Reference Databases – Multiple levels of Reference DBs likely • “High quality” genomes only – For validation and clinical use • “High quality” + other available genomes – For testing and development • Requires definition of “high quality” that must include some draft genomes – Extensive screening required • Human and other hosts; chimeras • Artificial constructs – Separate bacterial, viral, fungal reference DBs – Publicly available (NCBI/EMBL/DDBJ) Courtesy of Tom Slezak
  • 5. Current Need: Robust, Standardized, and High Quality Microbial Sequence Database in the Public Sector • Representative Samples • Metadata • High quality raw sequences • Assemblies • Annotation • Public Domain Cover illustration (Copyright © 2009, American Society for Microbiology. All Rights Reserved.)
  • 6. Latest NCBI GenBank Report on Bacterial Genome Growth [Chart: Bacterial Genomes Report; x-axis: Date (Jul-98 to Dec-14); y-axis: Count (0 to 25,000); series: #Genomes, #Real Species] Courtesy of NCBI
  • 7. Microbial Reference Database (MicroDB) ($1.67M) (Funding awarded by FDA/OCET) >600 Clinically Relevant and MCM Microorganisms. Highly Controlled and Documented Approach • Identify “gaps” and target sequencing efforts • All raw reads, assemblies, annotations, metadata sent to NCBI and accessible to the PUBLIC • Traceable results that could be reevaluated as necessary Collaborations with Clinical Labs and Repositories • Children’s National Hospital • DoD Critical Reagents Program (CRP, USAMRIID) • FDA-CFSAN, FDA-CBER, FDA-CDER • DHS National Biodefense Analysis and Countermeasures Center (NBACC) • The Rockefeller University • Culture Collections: ATCC, DSMZ Sequencing Center (UMD IGS) • Hybrid Approach (PacBio and Illumina) • Deposit of Raw Reads at NCBI (SRA) • Deposit of Assemblies at NCBI • Deposit of Annotations at NCBI • FDA Interface to Access Data
  • 8. MicroDB Requirements A. Extracted Genomic DNA (gDNA) – Extracted gDNA should be of high quality and purity, and at a concentration sufficient to yield adequate depth and breadth of genomic coverage for the sequencing method employed. B. BioSample Metadata – A minimal description of the isolate source material is necessary for traceability. We are using 14 descriptors as outlined below. (Note: Minimal metadata is modeled in part after NCBI’s minimal pathogen template) – Unique ID, organism, strain/isolate, sample site, specimen type, host disease, collection date, collected by, patient age, gender, geographic location, AST method*, AST method manufacturer*, antimicrobial susceptibilities* C. Sequencing Data – At a minimum, the generated raw reads should be deposited in NCBI’s Sequence Read Archive (SRA) and assemblies should be deposited at NCBI’s Assembly division. The availability of raw reads and assemblies will provide a pathway to re-analyze the data as newer technologies emerge. Furthermore, annotation data should be deposited when available. – Raw reads, assemblies, annotations* *not used as a criterion for exclusion
  • 9. MicroDB Requirements D. Sequencing Metadata – A minimal description of the sequencing process is necessary for traceability. We are using 7 descriptors as outlined below, including bioinformatics tool information for assembly and annotation, and genomic coverage information. – Library, platform, submitted by, fold coverage, pipeline, assembler, annotation tool* E. Suggested phenotypic metadata* – A description of the phenotypic information is suggested to create a link between the phenotypic traits of particular organisms and their genomic sequence. We are recommending 5 descriptors as outlined below (1-4 are also included in sections B and C). – Annotation, AST method, AST method manufacturer, antimicrobial susceptibilities, additional phenotypic data *not used as a criterion for exclusion
  • 10. NCBI Submission Cases 1. Children’s National Medical Center – Submit all data when available – Register sample metadata via BioSample – Submit raw reads and assemblies generated by IGS when available 2. FDA/CFSAN – Collaborative agreement: Wait for genome announcements – Follow the same procedures as for 1 and place a ‘6 month hold’ on data release; lift the hold when genome announcements are out 3. Rockefeller University – Collaborative agreement: Wait for publication – Follow the same procedures as for 1 and place a ‘6 month hold’ on data release; lift the hold when the publication is out Similar agreements are in place with other collaborators depending on their needs
  • 11. Project Approach • Sequencing in large batches – Illumina HiSeq paired-end sequencing: >200x – PacBio long-insert SMRT P4-C2 sequencing: >80-100x • Assembly – PacBio only (HGAP, PBcR CA) – Illumina only (CA, MaSuRCA) – PacBio/Illumina hybrid (CA) – Minimal manual QA/QC & curation • Automated Annotation • Base modification detection • Raw reads -> NCBI SRA • Assembled & annotated genomes -> GenBank – NCBI BIOPROJECT ID: PRJNA231221 • FDA Web interface to aggregate data
  • 12. Progress - Batch 1 Rockefeller (50) • Uniform sample set – Staphylococcus aureus – 2.8 Mbp genome size – 32.8 %GC – Significant metadata CNH/CFSAN (41) • Diverse sample set – 18 genera represented – 2-8 Mbp genome size range – 38-67 %GC range Image credits: Wikimedia Commons NCBI BioProject: PRJNA231221
  • 13. Rockefeller Samples • Sequencing – Avg Illumina cvg: 578x – Avg PacBio cvg: 185x – 1 or 2 SMRT cells each • Assembly: – 32 of 50 in single contig chromosome – Average contig count = 5 – “Best” assembly: • HGAP = 29 • CA hybrid = 21 • Most differences subtle • Annotation complete • Final QC & data submissions underway
  • 14. CNH/CFSAN Samples • Sequencing – Avg Illumina cvg: 315x – Avg PacBio cvg: 167x • 2 SMRT cells each • Assembly – 12 of 41 in single contig chromosome • 29 in <= 5 contigs – Avg contig count = 4.5 – Median contig count = 3 – “Best” assembly (of 41): • HGAP = 24 • PBcR CA = 14 • CA hybrid = 3 • Annotation underway
  • 15. Assembly QC & Curation [Dot plot: ROCK_290 Celera8 ctg vs. ref; reference: gi|374362062|gb|CP003033.1|; query: ctg7180000000002; color scale: %similarity] CA8 – Ill/PB hybrid: Largest Ctg Len: 2,759,091 bp; Total asm Ctg Len: 2,770,822 bp [Dot plot: ROCK_290 HGAP2 ctg vs. ref; reference: gi|374362062|gb|CP003033.1|; query: multiple quiver-polished scaffolds (labels overprinted in the source); color scale: %similarity] HGAP2: Largest Ctg Len: 2,128,476 bp; Total asm Ctg Len: 2,802,621 bp
  • 16. Assembly QC & Curation: 4bp overlap? [Dot plot: reference: gi|595636499|gb|CP007454.1|; query: scf7180000000002|quiver; color scale: %similarity] HGAP2: Largest Ctg Len: 2,764,709 bp; Total asm Ctg Len: 2,764,709 bp • 1X coverage TAAC • 1X coverage TAGC
  • 17. Challenges & Opportunities • Sample acquisition & quality • Efficiency/throughput vs. accuracy/quality – Sequencing strategy – Assembly QA/QC & curation • Ever longer reads! – Reduced coverage -> higher efficiency sequencing – More “closed” genomes! • Small plasmids – SageELF & Illumina
  • 18. FDA Micro Team: Peyton Hobson, Brittany Goldberg, Kevin Snyder, Tamara Feldblyum, Uwe Scherf, Sally Hojvat. Collaborators: LLNL: Tom Slezak; NIH-NCBI: Bill Klimke, Martin Shumway, David Lipman; NIH-NIAID: Vivien Dugan, Maria Giovani; DTRA: Matt Tobelmann, Chris Detter, Eric VanGieson, Nels Olsen; CDC: Duncan MacCannell; FDA-CFSAN: Maria Hoffmann, Cary Pirone, Andrea Ottessen, Marc Allard, Eric Brown; NMRC: Kim Bishop-Lilly, Ken Frey; IGS@UMD: Lisa Sadzewicz, Luke Tallon, Naomi Sengamalay, Al Godinez, Sandy Ott, Sushma Nagaraj, Claire Fraser; Rockefeller University: Bryan Utter, Douglas Deutsch; Children’s National Medical Center: Brittany Goldberg, Joseph Campos; DOD-CRP: Shanmuga Sozhamannan, Mike Smith; DOD-USAMRIID: Tim Minogue; NBACC: Adam Phillippy, Nick Bergman; ATCC: Liz Kerrigan; DSMZ: Cathrin Sproer. Thank You
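
The MicroDB Requirements slides (8 and 9) list 14 BioSample descriptors and 7 sequencing descriptors, with starred items not used as exclusion criteria. As a rough illustration only, a completeness check over those lists might look like the sketch below. The snake_case field names, the `missing_required` helper, and the dictionary record format are assumptions made for this example; they are not NCBI attribute names or part of any FDA tooling.

```python
# Descriptors taken from slides 8 and 9. The snake_case keys are illustrative
# names invented for this sketch, not official NCBI BioSample attribute names.
BIOSAMPLE_FIELDS = [
    "unique_id", "organism", "strain_isolate", "sample_site", "specimen_type",
    "host_disease", "collection_date", "collected_by", "patient_age", "gender",
    "geographic_location", "ast_method", "ast_method_manufacturer",
    "antimicrobial_susceptibilities",
]
SEQUENCING_FIELDS = [
    "library", "platform", "submitted_by", "fold_coverage", "pipeline",
    "assembler", "annotation_tool",
]
# Starred on the slides: "not used as a criterion for exclusion".
OPTIONAL_FIELDS = {
    "ast_method", "ast_method_manufacturer",
    "antimicrobial_susceptibilities", "annotation_tool",
}


def missing_required(record, fields):
    """Return the non-optional descriptors that are absent or blank in record."""
    missing = []
    for name in fields:
        if name in OPTIONAL_FIELDS:
            continue
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical record with placeholder values, for demonstration only.
    sample = {name: "placeholder" for name in BIOSAMPLE_FIELDS}
    sample["collection_date"] = ""  # simulate a gap in the metadata
    print(missing_required(sample, BIOSAMPLE_FIELDS))
```

Treating the starred descriptors as optional mirrors the slides' footnote; a real submission pipeline would presumably also validate value formats, which this sketch does not attempt.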