www.citrusgreening.org
Objective 1
Data Integration and Analysis
Genome, annotation and transcriptome
Fifth Annual Meeting
Ft. Pierce, FL
Prashant Hosmani, Mirella Flores, Lukas Mueller and Surya Saha
Boyce Thompson Institute
Psyllid genomics timeline
2014
• Psyllid v1.1 genome
2015
2016
• MCOT de novo transcriptome
• Psyllid annotation OGSv1.0
• Psyllid PacBio genome v1.9
2017
2019
• Psyllid annotation OGSv3.0
• IsoSeq de novo transcriptome
2018
• Psyllid PacBio genome 2.0
• Psyllid annotation OGSv2.0
• Carsonella and Profftella genomes from FL
• Psyllid PacBio genome v3.0
• Wolbachia strains from FL
Manual annotation
https://www.biorxiv.org/content/10.1101/869685v1
17 students among 30 authors
PacBio genome assembly
500 ng input DNA for Dovetail Chicago from a single male psyllid
Duplicated contigs added to alternate assembly
Asian citrus psyllid(ACP) reference genome
v1.1 v2.0
REFERENCE
v3.0
REFERENCE
Number of contigs     161,988    1,906      13 + unplaced
Total bases           485 Mb     498 Mb     474 Mb
Longest               1 Mb       4.2 Mb     50.3 Mb
Contig N50            34.4 Kb    749 Kb     40.5 Mb
Ns                    19.3 Mb    4.5 Mb     13.4 Mb
Complete BUSCO (%)    65.9       75.9       88.3
Repeat (%)            26.37      31.9       30.2
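For readers less familiar with the Contig N50 row, here is a minimal sketch of how N50 is usually computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions, not data from any ACP assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths in bp (hypothetical values for illustration only).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it can rise sharply when contigs are joined into chromosome-scale sequences even if the total number of bases barely changes.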
Chicago and Hi-C
Microbial contamination on Chr09
• Removing first 2.3Mb
• 4-5000X depth of coverage
• Coverage of >90% of endosymbiont genome
Carsonella
Chr09
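One common way to spot a region like this is to compare read depth in fixed-size windows against the genome-wide median. The sketch below illustrates that idea only; it is not necessarily how the contamination on Chr09 was detected. The function name, the 10x threshold, and the toy depths are all assumptions.

```python
from statistics import median


def flag_high_coverage(window_depths, fold=10.0):
    """Return indices of windows whose depth exceeds fold x the median depth."""
    if not window_depths:
        return []
    baseline = median(window_depths)
    return [i for i, depth in enumerate(window_depths) if depth > fold * baseline]


# Hypothetical mean depth per window along one chromosome: the first two
# windows mimic a region dominated by endosymbiont reads.
toy_depths = [4500.0, 4800.0, 60.0, 55.0, 62.0, 58.0, 61.0, 59.0]
print(flag_high_coverage(toy_depths))
```

Flagged windows would still need confirmation, for example by aligning them against the endosymbiont genome, before anything is removed from the assembly.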
Got assembly. Now what?
Comparative genomics
AgriVectors.org
Power of comparative genomics
Species                       Common name                     Genome size   Lead
Cacopsylla pyricola           Pear psylla                     480-485 Mb    Rodney Cooper
Leuronota fagarae             Lime psyllid                    465-483 Mb    Jawwad Qureshi
Bactericera cockerelli        Potato psyllid                  421-426 Mb    Daisy Fu
Pachypsylla venusta           Hackberry petiole gall psyllid  TBD           Nancy Moran
Lygus lineolaris              Tarnished plant bug             TBD           OP Perera
Geocoris pallens              Western big-eyed bug            ~1 Gb         Nick Booster (Rosenheim lab)
Circulifer tenellus           Beet leafhopper                 ~1 Gb         Bob Gilbertson
Macrosteles quadrilineatus    Aster leafhopper                TBD           Astri Wayadande
Graminella nigrifrons         Black-faced leafhopper          TBD           Astri Wayadande
Dalbulus maidis               Maize leafhopper                TBD           Astri Wayadande
Psyllid genome annotation and manual curation using Apollo
Prashant Hosmani
Mueller Lab
Gene prediction overview for OGS v3.0
Repeat masking
• RepeatModeler
• Protein masking
• RepeatMasker
Transcriptome
• RNA-seq: HISAT & StringTie
• Iso-Seq: GMAP & Cupcake ToFU
Mikado
• Portcullis junctions
• StringTie
• Iso-Seq
Maker
• Mikado gene loci
• Portcullis junctions
Functional annotation
• AHRD
• InterProScan
Augustus
GeneMark
Diaphorina citri Apollo annotation editor
Collaboratory system
● Indian River State College (IRSC)
● Kansas State University (KSU)
● University of Cincinnati (UC)
● BTI / Cornell University
More than 40 registered annotators
Log in to Apollo at https://citrusgreening.org/
Pathway-based manual curation
• Development
  • Segmentation
  • Wnt and other signaling pathways
  • Hox genes
• Immune response
• Metabolic and cellular functions
  • Carbohydrate metabolism
  • Chitin metabolism
  • vATPase
  • Chromatin remodeling
• Environmental/Sensory
  • Circadian rhythm
  • Phototransduction
• Reproduction
• 811 curated genes in OGSv3
• 132 updated models from OGSv1 (genome v1.1)
High-quality manually curated genes
Annotation set                 OGS1.0    OGS2.0    OGS3.0    Curated
No. of genes                   19,311    20,793    19,049    811
No. of transcripts             20,966    25,292    21,345    916
Exons per transcript           5.42      7.06      7.29      7.87
Avg. transcript length (bp)    1,317     1,944     2,034     2,503
Avg. exon length (bp)          243       275       279       318
Non-canonical splice sites     6.05%     3.13%     2.47%     1.91%
OGS: Official Gene Set
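The structural columns in this table can be derived from exon coordinates alone. The sketch below shows one way to do so, assuming 1-based inclusive coordinates as in GFF3; the function name and the toy transcripts are hypothetical. The reported values are consistent with average transcript length being roughly exons per transcript times average exon length (for OGS3.0, 7.29 x 279 is about 2,034), which suggests the lengths are spliced lengths.

```python
def annotation_stats(transcripts):
    """Summarize exon structure for a dict of transcript_id -> [(start, end), ...].

    Coordinates are assumed to be 1-based and inclusive, as in GFF3.
    """
    exon_lengths = [
        end - start + 1
        for exons in transcripts.values()
        for start, end in exons
    ]
    n_transcripts = len(transcripts)
    n_exons = len(exon_lengths)
    total_bases = sum(exon_lengths)
    return {
        "exons_per_transcript": n_exons / n_transcripts,
        "avg_exon_length": total_bases / n_exons,
        # Spliced length: sum of exon lengths, introns excluded.
        "avg_transcript_length": total_bases / n_transcripts,
    }


# Hypothetical transcripts for illustration only.
toy = {
    "tx1": [(1, 100), (201, 400)],
    "tx2": [(1, 300)],
    "tx3": [(11, 60), (101, 250), (301, 500)],
}
stats = annotation_stats(toy)
for key, value in stats.items():
    print(f"{key}: {value:.2f}")
```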
Completeness using BUSCO single-copy markers (BUSCO Hemiptera, %)
        Complete    Duplicated    Fragmented    Missing
OGS1    74.5        13.0          0.3           25.2
OGS2    81.6        37.3          0.2           18.2
OGS3    80.2        29.4          0.1           19.7
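In BUSCO reports, the complete category includes both single-copy and duplicated markers, so complete, fragmented and missing should sum to 100%. The sketch below applies that reading to the rows above to derive the single-copy share; the helper name is an assumption and the derived values are arithmetic on the table, not separately reported results.

```python
def busco_summary(complete, duplicated, fragmented, missing):
    """Derive single-copy % and check that the categories sum to 100%."""
    return {
        "single_copy": round(complete - duplicated, 1),
        "total": round(complete + fragmented + missing, 1),
    }


# Percentages copied from the table: complete, duplicated, fragmented, missing.
table = {
    "OGS1": (74.5, 13.0, 0.3, 25.2),
    "OGS2": (81.6, 37.3, 0.2, 18.2),
    "OGS3": (80.2, 29.4, 0.1, 19.7),
}
for name, values in table.items():
    print(name, busco_summary(*values))
```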
PacBio Iso-Seq transcriptome
Mirella Flores
Mueller Lab
De novo transcriptome input datasets
RNA-Seq
• Gut CLas+ and CLas- (Heck lab)
• Male, female, antenna and terminal abdomen (Slupsky lab)
• Salivary glands (Heck lab)
Iso-Seq
• Adult CLas+/CLas-
• Nymph CLas+/CLas-
Workflow
Total transcripts: 60,261
Iso-Seq transcripts and RNA-seq transcripts clustering
Remove contamination (endosymbionts, archaea, viral, bacteria)
RNA-Seq
De novo transcriptome assembly
Genome-based transcripts filtering
Pfam domains coding transcripts filtering
PacBio Iso-Seq pipeline
Illumina data correction
Remove contamination
Filtering by Insecta TrEMBL set
2,197,769
196k
DcDTr (RNA-Seq transcripts): 41,457
DcDTi (Iso-Seq transcripts): 18,804
De novo transcriptome statistics
Genes             40,637
Transcripts       60,261
Average length    1,736.1 bp
Smallest          108 bp
Largest           35,954 bp
N50               3,657 bp
BUSCO (Hemiptera dataset, number of BUSCOs: 3350)
Complete          79.9%
  single-copy     53.2%
  duplicated      26.7%
Fragmented        0.1%
Missing           20%
Future work
Genome and annotation paper 2019-2020
• New v3.0 assembly
• First hemipteran chromosome-length genome assembly
• Curated genes from previous v1.1 and v2.0 assemblies
• Meta-paper with 11-15 sub-papers led by students for each pathway
• Iso-Seq transcriptome
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
Cytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptxCytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptx
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 

Updates on the ACP v3 genome and annotation from USDA NIFA project meeting

  • 1. www.citrusgreening.org Objective 1 Data Integration and Analysis Genome, annotation and transcriptome Fifth Annual Meeting Ft. Pierce, FL Prashant Hosmani, Mirella Flores, Lukas Mueller and Surya Saha Boyce Thompson Institute
  • 3. Psyllid genomics timeline 2014 • Psyllid v1.1 genome 2015 2016 • MCOT de novo transcriptome • Psyllid annotation OGSv1.0 • Psyllid PacBio genome v1.9 2017 2019 • Psyllid annotation OGSv3.0 • Iso-Seq de novo transcriptome 2018 • Psyllid PacBio genome v2.0 • Psyllid annotation OGSv2.0 • Carsonella and Profftella genomes from FL • Psyllid PacBio genome v3.0 • Wolbachia strains from FL Manual annotation
  • 6. Asian citrus psyllid (ACP) reference genome
       500 ng input DNA for Dovetail Chicago from a single male psyllid
       Duplicated contigs added to alternate assembly

       Metric               v1.1       v2.0 (REFERENCE)   v3.0 (REFERENCE)
       Number of contigs    161,988    1,906              13 + unplaced
       Total bases          485 Mb     498 Mb             474 Mb
       Longest              1 Mb       4.2 Mb             50.3 Mb
       Contig N50           34.4 Kb    749 Kb             40.5 Mb
       Ns                   19.3 Mb    4.5 Mb             13.4 Mb
       Complete BUSCO (%)   65.9       75.9               88.3
       Repeat (%)           26.37      31.9               30.2

       Chicago and Hi-C
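The contig N50 row in the table above is the length L such that contigs of length L or longer together hold at least half of the assembled bases. A minimal Python sketch of that calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not the ACP assembly data or the pipeline used for the slides.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running sum of
    lengths, taken from longest to shortest, first reaches half of
    the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, total = 300).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: 80 + 70 = 150 reaches half of 300
```

Because N50 depends only on the length distribution, a jump from 749 Kb to 40.5 Mb is what one would expect when roughly the same number of bases is placed into 13 chromosome-length sequences instead of 1,906 contigs.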
  • 7. Microbial contamination on Chr09 • Removing first 2.3Mb • 4-5000X depth of coverage • Coverage of >90% of endosymbiont genome Carsonella Chr09
  • 8. Got assembly. Now what? Comparative genomics AgriVectors.org
  • 9. Power of comparative genomics

       Species                      Common name                      Genome size   Lead
       Cacopsylla pyricola          Pear psylla                      480-485 Mb    Rodney Cooper
       Leuronota fagarae            Lime psyllid                     465-483 Mb    Jawwad Qureshi
       Bactericera cockerelli       Potato psyllid                   421-426 Mb    Daisy Fu
       Pachypsylla venusta          Hackberry petiole gall psyllid   TBD           Nancy Moran
       Lygus lineolaris             Tarnished plant bug              TBD           OP Perera
       Geocoris pallens             Western big-eyed bug             ~1 Gb         Nick Booster (Rosenheim lab)
       Circulifer tenellus          Beet leafhopper                  ~1 Gb         Bob Gilbertson
       Macrosteles quadrilineatus   Aster leafhopper                 TBD           Astri Wayadande
       Graminella nigrifrons        Black-faced leafhopper           TBD           Astri Wayadande
       Dalbulus maidis              Maize leafhopper                 TBD           Astri Wayadande
  • 10. Psyllid genome annotation and manual curation using Apollo. Prashant Hosmani, Mueller Lab
  • 11. Gene prediction overview for OGS v3.0
       Repeat Masking: RepeatModeler • Protein masking • RepeatMasker
       Transcriptome: RNA-seq (HISAT & StringTie) • Iso-Seq (GMAP & Cupcake ToFU)
       Mikado: Portcullis junctions • StringTie • Iso-Seq
       Maker: Mikado gene loci • Portcullis junctions
       Functional annotation: AHRD • InterProScan
       Augustus, GeneMark
  • 12. Diaphorina citri Apollo annotation editor Collaboratory system ● Indian River State College (IRSC) ● Kansas State University (KSU) ● University of Cincinnati (UC) ● BTI / Cornell University More than 40 registered annotators Login to Apollo at https://citrusgreening.org/
  • 13. Pathway based manual curation • Development • Segmentation • Wnt and other signaling pathways • Hox genes • Immune response • Metabolic and cellular functions • Carbohydrate metabolism • Chitin metabolism • vATPase • Chromatin remodeling • Environmental/Sensory • Circadian rhythm • Phototransduction • Reproduction • 811 curated genes in OGSv3 • 132 updated models from OGSv1 (genome v1.1)
  • 14. High-quality manually curated genes

       Annotation set                OGS1.0    OGS2.0    OGS3.0    Curated
       No. of genes                  19,311    20,793    19,049    811
       No. of transcripts            20,966    25,292    21,345    916
       No. of exons per transcript   5.42      7.06      7.29      7.87
       Avg. transcript length (bp)   1,317     1,944     2,034     2,503
       Avg. exon length (bp)         243       275       279       318
       Non-canonical splice sites    6.05%     3.13%     2.47%     1.91%

       OGS: Official Gene Set
  • 15. Completeness using BUSCO single copy markers

       BUSCO Hemiptera (%)   Complete   Duplicated   Fragmented   Missing
       OGS1                  74.5       13.0         0.3          25.2
       OGS2                  81.6       37.3         0.2          18.2
       OGS3                  80.2       29.4         0.1          19.7
  • 17. De novo transcriptome input datasets
       RNA-Seq: Gut CLas+ and CLas- (Heck lab) • Male, female, antenna and terminal abdomen (Slupsky lab) • Salivary glands (Heck lab)
       Iso-Seq: Adult CLas+/CLas- • Nymph CLas+/CLas-
  • 18. Workflow Total transcripts: 60,261 Iso-Seq transcripts and RNA-seq transcripts clustering Remove contamination (endosymbionts, archaea, viral, bacteria) RNA-Seq De novo transcriptome assembly Genome-based transcript filtering Pfam domain coding transcript filtering PacBio Iso-Seq pipeline Illumina data correction Remove contamination Filtering by Insecta TrEMBL set 2,197,769 196k DcDTr (RNA-Seq transcripts): 41,457 DcDTi (Iso-Seq transcripts): 18,804
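The counts on the workflow slide can be cross-checked with simple arithmetic: the RNA-Seq set (DcDTr) and the Iso-Seq set (DcDTi) should add up to the reported total. The short Python sketch below does only that check; the variable names are illustrative, and it assumes the total is the plain sum of the two sets, which the slide numbers support but do not state explicitly.

```python
# Counts copied from the workflow slide.
dcdtr = 41_457           # DcDTr: RNA-Seq derived transcripts
dcdti = 18_804           # DcDTi: Iso-Seq derived transcripts
reported_total = 60_261  # "Total transcripts" on the slide

combined = dcdtr + dcdti
print(combined, combined == reported_total)  # 60261 True

# Share of the final set contributed by each technology.
rna_seq_share = round(100 * dcdtr / reported_total, 1)
iso_seq_share = round(100 * dcdti / reported_total, 1)
print(rna_seq_share, iso_seq_share)  # 68.8 31.2
```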
  • 19. De novo transcriptome statistics

       Genes            40,637
       Transcripts      60,261
       Average length   1,736.1
       Smallest         108
       Largest          35,954
       N50              3,657 bp

       BUSCO Hemiptera dataset (number of BUSCOs: 3350)
       Complete         79.9%
         single-copy    53.2%
         duplicated     26.7%
       Fragmented       0.1%
       Missing          20%
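BUSCO percentages follow two identities: complete = single-copy + duplicated, and complete + fragmented + missing = 100. The Python sketch below checks the transcriptome figures above against both, with a small tolerance for rounding. The function name `busco_consistent` and the tolerance value are assumptions made for illustration and are not part of the BUSCO software.

```python
def busco_consistent(complete, single, duplicated, fragmented, missing, tol=0.15):
    """Check that a set of BUSCO percentages is internally consistent.

    tol absorbs rounding in percentages reported to one decimal place.
    """
    parts_ok = abs((single + duplicated) - complete) <= tol
    total_ok = abs((complete + fragmented + missing) - 100.0) <= tol
    return parts_ok and total_ok


# Percentages from the de novo transcriptome statistics slide.
print(busco_consistent(79.9, 53.2, 26.7, 0.1, 20.0))  # True

# Approximate marker count behind the "Complete" figure; this is only
# an estimate because the percentage itself is rounded.
approx_complete = round(3350 * 79.9 / 100)
print(approx_complete)  # 2677
```

The same identity holds for the OGS rows on the annotation completeness slide, where the duplicated markers are a subset of the complete ones rather than a separate category.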
  • 21. Genome and annotation paper 2019-2020 • New v3.0 assembly • First hemipteran chromosomal-length genome assembly • Curated genes from previous v1.1 and v2.0 assemblies • Meta-paper with 11-15 sub-papers led by students for each pathway • Iso-Seq transcriptome