Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus Greening (HLB) disease: High quality genomes and an open access integrated systems biology portal
Rapidly spreading invasive diseases in systems with little or no prior experimental data or resources pose a unique set of challenges for growers, scientists and regulators. As part of a USDA NIFA CAPS project focused on the psyllid Diaphorina citri, we have released improved genomics resources, including high quality genome assemblies and annotation. We have also created an open access web portal for analyses around the Citrus Greening/Huanglongbing disease complex. Citrusgreening.org includes pathosystem-wide resources and bioinformatics tools for multiple Citrus spp. hosts, the Asian citrus psyllid vector (ACP, Diaphorina citri), and multiple pathogens including Candidatus Liberibacter asiaticus (CLas). To the best of our knowledge, this is the first database to use the pathosystem as a holistic framework for understanding an insect-transmitted plant disease. Users can submit relevant data sets to enable sharing and allow the community to leverage their data within an integrated system.
The system includes the metabolic pathway databases CitrusCyc and DiaphorinaCyc, with organism-specific pathways that can be used to mine metabolomics, transcriptomics and proteomics results to identify pathways and regulatory mechanisms involved in disease response. The Psyllid Expression Network (PEN) contains expression profiles of ACP genes from multiple life stages, tissues, conditions and hosts. The Citrus Expression Network (CEN) contains public expression data from multiple tissues and conditions for various citrus hosts. All tools connect to a central database. The portal also includes electrical penetration graph (EPG) recordings, information about citrus rootstock trials, and metabolomics data in addition to traditional omics data types, with the goal of combining and mining all information related to the Huanglongbing pathosystem. User-friendly manual curation tools will allow continuous improvement of the knowledge base as more experimental research is published. The portal can be accessed at https://citrusgreening.org/.
1. www.citrusgreening.org
Infrastructure for battling the Citrus Greening disease: High quality genomes and an integrated systems biology portal
Surya Saha
Boyce Thompson Institute, Ithaca, New York, USA
ss2489@cornell.edu | @SahaSurya
Feb 3rd, 2020
4. Citrus Greening: Huanglongbing
• Most significant disease of citrus worldwide. 100% infection in Florida now
• More than $5 billion in lost citrus production and more than 10,000 lost jobs
• Associated with the Gram-negative bacterium Candidatus Liberibacter asiaticus (CLas)
• Spread by insect vector, Diaphorina citri (Asian citrus psyllid, ACP)
Heck Lab September 2017, UC Riverside Extension
7. Vector, Host, Pathogen
• Metabolic pathway databases
• Expression Atlas with RNAseq, proteomics and metabolomics interactome networks
• Systems biology data portal for host, vector and pathogen interactions
• Genome assembly and annotation of protein coding and non-coding genes
• Dissecting the symbiosis with beneficial and pathogenic partners in invasive disease systems
• Identification of genetic and epigenetic factors influencing gene regulation in interaction networks
• Understanding evolutionary factors controlling transposon expansion and contraction in disease systems
• Identification of resistance and susceptibility in the population using genome wide association methods
• Analyze the role of plant, environmental and arthropod microbiomes in disease transmission and resistance
14. First endosymbiont genomes from Psyllid in FL
Stephanie Hoyt, Mueller lab
              Wolbachia      Profftella                   Carsonella
              10 scaffolds   1 chromosome and 1 plasmid   1 chromosome
Largest       923 Kb         471 Kb                       -
Smallest      19 Kb          4.7 Kb                       -
Total Size    2 Mb           475.7 Kb                     150 Kb
Orthology Analysis
                                                   Wolbachia   Profftella   Carsonella
Number of reference genomes                        8           2            9
Total number of conserved orthogroups              559         307          116
Number of conserved orthogroups in our assembly    557         307          106
Number of shared orthogroups (<50% genomes)        167         -            12
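As a rough way to read the orthology counts on this slide, the share of conserved orthogroups recovered in each endosymbiont assembly can be computed directly from the numbers shown. The Python sketch below does only that arithmetic. The dictionary layout and the function name `recovered_fraction` are illustrative assumptions, not part of the original analysis pipeline.

```python
# Counts copied from the Orthology Analysis table on this slide:
# (total conserved orthogroups, conserved orthogroups found in our assembly)
ORTHOGROUP_COUNTS = {
    "Wolbachia": (559, 557),
    "Profftella": (307, 307),
    "Carsonella": (116, 106),
}


def recovered_fraction(total: int, found: int) -> float:
    """Fraction of conserved orthogroups present in the assembly.

    Hypothetical helper for illustration only.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return found / total


if __name__ == "__main__":
    for name, (total, found) in ORTHOGROUP_COUNTS.items():
        fraction = recovered_fraction(total, found)
        print(f"{name}: {found}/{total} = {fraction:.1%}")
```

On the slide's numbers this works out to roughly 99.6% for Wolbachia, 100% for Profftella and 91.4% for Carsonella.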
15. Wolbachia Strains
Scaffolds were removed from the Wolbachia assembly, resulting in a large decrease in duplication but only a small decrease in conserved orthogroup coverage.
Based on these results we hypothesize that there are two strains of Wolbachia present in this sample:
• Strain 1: Scaffolds 1 and 2 cover 534/559 conserved orthogroups
• Strain 2: Scaffolds 1 and 3 cover 503/559 conserved orthogroups
Figure: Comparing genomic sequences of our Wolbachia strain 2 and reference genomes to our Wolbachia strain 1
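The reasoning on this slide (dropping scaffolds lowers duplication while conserved orthogroup coverage stays high) can be illustrated with a small sketch. The Python below is not the method used in the talk. It assumes each scaffold has already been reduced to the set of orthogroup IDs it carries, and it uses made-up toy IDs (`OG1` to `OG6`) and a hypothetical helper, `coverage_and_duplication`, purely to show how coverage and duplication could be tallied for different scaffold combinations.

```python
from itertools import combinations


def coverage_and_duplication(scaffolds, conserved):
    """Return (covered, duplicated) counts for a group of scaffolds.

    covered:    conserved orthogroups found on at least one scaffold
    duplicated: conserved orthogroups found on more than one scaffold
    `scaffolds` maps a scaffold name to the set of orthogroup IDs on it.
    """
    counts = {}
    for groups in scaffolds.values():
        for og in groups & conserved:
            counts[og] = counts.get(og, 0) + 1
    covered = len(counts)
    duplicated = sum(1 for n in counts.values() if n > 1)
    return covered, duplicated


# Toy data, invented for illustration (not from the actual assembly).
conserved = {"OG1", "OG2", "OG3", "OG4", "OG5", "OG6"}
scaffolds = {
    "scaffold_1": {"OG1", "OG2"},
    "scaffold_2": {"OG3", "OG4", "OG5", "OG6"},
    "scaffold_3": {"OG3", "OG4", "OG5"},
}

if __name__ == "__main__":
    print("all scaffolds:", coverage_and_duplication(scaffolds, conserved))
    for pair in combinations(sorted(scaffolds), 2):
        subset = {name: scaffolds[name] for name in pair}
        print(pair, coverage_and_duplication(subset, conserved))
```

In this toy case, scaffolds 2 and 3 carry largely the same orthogroups, so pairing scaffold 1 with either one removes the duplication while keeping most of the coverage. That is the same pattern the slide reports for the two hypothesized strains, where 534/559 is about 95.5% and 503/559 about 90.0% of conserved orthogroups.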
19. Build a collaboratory ecosystem
• Build an ecosystem of resources and integrated toolkit
• Identify curation targets according to project goals
• Collaboration between scientists and students
Train undergraduate annotators and formalize curation practices
• Recruiting annotators – Early career researchers
• Build teams according to expertise and annotation targets
• Establish the protocols for curation
Manual Curation Workflow
20. Diaphorina citri Apollo annotation editor
Request access by contacting https://citrusgreening.org/contact/form
21. Pathway based manual curation
• Development
• Segmentation
• Wnt and other signaling pathways
• Hox genes
• Immune response
• Metabolic and cellular functions
• Carbohydrate metabolism
• Chitin metabolism
• vATPase
• Chromatin remodeling
• Environmental/Sensory
• Circadian rhythm
• Phototransduction
• Reproduction
• ~1000 curated genes in OGSv3
• ~200 updated models from OGSv1 (Diaci v1.1)
22. Cumulative Annotation Outcomes
Group Annotation
2016-2017                         2017-2018                     2018-2019
14 total student annotators       17 total student annotators   17 total student annotators
18 total students                 20 total students             30 total students involved
>250 gene models                  >250 gene models              >300 gene models in pathways; >400 gene models in v3
>30 gene families                 >10 pathways                  >15 pathways
13 gene reports for publication   10 pathway reports            7 pathway reports
23. High-quality manually curated genes
Annotation set                 OGS1.0    OGS2.0    OGS3.0    Curated
No. of genes                   19,311    20,793    19,049    811
No. of transcripts             20,966    25,292    21,345    916
No. of exons per transcript    5.42      7.06      7.29      7.87
Avg. transcript length (bp)    1,317     1,944     2,034     2,503
Avg. exon length (bp)          243       275       279       318
Non-canonical splice sites     6.05%     3.13%     2.47%     1.91%
OGS: Official Gene Set
43. Power of comparative genomics
Species                      Common name                    Genome size   Lead
Cacopsylla pyricola          Pear psylla                    480-485Mb     Rodney Cooper
Leuronota fagarae            Lime psyllid                   465-483Mb     Jawwad Qureshi, Liliana Cano
Bactericera cockerelli       Potato psyllid                 421-426Mb     Daisy Fu
Pachypsylla venusta          Hackberry petiole gall psyllid TBD           Nancy Moran
Bactericera maculipennis     Bindweed psyllid               442-451Mb     Rodney Cooper
Circulifer tenellus          Beet leafhopper                ~1Gb          Bob Gilbertson, Bill Winter
Lygus lineolaris             Tarnished plant bug            TBD           OP Perera
Geocoris pallens             Western big-eyed bug           ~1Gb          Rosenheim lab
Macrosteles quadrilineatus   Aster leafhopper               TBD           Astri Wayadande
Graminella nigrifrons        Black-faced leafhopper         TBD           Astri Wayadande
Dalbulus maidis              Maize leafhopper               TBD           Astri Wayadande
AgriVectors.org
Ag100Pest
44. Portal for all Agricultural Diseases and Vector Systems
Citrusgreening
Zebra chip
Pierce’s disease
Pathogens: Bacteria, virus and fungi
AgriVectors Home Page
45. AgriVectors Knowledge Base
Data Producers: Researchers; Extension agents; Industry; INRA; USDA ARS / APHIS
Data Consumers: Researchers; INRA / USDA ARS / APHIS; IPM product development; Outreach and extension; Educators
Public Repository (remote): NCBI / EMBL / DDBJ; Ag Data Commons; i5k
Pathosystem Repository (local and remote): Your Pathosystem; Zebrachip; Citrusgreening; Pierce’s disease
Topic Repository (local and remote): CRISPR / RNAi genes; Bacterial effectors; Microbiome; Geospatial disease data
Secure portal: Patents; Commercial
AgriVectors Data Schema
46. AgriVectors Data types
Integrated pest management pathosystem-wide data
• Inclusive of vector, pathogen, host, environment and beyond
• Gene family based data sets (P450, RNAi pathway)
• E.g. virus, bacterial, or fungal infection assays
• Electrical Penetration Graph (EPG) feeding data
• Phenotyping data from disease trials
• Ecological and climatic data
• Behavioral assays
• Toxicology, insecticide resistance, etc.
Publications, notes, posters, videos and extension abstracts…
50. Questions??
@Citrusgreening
@SahaSurya