Functional annotation of invertebrate genomes
Surya Saha, Fiona McCarthy, Amanda Cooksey,
Anna K. Childers & Monica Poelchau
suryasaha@arizona.edu | @SahaSurya
August 31st, 2020
Acknowledgements
Fiona M McCarthy Monica Poelchau Chris Childers
Mueller Lab, Boyce Thompson Institute
Roadmap
1. Functional annotation tools for invertebrates
2. Example: Citrus greening
3. Asian citrus psyllid (Diaphorina citri)
• Genome assembly
• Microbiome and interaction with pathogen
4. Structural annotation of genes
5. Functional annotation
• Gene Ontology (GO)
• Pathways
6. Example: Functional modeling of Infected vs Uninfected D. citri
7. Upcoming resources and annotation plans
How do we move from sequence to biology?
• ARS-UA joint project to develop common workflows and practices for functionally annotating invertebrate genomes.
• Training events to support use of these workflows.
[Figure: annotation workflow diagram with numbered steps 1 to 5]
1. Functional annotation tools
1. Identify proteins
2. Transfer function based upon sequence homology
3. Assign function based upon functional motifs/domains
4. Combine GO, QC, formatting for use
5. Pathway information
Current Size of EXP-Only Databases
SwissProt 72,337
TrEMBL 50,258
Invertebrate 20,741
Arthropod 12,081
Insecta 11,886
Nematode 4,941
So What Does this Process Get Us?
Motif/domain information for comparative & evolutionary studies
• Evolution of gene families
• Targets for genome annotation
GO information for GO enrichment
• Support for functional genomics
• GO enrichment tools that allow you to import your own GO annotations
• Targets for genome annotation
Pathways information for functional enrichment
• Identification of arthropod specific pathways
• KOBAS tool for pathways enrichment
2. Example: Citrus Greening (Huanglongbing)
• Most significant disease of citrus worldwide. 100% infection in Florida now
• More than $5 billion in lost citrus production and more than 10,000 lost jobs
• Associated with the Gram-negative bacterium Candidatus Liberibacter asiaticus (CLas)
• Spread by insect vector, Diaphorina citri (Asian citrus psyllid, ACP)
Heck Lab September 2017, UC Riverside Extension
www.citrusgreening.org
The biological players
• Diaphorina citri, Asian citrus psyllid (ACP)
• ACP bacterial symbionts: Wolbachia, Profftella, Carsonella
• CLas
• Citrus spp.
Kruse et al. 2019, Insects
3. Asian citrus psyllid genome (Diaphorina citri)
500 ng input DNA from single male psyllid
Duplicated contigs added to alternate assembly
Error correction
• DNA sequencing data
• RNA sequencing data
Duplication removal with Redundans
Scaffolding with Hi-C

                     v1.1      v2.0 REFERENCE   v3.0 REFERENCE
Number of contigs    161,988   1,906            13 + unplaced
Total bases          485 Mb    498 Mb           474 Mb
Longest              1 Mb      4.2 Mb           50.3 Mb
Contig N50           34.4 Kb   749 Kb           40.5 Mb
Ns                   19.3 Mb   4.5 Mb           13.4 Mb
Complete BUSCO (%)   65.9      75.9             88.3
Repeat (%)           26.37     31.9             30.2
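The Contig N50 row above is the usual contiguity summary: the length L such that sequences of length L or longer hold at least half of the assembled bases. A minimal sketch of that calculation follows. The function name and the toy lengths are illustrative assumptions, not values or code from the D. citri assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in bp), not taken from the slides.
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 50 + 40 = 90 >= 75, so this prints 40
```

Because N50 depends only on the sorted lengths, removing many short duplicated contigs can raise it sharply without changing the longest sequence.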
CLas induces mitochondrial dysfunction in the gut
Kruse et al. 2017, Mann et al. 2018
[Figure: MitoSOX staining, CLas+ and CLas- panels]
CLas and Wolbachia localize in the same ACP gut cells
[Figure: micrograph panels at 60X magnification: DAPI nuclear stain, CLas (Pathogen), Wolbachia (Endosymbiont), Merged]
Kruse et al. 2017, PLoS One
First endosymbiont genomes from Psyllid in FL
             Wolbachia      Profftella                   Carsonella
Assembly     10 scaffolds   1 chromosome and 1 plasmid   1 chromosome
Largest      923 Kb         471 Kb                       -
Smallest     19 Kb          4.7 Kb                       -
Total Size   2 Mb           475.7 Kb                     150 Kb
Stephanie Hoyt
Mueller lab
Orthology Analysis
                                                  Wolbachia   Profftella   Carsonella
Number of reference genomes                       8           2            9
Total number of conserved orthogroups             559         307          116
Number of conserved orthogroups in our assembly   557         307          106
Number of shared orthogroups (<50% genomes)       167         -            12
Wolbachia Strains
Scaffolds were removed from the Wolbachia assembly, resulting in a large decrease in duplication but a small decrease in conserved orthogroup coverage.
Based on these results we hypothesize that there are two strains of Wolbachia present in this sample:
• Strain 1: Scaffolds 1 and 2 cover 534/559 conserved orthogroups
• Strain 2: Scaffolds 1 and 3 cover 503/559 conserved orthogroups
[Figure: Comparing genomic sequences of our Wolbachia strain 2 and reference genomes to our Wolbachia strain 1]
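The strain hypothesis rests on simple coverage fractions of the 559 conserved Wolbachia orthogroups. The sketch below just redoes that arithmetic with the counts quoted on the slides. The function name and the labels are illustrative assumptions, and treating 557 as the count for the full assembly follows the orthology table rather than anything stated on this slide.

```python
def coverage(found, conserved_total):
    """Fraction of conserved orthogroups recovered by a set of scaffolds."""
    if conserved_total <= 0:
        raise ValueError("conserved_total must be positive")
    return found / conserved_total


CONSERVED = 559  # conserved Wolbachia orthogroups, from the orthology table

# Counts quoted on the slides; the labels are only descriptive.
candidates = {
    "assembly (orthology table)": 557,
    "strain 1 (scaffolds 1 and 2)": 534,
    "strain 2 (scaffolds 1 and 3)": 503,
}

for label, found in candidates.items():
    print(f"{label}: {found}/{CONSERVED} = {coverage(found, CONSERVED):.1%}")
```

Under these counts, strain 1 covers about 95.5% and strain 2 about 90.0% of the conserved orthogroups.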
High-quality annotation and databases are required to identify targets for interdiction
[Figure: Genome Annotation, Pathway Databases, Expression Networks, … for Host, Vector, and Pathogen; Target for interdiction molecules]
4. Gene Prediction Workflow
Repeat Masking
• RepeatModeler
• Protein masking
• RepeatMasker
Transcriptome
• RNA-seq: HISAT & StringTie
• Iso-Seq: GMAP & Cupcake ToFU
Mikado
• Portcullis junctions
• StringTie
• Iso-Seq
Maker
• Mikado Gene Loci
• Portcullis junctions
Functional annotation
• AHRD
• InterProScan
Augustus
GeneMark
Prashant Hosmani
Mueller Lab
Student-driven community annotation
High-quality Manually Curated Genes
Annotation set                OGS1.0   OGS2.0   OGS3.0   Curated
No. of genes                  19,311   20,793   19,049   811
No. of transcripts            20,966   25,292   21,345   916
No. of exons per transcript   5.42     7.06     7.29     7.87
Avg. transcript length (bp)   1,317    1,944    2,034    2,503
Avg. exon length (bp)         243      275      279      318
Non-canonical splice sites    6.05%    3.13%    2.47%    1.91%
OGS: Official Gene Set
Pathway based manual curation
• Development
• Segmentation
• Wnt and other signaling pathways
• Hox genes
• Detoxification
• Immune response
• Metabolic and cellular functions
• Carbohydrate metabolism
• Chitin metabolism
• vATPase
• Chromatin remodeling
• Environmental/Sensory
• Circadian rhythm
• Phototransduction
• Reproduction
• ~1000 curated genes in OGSv3
• ~200 updated models from OGSv1 (Diaci v1.1)
https://www.biorxiv.org/content/10.1101/869685v1
17 students among 30 authors
5. Functional annotation: InterProScan results
10,946 (57%) genes have 2,281 unique GO terms
5,311 (27%) genes are assigned to 1,159 unique pathways
Runtime
• 1-3 days with the CyVerse Discovery Environment app
• 4 hours on a single 64-core node with the Docker container
InterProScan Motifs and Domains
16,081 (84%) proteins have at least one motif or domain assigned
8,752 unique InterPro domains
Average 3 domains per annotated protein
[Figure: Motifs & Domains Identified by InterProScan, bar chart (axis 0 to 1000) of the following motifs and domains]
• GPCR family 3, GABA-B receptor
• GPCR, family 2-like
• WD40 repeat
• MFS transporter
• WD40/YVTN repeat-like
• ARM-type_fold
• Ig-like_fold
• Kinase-like
• Znf C2H2
• P-loop NTPase
InterProScan GO Results: Biological Process
Poorly represented gene families
[Figure: Summary of GO Biological Process, bar chart (axis 0 to 3000) of the following categories]
• nuclease activity
• isomerase activity
• lyase activity
• structural molecule activity
• phosphatase activity
• enzyme regulator activity
• ligase activity
• GTPase activity
• transferase activity, transferring acyl groups
• methyltransferase activity
• transferase activity, transferring glycosyl…
• enzyme binding
• cytoskeletal protein binding
• ATPase activity
• structural constituent of ribosome
• DNA-binding transcription factor activity
• RNA binding
• peptidase activity
• kinase activity
• DNA binding
• transmembrane transporter activity
• oxidoreductase activity
• ion binding
WARNING
Dcitr05g1219011:
Slim id: GO:0044403 symbiont process
GO:0019079 viral genome replication
InterProScan GO Results: Cellular Component
Poorly represented gene families
[Figure: Summary of GO Cellular Component, bar chart (axis 0 to 1600) of the following categories]
• extracellular region
• endoplasmic reticulum
• nucleoplasm
• chromosome
• cytoskeleton
• plasma membrane
• mitochondrion
• ribosome
• cytoplasm
• nucleus
• organelle
• protein-containing complex
• intracellular
• cell
WARNING Slim id: GO:0005618 cell wall 3
Dcitr00g0323011 GO:0009277 fungal-type cell wall
Dcitr00g0493011 GO:0009277 fungal-type cell wall
Dcitr00g1172011 GO:0009277 fungal-type cell wall
How do we measure GO Quality?
BREADTH: all gene products should have GO annotation (for CC, MF, BP).
DEPTH: function should be as detailed as possible.
EVIDENCE: Published experiments provide direct evidence of function in that species.
Buza et al. 2008. Gene Ontology annotation quality analysis in model eukaryotes. Nucleic Acids Research, 36(2), e12.
Adding Details to InterProScan GO: GOanna
[Figure: GO Annotation Quality by Annotation Type (InterPro, GOanna, Combined). Series: No. GO annotations, proteins annotated, Av Quality Score. Axes: 0 to 40,000 and 0 to 180]
InterPro & GOanna are complementary approaches.
• InterProScan provides "breadth" (some GO annotation for most proteins)
• GOanna provides "depth" (more detailed GO terms for some proteins)
What does GOanna add to GO annotation?
Poorly represented functions in InterProScan derived GO.
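Combining the InterProScan "breadth" set with the GOanna "depth" set amounts to a per-gene union of GO terms. The sketch below shows one way to do that in plain Python. The function name, the gene IDs, and the GO assignments are hypothetical, and this is not the AgBase implementation, which also handles QC and formatting.

```python
from collections import defaultdict


def merge_go(*sources):
    """Union gene -> GO term sets from several annotation sources."""
    merged = defaultdict(set)
    for source in sources:
        for gene, terms in source.items():
            merged[gene].update(terms)
    return dict(merged)


# Hypothetical annotations; gene IDs and term assignments are made up.
interpro = {
    "geneA": {"GO:0005524"},                  # broad term only
    "geneB": {"GO:0003677"},
}
goanna = {
    "geneA": {"GO:0005524", "GO:0004674"},    # adds a more specific term
    "geneC": {"GO:0016301"},                  # gene missed by the first source
}

combined = merge_go(interpro, goanna)
n_genes = len(combined)
n_annotations = sum(len(terms) for terms in combined.values())
print(n_genes, n_annotations)  # 3 genes, 4 distinct gene-term pairs
```

Using sets means a term reported by both tools is counted once, so the combined totals reflect distinct annotations rather than the sum of the two inputs.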
5. Functional annotation: Pathways
InterProScan results
• 5,311 (27%) genes are assigned to 1,159 unique pathways
• Average of 18.7 genes per pathway
• All are Reactome human pathways (R-HSA)
KOBAS Annotate results
• Assigns pathways via hits to Drosophila proteins
• 13,582 (71%) genes assigned pathways from the following databases
• 24,101 Reactome
• 3,207 KEGG PATHWAY
• 1,003 PANTHER
• 7 BioCyc
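The "average genes per pathway" figure for InterProScan comes from inverting a gene-to-pathway mapping, and an average of 18.7 across 1,159 pathways with only 5,311 annotated genes implies that many genes sit in several pathways (roughly four assignments per gene). A minimal sketch of the inversion, with hypothetical gene and pathway IDs and illustrative function names:

```python
from collections import defaultdict


def genes_per_pathway(gene_to_pathways):
    """Invert gene -> pathways and return pathway -> number of genes."""
    pathway_to_genes = defaultdict(set)
    for gene, pathways in gene_to_pathways.items():
        for pathway in pathways:
            pathway_to_genes[pathway].add(gene)
    return {pathway: len(genes) for pathway, genes in pathway_to_genes.items()}


def mean_pathway_size(gene_to_pathways):
    """Average number of genes per pathway."""
    sizes = genes_per_pathway(gene_to_pathways)
    if not sizes:
        raise ValueError("no pathway assignments found")
    return sum(sizes.values()) / len(sizes)


# Hypothetical assignments; IDs are placeholders, not real pathway IDs.
toy = {
    "gene1": {"P1", "P2"},
    "gene2": {"P1"},
    "gene3": {"P1", "P3"},
}
print(genes_per_pathway(toy))
print(round(mean_pathway_size(toy), 2))
```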
6. Example: D. citri Infected / Uninfected
RNAseq samples from various tissues and citrus hosts for the Asian citrus psyllid
• Tissues: Gut, Abdomen, Antennae, Whole body, Terminal abdomen, Leg, Thorax, Head, Midgut
• Sexes: Male, Female
• Stages: Egg, Nymph, Adult
• Infection states: CLas-, CLas+, CLas+ Low infection, CLas+ High infection
• Host: C. sinensis, C. medica, C. reticulata, C. macrophylla
Comparison of Infected and Uninfected Samples
Infected samples: 22
Uninfected samples: 35
79% of genes have > 1 read per million in at least 22 libraries
A lot of variability across samples!!
[Figure: Infected and Uninfected samples]
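The expression filter on this slide (more than 1 read per million in at least 22 libraries, which matches the number of infected samples) can be sketched as below. The function names and the toy counts are assumptions for illustration, and the slides do not say which software applied the filter.

```python
def cpm(counts, library_sizes):
    """Counts per million for one gene across libraries.

    Assumes every library size is a positive read total.
    """
    return [c * 1_000_000 / n for c, n in zip(counts, library_sizes)]


def keep_gene(counts, library_sizes, min_cpm=1.0, min_libs=22):
    """True if the gene exceeds min_cpm in at least min_libs libraries."""
    passing = sum(value > min_cpm for value in cpm(counts, library_sizes))
    return passing >= min_libs


# Hypothetical data: three libraries instead of the 57 in the slides.
library_sizes = [2_000_000, 1_000_000, 4_000_000]
toy_counts = {
    "geneA": [5, 0, 3],    # CPM 2.5, 0.0, 0.75: passes in one library
    "geneB": [10, 3, 2],   # CPM 5.0, 3.0, 0.5: passes in two libraries
}

kept = [g for g, c in toy_counts.items()
        if keep_gene(c, library_sizes, min_libs=2)]
print(kept)
```

Setting min_libs to the size of the smaller group keeps genes that are expressed in only one condition while still discarding genes with sporadic counts.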
Differential Expression Results
Of 16,879 genes with nonzero total read count, at adjusted p-value < 0.05:
• LFC > 0 (up): 3,162 (19%)
• LFC < 0 (down): 3,627 (21%)
[Figure: dispersion plot. Gene-wise estimates (black) and fitted values (red). Blue circles are genes with high dispersion that are outliers.]
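The up and down tallies are counts of genes below the adjusted p-value cutoff, split by the sign of the log fold change (LFC) and reported as a share of all tested genes. A minimal sketch follows. The function name and toy rows are hypothetical, and genes without an adjusted p-value are assumed to count toward the total but not toward either direction.

```python
def summarize(results, alpha=0.05):
    """Count tested, up and down genes.

    results: iterable of (log2_fold_change, adjusted_p) pairs, where
    adjusted_p may be None for genes that received no adjusted p-value.
    """
    total = up = down = 0
    for lfc, padj in results:
        total += 1
        if padj is None or padj >= alpha:
            continue
        if lfc > 0:
            up += 1
        elif lfc < 0:
            down += 1
    return total, up, down


# Hypothetical rows, not taken from the D. citri results.
toy = [(1.2, 0.01), (-0.5, 0.20), (-2.0, 0.001), (0.3, None)]
print(summarize(toy))

# Percentages from the slide, recomputed from its own counts.
tested, n_up, n_down = 16_879, 3_162, 3_627
print(f"up {n_up / tested:.1%}, down {n_down / tested:.1%}")
```

With the slide's counts this gives about 18.7% up and 21.5% down, which round to the 19% and 21% shown.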
topGO Enriched GO Biological Processes
All GO terms with p-val 0.05
Deeper shades of red indicate smaller p-values
Larger circles represent higher proportion of proteins
                        Genes    GO BP mappable genes   GO terms   GO terms p < 0.01
InterProScan            10,946   7,130                  1,384      61
InterProScan + GOanna   11,490   7,673                  2,022      58
topGO Enriched GO Molecular Functions
All GO terms with p-val 0.05
Deeper shades of red indicate smaller p-values
Larger circles represent higher proportion of proteins
                        Genes    GO MF mappable genes   GO terms   GO terms p < 0.01
InterProScan            10,946   3,280                  270        6
InterProScan + GOanna   11,490   9,365                  713        16
DEGs associated with the cytoskeleton were upregulated in the CLas-infected midguts
topGO Enriched GO Cellular Component
All GO terms with p-val 0.05
Deeper shades of red indicate smaller p-values
Larger circles represent higher proportion of proteins
                        Genes    GO CC mappable genes   GO terms   GO terms p < 0.01
InterProScan            10,946   536                    111        0
InterProScan + GOanna   11,490   4,498                  447        4
Enriched Pathways (KOBAS Identify)
“Localized mitochondrial dysfunction in the gut when insects are exposed to CLas-infected trees”
Nuclear swelling and fragmentation of the heterochromatin
[Figure legend: Green: universal set; Red: annotated genes]
“D. citri might inhibit the expression of endocytosis-related genes in the midgut to prevent the further transmission of CLas”
Enriched Pathways: Up & Down Regulated Genes
Panels: Pathways enriched from Up-regulated genes; Pathways enriched from Down-regulated genes

Pathway                          Input number   Background number   P-Value
Gene Expression                  281            1303                3.78E-07
Endocytosis                      53             221                 0.004147464
Cell Cycle                       84             378                 0.002532175
Nonsense-Mediated Decay (NMD)    52             195                 0.000666531
siRNA biogenesis                 9              18                  0.007008944
One carbon pool by folate        11             25                  0.00629025

Pathway                                                                          Input number   Background number   P-Value
Fatty Acyl-CoA Biosynthesis                                                      31             92                  0.00357436
ABC-family proteins mediated transport                                           24             70                  0.008101975
COPI-mediated anterograde transport                                              33             106                 0.007223449
Cellular response to hypoxia                                                     12             25                  0.008017566
Formation of ATP by chemiosmotic coupling                                        12             23                  0.004869061
Regulation of cytoskeletal remodeling and cell spreading by IPP complex components   6              6                   0.005526965
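Each row of these tables compares an input number (genes from the submitted list that fall in the pathway) with a background number (genes in the pathway overall). Over-representation of this kind is often scored with a hypergeometric tail probability, sketched below. The slides do not state which test or multiple-testing correction was used, and the sizes of the input list and the background set are not given, so the numbers in the example are hypothetical and will not reproduce the P-values in the tables.

```python
from math import comb


def hypergeom_sf(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K are in the pathway.

    k: pathway genes observed in the input list
    K: pathway genes in the background
    n: size of the input list
    N: size of the background
    """
    if not (0 <= K <= N and 0 <= n <= N and k >= 0):
        raise ValueError("require 0 <= K <= N, 0 <= n <= N and k >= 0")
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical sizes: 20-gene input list, 100-gene background,
# pathway with 10 background genes, 5 of them in the input list.
p_value = hypergeom_sf(k=5, K=10, n=20, N=100)
print(f"{p_value:.4g}")
```

A small value means the pathway holds more input genes than random draws of the same size would usually give. With many pathways tested, the raw values would normally be adjusted for multiple testing before interpretation.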
6. Summary of Functional Modeling
Tools for YOU!!
• Functional modeling tools to link genomics back to biological context
• Can now provide GO and pathway information for functional genomics
• InterPro motif analysis may help guide manual annotations & supports comparative analyses
• Tools available via AgBase & Docker
Analysis of data sets
• Citrus greening vector (D. citri) now has GO & pathways information available
• GO and pathways analyses are complementary (shared insights)
• During infection, vector transcription and translation responses are tissue-specific
• Lipid synthesis is down regulated and protein transport is disrupted
• Strong links to mitochondrial dysfunction
Accessing functional annotation resources
agbase-docs.readthedocs.io
de.cyverse.org
hub.docker.com/u/agbase
7. Future Plans & Acknowledgements
• Continued testing and deployment of the workflows
• When to use InterProScan and when to add GOanna GO?
• Prioritizing genes for manual curation
• Identification of missing or erroneous gene families
• Optimizing pathways information
• What format will make this most useful?
• How can we improve pathway reconstruction?
• Training sessions
• Feedback on tools and documentation
• Making functional data from this project available
• i5k, NAL, AgBase and Citrusgreening.org
• Docker and Singularity based pipeline
This work was supported by funding from the USDA Agricultural Research Service
Thank you!!

More Related Content

What's hot

Genomics and bioinformatics
Genomics and bioinformatics Genomics and bioinformatics
Genomics and bioinformatics
Senthil Natesan
 

What's hot (20)

genomic comparison
genomic comparison genomic comparison
genomic comparison
 
15 molecular markers techniques
15 molecular markers techniques15 molecular markers techniques
15 molecular markers techniques
 
artificial neural network-gene prediction
artificial neural network-gene predictionartificial neural network-gene prediction
artificial neural network-gene prediction
 
Comparative genomics in eukaryotes, organelles
Comparative genomics in eukaryotes, organellesComparative genomics in eukaryotes, organelles
Comparative genomics in eukaryotes, organelles
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Genome analysis
Genome analysisGenome analysis
Genome analysis
 
Histone modification in living cells
Histone modification in living cellsHistone modification in living cells
Histone modification in living cells
 
Application of genetic mapping
Application of genetic mappingApplication of genetic mapping
Application of genetic mapping
 
YEAST TWO HYBRID SYSTEM
 YEAST TWO HYBRID SYSTEM YEAST TWO HYBRID SYSTEM
YEAST TWO HYBRID SYSTEM
 
BITS - Introduction to comparative genomics
BITS - Introduction to comparative genomicsBITS - Introduction to comparative genomics
BITS - Introduction to comparative genomics
 
Gene expression profiling i
Gene expression profiling  iGene expression profiling  i
Gene expression profiling i
 
Systems biology & Approaches of genomics and proteomics
 Systems biology & Approaches of genomics and proteomics Systems biology & Approaches of genomics and proteomics
Systems biology & Approaches of genomics and proteomics
 
Types of genomics ppt
Types of genomics pptTypes of genomics ppt
Types of genomics ppt
 
Genomics and bioinformatics
Genomics and bioinformatics Genomics and bioinformatics
Genomics and bioinformatics
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Cell cell hybridization or somatic cell hybridization
Cell cell hybridization or somatic cell hybridizationCell cell hybridization or somatic cell hybridization
Cell cell hybridization or somatic cell hybridization
 
Single nucleotide polymorphism, (SNP)
Single nucleotide polymorphism, (SNP)Single nucleotide polymorphism, (SNP)
Single nucleotide polymorphism, (SNP)
 
Human Genome Project
Human Genome ProjectHuman Genome Project
Human Genome Project
 
Genome concept, types, and function
Genome  concept, types, and functionGenome  concept, types, and function
Genome concept, types, and function
 
Ribozyme technology
Ribozyme technology Ribozyme technology
Ribozyme technology
 

Similar to Functional annotation of invertebrate genomes

Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Surya Saha
 
Visualization of insect vector-plant pathogen interactions in the citrus gree...
Visualization of insect vector-plant pathogen interactions in the citrus gree...Visualization of insect vector-plant pathogen interactions in the citrus gree...
Visualization of insect vector-plant pathogen interactions in the citrus gree...
Surya Saha
 
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of GenomesApollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Monica Munoz-Torres
 
High throughput approaches to understanding gene function and mapping archite...
High throughput approaches to understanding gene function and mapping archite...High throughput approaches to understanding gene function and mapping archite...
High throughput approaches to understanding gene function and mapping archite...
Tintumann
 

Similar to Functional annotation of invertebrate genomes (20)

Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
 
Introduction to 16S Microbiome Analysis
Introduction to 16S Microbiome AnalysisIntroduction to 16S Microbiome Analysis
Introduction to 16S Microbiome Analysis
 
Updates on the ACP v3 genome and annotation from USDA NIFA project meeting
Updates on the ACP v3 genome and annotation from USDA NIFA project meetingUpdates on the ACP v3 genome and annotation from USDA NIFA project meeting
Updates on the ACP v3 genome and annotation from USDA NIFA project meeting
 
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome SequencingMicrobial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
Microbial Phylogenomics (EVE161) Class 10-11: Genome Sequencing
 
2016. Motoaki seki. RIKEN cassava initiative
2016. Motoaki seki. RIKEN cassava initiative2016. Motoaki seki. RIKEN cassava initiative
2016. Motoaki seki. RIKEN cassava initiative
 
Apollo Workshop AGS2017 Introduction
Apollo Workshop AGS2017 IntroductionApollo Workshop AGS2017 Introduction
Apollo Workshop AGS2017 Introduction
 
Modern techniques of crop improvement.pptx final
Modern techniques of crop improvement.pptx finalModern techniques of crop improvement.pptx final
Modern techniques of crop improvement.pptx final
 
Tyler future of genomics thurs 0920
Tyler future of genomics thurs 0920Tyler future of genomics thurs 0920
Tyler future of genomics thurs 0920
 
Prashant esa2017
Prashant esa2017Prashant esa2017
Prashant esa2017
 
Visualization of insect vector-plant pathogen interactions in the citrus gree...
Visualization of insect vector-plant pathogen interactions in the citrus gree...Visualization of insect vector-plant pathogen interactions in the citrus gree...
Visualization of insect vector-plant pathogen interactions in the citrus gree...
 
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of GenomesApollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
 
Omics in crop improvement
Omics in crop improvementOmics in crop improvement
Omics in crop improvement
 
CitrusCyc: Metabolic Pathway Databases for the C. clementina and C. sinensis...
 CitrusCyc: Metabolic Pathway Databases for the C. clementina and C. sinensis... CitrusCyc: Metabolic Pathway Databases for the C. clementina and C. sinensis...
CitrusCyc: Metabolic Pathway Databases for the C. clementina and C. sinensis...
 
Crowdsourcing Biology: The Gene Wiki, BioGPS, and Citizen Science
Crowdsourcing Biology: The Gene Wiki, BioGPS, and Citizen ScienceCrowdsourcing Biology: The Gene Wiki, BioGPS, and Citizen Science
Crowdsourcing Biology: The Gene Wiki, BioGPS, and Citizen Science
 
High throughput approaches to understanding gene function and mapping archite...
High throughput approaches to understanding gene function and mapping archite...High throughput approaches to understanding gene function and mapping archite...
High throughput approaches to understanding gene function and mapping archite...
 
31961.ppt
31961.ppt31961.ppt
31961.ppt
 
rheumatoid arthritis
rheumatoid arthritisrheumatoid arthritis
rheumatoid arthritis
 
Community resources for all y’all Omics
Community resources for all y’all OmicsCommunity resources for all y’all Omics
Community resources for all y’all Omics
 
Web Apollo: Lessons learned from community-based biocuration efforts.
Web Apollo: Lessons learned from community-based biocuration efforts.Web Apollo: Lessons learned from community-based biocuration efforts.
Web Apollo: Lessons learned from community-based biocuration efforts.
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
 

More from Surya Saha

An open access resource portal for arthropod vectors and agricultural pathosy...
An open access resource portal for arthropod vectors and agricultural pathosy...An open access resource portal for arthropod vectors and agricultural pathosy...
An open access resource portal for arthropod vectors and agricultural pathosy...
Surya Saha
 
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant DiseasesAgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
Surya Saha
 
Deciphering the genome of Diaphorina citri to develop solutions for the citru...
Deciphering the genome of Diaphorina citri to develop solutions for the citru...Deciphering the genome of Diaphorina citri to develop solutions for the citru...
Deciphering the genome of Diaphorina citri to develop solutions for the citru...
Surya Saha
 

More from Surya Saha (20)

An open access resource portal for arthropod vectors and agricultural pathosy...
An open access resource portal for arthropod vectors and agricultural pathosy...An open access resource portal for arthropod vectors and agricultural pathosy...
An open access resource portal for arthropod vectors and agricultural pathosy...
 
Updates on Citrusgreening.org database from USDA NIFA project meeting
Updates on Citrusgreening.org database from USDA NIFA project meetingUpdates on Citrusgreening.org database from USDA NIFA project meeting
Updates on Citrusgreening.org database from USDA NIFA project meeting
 
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant DiseasesAgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
 
Deciphering the genome of Diaphorina citri to develop solutions for the citru...
Deciphering the genome of Diaphorina citri to develop solutions for the citru...Deciphering the genome of Diaphorina citri to develop solutions for the citru...
Deciphering the genome of Diaphorina citri to develop solutions for the citru...
 
Quality Control of Sequencing Data
Quality Control of Sequencing Data Quality Control of Sequencing Data
Quality Control of Sequencing Data
 
Sequencing 2017
Sequencing 2017Sequencing 2017
Sequencing 2017
 
Using Long Reads, Optical Maps and Long-Range Scaffolding to improve the Diap...
Using Long Reads, Optical Maps and Long-Range Scaffolding to improve the Diap...Using Long Reads, Optical Maps and Long-Range Scaffolding to improve the Diap...
Using Long Reads, Optical Maps and Long-Range Scaffolding to improve the Diap...
 
Sequencing 2016
Sequencing 2016Sequencing 2016
Sequencing 2016
 
Tomato Genome Build SL3.0
Tomato Genome Build SL3.0Tomato Genome Build SL3.0
Tomato Genome Build SL3.0
 
Sequencing and Bioinformatics PGRP Summer 2015
Sequencing and Bioinformatics PGRP Summer 2015Sequencing and Bioinformatics PGRP Summer 2015
Sequencing and Bioinformatics PGRP Summer 2015
 
Quality Control of Sequencing Data
Quality Control of Sequencing DataQuality Control of Sequencing Data
Quality Control of Sequencing Data
 
Sequencing: The Next Generation 2015
Sequencing: The Next Generation 2015Sequencing: The Next Generation 2015
Sequencing: The Next Generation 2015
 
Tomato Genome SL2.50 and Beyond…
Tomato Genome SL2.50 and Beyond…Tomato Genome SL2.50 and Beyond…
Tomato Genome SL2.50 and Beyond…
 
Sequencing
SequencingSequencing
Sequencing
 
Quality Control of NGS Data
Quality Control of NGS Data Quality Control of NGS Data
Quality Control of NGS Data
 
Quality Control of NGS Data Solutions
Quality Control of NGS Data  SolutionsQuality Control of NGS Data  Solutions
Quality Control of NGS Data Solutions
 
Sequencing, Genome Assembly and the SGN Platform
Sequencing, Genome Assembly and the SGN PlatformSequencing, Genome Assembly and the SGN Platform
Sequencing, Genome Assembly and the SGN Platform
 
ICAR Soybean Indore 2014
ICAR Soybean Indore 2014ICAR Soybean Indore 2014
ICAR Soybean Indore 2014
 
Sequencing: The Next Generation
Sequencing: The Next GenerationSequencing: The Next Generation
Sequencing: The Next Generation
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
 

Recently uploaded

Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
gindu3009
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
ssuser79fe74
 

Recently uploaded (20)

Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf

Functional annotation of invertebrate genomes

  • 1. Functional annotation of invertebrate genomes Surya Saha, Fiona McCarthy, Amanda Cooksey, Anna K. Childers & Monica Poelchau suryasaha@arizona.edu | @SahaSurya August 31st, 2020
  • 2. Acknowledgements Fiona M McCarthy Monica Poelchau Chris Childers Mueller Lab, Boyce Thompson Institute
  • 3. Roadmap 1. Functional annotation tools for invertebrates 2. Example: Citrus greening 3. Asian citrus psyllid (Diaphorina citri) • Genome assembly • Microbiome and interaction with pathogen 4. Structural annotation of genes 5. Functional annotation • Gene Ontology (GO) • Pathways 6. Example: Functional modeling of Infected vs Uninfected D. citri 7. Upcoming resources and annotation plans
  • 4. How do we move from sequence to biology? • ARS-UA joint project to develop common workflows and practices for functionally annotating invertebrate genomes. • Training events to support use of these workflows. Annotation
  • 5. 1. Functional annotation tools
      1. Identify proteins
      2. Transfer function based upon sequence homology
      3. Assign function based upon functional motifs/domains
      4. Combine GO, QC, formatting for use
      5. Pathway information
  • 6. 1. Identify proteins 2. Transfer function based upon sequence homology 3. Assign function based upon functional motifs/domains 4. Combine GO, QC, formatting for use 5. Pathway information
      Current Size of EXP Only Dbs
      SwissProt       72,337
      TrEMBL          50,258
      Invertebrate    20,741
      Arthropod       12,081
      Insecta         11,886
      Nematode         4,941
  • 7. So What Does this Process Get Us? Motif/domain information for comparative & evolutionary studies • Evolution of gene families • Targets for genome annotation GO information for GO enrichment • Support for functional genomics • GO enrichment tools that allow you import your own GO annotations • Targets for genome annotation Pathways information for functional enrichment • Identification of arthropod specific pathways • KOBAS tool for pathways enrichment
  • 8. 2. Example: Citrus Greening (Huanglongbing) • Most significant disease of citrus worldwide. 100% infection in Florida now • More than $5 billion in lost citrus production and more than 10,000 lost jobs • Associated with the Gram-negative bacterium Candidatus Liberibacter asiaticus (CLas) • Spread by the insect vector Diaphorina citri (Asian citrus psyllid, ACP). Heck Lab, September 2017, UC Riverside Extension. www.citrusgreening.org
  • 9. The biological players: Diaphorina citri, Asian citrus psyllid (ACP); ACP bacterial symbionts (Wolbachia, Profftella, Carsonella); CLas; Citrus spp. Kruse et al. 2019, Insects. www.citrusgreening.org
  • 10. 3. Asian citrus psyllid genome (Diaphorina citri)
      Assembly steps: 500 ng input DNA from single male psyllid; duplicated contigs added to alternate assembly; error correction (DNA sequencing data, RNA sequencing data); duplication removal with Redundans; scaffolding with Hi-C.
                             v1.1       v2.0 REFERENCE   v3.0 REFERENCE
      Number of contigs      161,988    1,906            13 + unplaced
      Total bases            485 Mb     498 Mb           474 Mb
      Longest                1 Mb       4.2 Mb           50.3 Mb
      Contig N50             34.4 Kb    749 Kb           40.5 Mb
      Ns                     19.3 Mb    4.5 Mb           13.4 Mb
      Complete BUSCO (%)     65.9       75.9             88.3
      Repeat (%)             26.37      31.9             30.2
      www.citrusgreening.org
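The "Contig N50" row in the table above can be illustrated with a minimal Python sketch. It assumes the usual definition of N50 (the length L such that contigs of length L or longer cover at least half of the total assembly); the contig lengths are made-up toy values, not data from the D. citri assemblies.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is taken here as the length of the contig at which the running
    total, counted from the longest contig down, first reaches half of
    the total assembly size.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (total 300): 80 + 70 = 150 reaches half.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70
```

Because a few long contigs dominate the sum, N50 rises sharply when scaffolding merges contigs, which is why it is reported alongside the contig count.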
  • 11. CLas induces mitochondrial dysfunction in the gut Kruse et al. 2017, Mann et al. 2018 MitoSOX staining CLas + CLas - www.citrusgreening.org
  • 12. CLas and Wolbachia localize in the same ACP gut cells. Image panels: DAPI nuclear stain; CLas (Pathogen); Wolbachia (Endosymbiont); Merged. 60X magnification. Kruse et al., PLoS One 2017. www.citrusgreening.org
  • 13. First endosymbiont genomes from Psyllid in FL
                    Wolbachia       Profftella                     Carsonella
      Assembly      10 scaffolds    1 chromosome and 1 plasmid     1 chromosome
      Largest       923 Kb          471 Kb                         -
      Smallest      19 Kb           4.7 Kb                         -
      Total Size    2 Mb            475.7 Kb                       150 Kb
      Stephanie Hoyt, Mueller lab
      Orthology Analysis
                                                           Wolbachia   Profftella   Carsonella
      Number of reference genomes                          8           2            9
      Total number of conserved orthogroups                559         307          116
      Number of conserved orthogroups in our assembly      557         307          106
      Number of shared orthogroups (<50% genomes)          167         -            12
      www.citrusgreening.org
  • 14. Wolbachia Strains Scaffolds were removed from the Wolbachia assembly resulting in a large decrease in duplication, but a small decrease in conserved orthogroup coverage Based on these results we hypothesize that there are two strains of Wolbachia present in this sample: • Strain 1: Scaffolds 1 and 2 cover 534/559 conserved orthogroups • Strain 2: Scaffolds 1 and 3 cover 503/559 conserved orthogroups Comparing genomic sequences of our Wolbachia strain 2 and reference genomes to our Wolbachia strain 1 www.citrusgreening.org
  • 15. High quality annotation and databases are required to identify targets for interdiction. Diagram labels: Genome Annotation; Pathway Databases; Expression Networks; …….; Target for interdiction molecules; Host; Vector; Pathogen. www.citrusgreening.org
  • 16. 4. Gene Prediction Workflow
      • Repeat Masking: RepeatModeler; protein masking; RepeatMasker
      • Transcriptome: RNA-seq (HISAT & StringTie); Iso-Seq (GMAP & Cupcake ToFU)
      • Mikado: Portcullis junctions; StringTie; Iso-Seq
      • Maker: Mikado gene loci; Portcullis junctions
      • Functional annotation: AHRD; InterProScan
      • Augustus; GeneMark
      www.citrusgreening.org. Prashant Hosmani, Mueller Lab
  • 18. High-quality Manually Curated Genes
      Annotation set                  OGS1.0    OGS2.0    OGS3.0    Curated
      No. of genes                    19,311    20,793    19,049    811
      No. of transcripts              20,966    25,292    21,345    916
      No. of exons per transcript     5.42      7.06      7.29      7.87
      Avg. transcript length (bp)     1,317     1,944     2,034     2,503
      Avg. exon length (bp)           243       275       279       318
      Non-canonical splice sites      6.05%     3.13%     2.47%     1.91%
      OGS: Official Gene Set. www.citrusgreening.org
  • 19. Pathway based manual curation • Development • Segmentation • Wnt and other signaling pathways • Hox genes • Detoxification • Immune response • Metabolic and cellular functions • Carbohydrate metabolism • Chitin metabolism • vATPase • Chromatin remodeling • Environmental/Sensory • Circadian rhythm • Phototransduction • Reproduction • ~1000 curated genes in OGSv3 • ~200 updated models from OGSv1 (Diaci v1.1) www.citrusgreening.org
  • 21. 5. Functional annotation: InterProScan results. 10,946 (57%) genes have 2,281 unique GO terms. 5,311 (27%) genes are assigned to 1,159 unique pathways. Runtime: • 1-3 days with the CyVerse Discovery Environment app • 4 hours on a single 64-core node with the Docker container
  • 22. InterProScan Motifs and Domains. 16,081 (84%) proteins have at least one motif or domain assigned. 8,752 unique InterPro domains. Average 3 domains per annotated protein.
      [Bar chart: Motifs & Domains Identified by InterProScan; x-axis 0 to 1000. Categories: GPCR family 3, GABA-B receptor; GPCR, family 2-like; WD40 repeat; MFS transporter; WD40/YVTN repeat-like; ARM-type_fold; Ig-like_fold; Kinase-like; Znf C2H2; P-loop NTPase. Callout: Poorly represented gene families.]
  • 23. InterProScan GO Results: Biological Process
      [Bar chart: Summary of GO Biological Process; x-axis 0 to 3000. Categories: nuclease activity; isomerase activity; lyase activity; structural molecule activity; phosphatase activity; enzyme regulator activity; ligase activity; GTPase activity; transferase activity, transferring acyl groups; methyltransferase activity; transferase activity, transferring glycosyl…; enzyme binding; cytoskeletal protein binding; ATPase activity; structural constituent of ribosome; DNA-binding transcription factor activity; RNA binding; peptidase activity; kinase activity; DNA binding; transmembrane transporter activity; oxidoreductase activity; ion binding. Callout: Poorly represented gene families.]
      WARNING Dcitr05g1219011: Slim id: GO:0044403 symbiont process; GO:0019079 viral genome replication
  • 24. InterProScan GO Results: Cellular Component
      [Bar chart: Summary of GO Cellular Component; x-axis 0 to 1600. Categories: extracellular region; endoplasmic reticulum; nucleoplasm; chromosome; cytoskeleton; plasma membrane; mitochondrion; ribosome; cytoplasm; nucleus; organelle; protein-containing complex; intracellular; cell.]
      WARNING Slim id: GO:0005618 cell wall 3
      Dcitr00g0323011 GO:0009277 fungal-type cell wall
      Dcitr00g0493011 GO:0009277 fungal-type cell wall
      Dcitr00g1172011 GO:0009277 fungal-type cell wall
  • 25. How do we measure GO Quality? BREADTH: all gene products should have GO annotation (for CC, MF, BP). DEPTH: function should be as detailed as possible. EVIDENCE: Published experiments provide direct evidence of function in that species. Buza et al 2008. Gene Ontology annotation quality analysis in model eukaryotes. Nucleic acids research, 36(2), e12-e12.
  • 26. Adding Details to InterProScan GO: GOanna
      [Chart: GO Annotation Quality by Annotation Type (InterPro, GOanna, Combined); series: No. GO annotations, proteins annotated, Av Quality Score; axis ranges 0 to 180 and 0 to 40,000.]
      InterPro & GOanna are complementary approaches. InterProScan provides "breadth" (some GO annotation for most proteins). GOanna provides "depth" (more detailed GO terms for some proteins).
  • 27. What does GOanna add to GO annotation?
  • 28. What does GOanna add to GO annotation? Poorly represented functions in InterProScan derived GO.
  • 29. 5. Functional annotation: Pathways InterProScan results • 5,311 (27%) genes are assigned to 1,159 unique pathways • Average of 18.7 genes per pathway • All are Reactome human pathways (R-HSA) KOBAS Annotate results • Assigns pathways via hits to Drosophila proteins • 13,582 (71%) genes assigned pathways from following databases • 24,101 Reactome • 3,207 KEGG PATHWAY • 1,003 PANTHER • 7 BioCyc
  • 30. 6. Example: D. citri Infected / Uninfected. RNAseq samples from various tissues and citrus hosts for the Asian citrus psyllid
      • Tissues: Gut; Abdomen; Antennae; Whole body; Terminal abdomen; Leg; Thorax; Head; Midgut
      • Sexes: Male; Female
      • Stages: Egg; Nymph; Adult
      • Infection states: CLas-; CLas+; CLas+ Low infection; CLas+ High Infection
      • Host: C. sinensis; C. medica; C. reticulata; C. macrophylla
      www.citrusgreening.org
  • 31. Comparison of Infected and Uninfected Samples. Infected samples: 22. Uninfected samples: 35. 79% genes have > 1 read/million in at least 22 libs. Lot of variability across samples!! Plot labels: Infected; Uninfected.
  • 32. Differential Expression Results. 16,879 genes with nonzero total read count; with adjusted p-value < 0.05: LFC > 0 (up): 3162, 19%; LFC < 0 (down): 3627, 21%. Plot legend: gene-wise estimates (black) and fitted values (red); blue circles are genes with high dispersion that are outliers.
  • 33. topGO Enriched GO Biological Processes. All GO terms with p-val 0.05. Deeper shades of red indicate smaller p-values. Larger circles represent higher proportion of proteins.
                                Genes     GO BP mappable genes   GO terms   GO terms p < 0.01
      InterProScan              10,946    7,130                  1,384      61
      InterProScan + GOanna     11,490    7,673                  2,022      58
  • 34. topGO Enriched GO Molecular Functions. All GO terms with p-val 0.05. Deeper shades of red indicate smaller p-values. Larger circles represent higher proportion of proteins.
                                Genes     GO MF mappable genes   GO terms   GO terms p < 0.01
      InterProScan              10,946    3,280                  270        6
      InterProScan + GOanna     11,490    9,365                  713        16
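The expression filter quoted on slide 31 (more than 1 read per million in at least 22 libraries) can be sketched as follows. This is only an illustration in plain Python: the slides do not say which tool applied the filter, and the function names, counts and library sizes below are hypothetical.

```python
def counts_per_million(counts, library_sizes):
    """Scale one gene's raw read counts by each library's total read count."""
    return [c / size * 1_000_000 for c, size in zip(counts, library_sizes)]


def passes_filter(counts, library_sizes, min_cpm=1.0, min_libs=22):
    """Keep a gene if it exceeds min_cpm in at least min_libs libraries.

    The defaults mirror the thresholds quoted on the slide
    (> 1 read/million in at least 22 libraries).
    """
    cpm = counts_per_million(counts, library_sizes)
    expressed_libs = sum(1 for value in cpm if value > min_cpm)
    return expressed_libs >= min_libs


# Hypothetical gene measured in four libraries of 2 million reads each.
# CPM values are 5.0, 0.0, 1.5 and 0.5, so two libraries exceed 1 CPM.
gene_counts = [10, 0, 3, 1]
lib_sizes = [2_000_000] * 4
print(passes_filter(gene_counts, lib_sizes, min_libs=2))  # True
print(passes_filter(gene_counts, lib_sizes, min_libs=3))  # False
```

Setting the library threshold to the size of the smaller group (22 infected samples here) keeps genes that are expressed in only one condition.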
  • 35. DEGs associated with the cytoskeleton were upregulated in the CLas-infected midguts
  • 36. topGO Enriched GO Cellular Component. All GO terms with p-val 0.05. Deeper shades of red indicate smaller p-values. Larger circles represent higher proportion of proteins.
                                Genes     GO CC mappable genes   GO terms   GO terms p < 0.01
      InterProScan              10,946    536                    111        0
      InterProScan + GOanna     11,490    4,498                  447        4
  • 38. “Localized mitochondrial dysfunction in the gut when insects are exposed to CLas-infected trees” Nuclear swelling and fragmentation of the heterochromatin
  • 39. Green: universal set Red: Annotated genes
  • 40. “D. citri might inhibit the expression of endocytosis-related genes in the midgut to prevent the further transmission of Clas”
  • 41. Enriched Pathways: Up & Down Regulated Genes
      Pathway                           Input number   Background number   P-Value
      Gene Expression                   281            1303                3.78E-07
      Endocytosis                       53             221                 0.004147464
      Cell Cycle                        84             378                 0.002532175
      Nonsense-Mediated Decay (NMD)     52             195                 0.000666531
      siRNA biogenesis                  9              18                  0.007008944
      One carbon pool by folate         11             25                  0.00629025

      Pathway                                        Input number   Background number   P-Value
      Fatty Acyl-CoA Biosynthesis                    31             92                  0.00357436
      ABC-family proteins mediated transport         24             70                  0.008101975
      COPI-mediated anterograde transport            33             106                 0.007223449
      Cellular response to hypoxia                   12             25                  0.008017566
      Formation of ATP by chemiosmotic coupling      12             23                  0.004869061
      Regulation of cytoskeletal remodeling and cell spreading by IPP complex components   6   6   0.005526965

      Table captions on the slide: Pathways enriched from Up-regulated genes; Pathways enriched from Down-regulated genes
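A pathway table of this shape (input number, background number, p-value) is commonly produced by an over-representation test. The slide does not state which test was used, so the sketch below assumes a one-sided hypergeometric test; the function name and the toy numbers are hypothetical, and the totals needed to reproduce the slide's p-values (all annotated genes, all annotated differentially expressed genes) are not given in the source.

```python
from math import comb


def hypergeom_upper_tail(k, pathway_size, sample_size, population_size):
    """P(X >= k) for a hypergeometric draw.

    k               : genes in the input list that fall in the pathway
                      (the "Input number" column)
    pathway_size    : genes in the background that fall in the pathway
                      (the "Background number" column)
    sample_size     : total genes in the input list (not on the slide)
    population_size : total genes in the background (not on the slide)
    """
    upper = min(pathway_size, sample_size)
    total_ways = comb(population_size, sample_size)
    favourable = sum(
        comb(pathway_size, i)
        * comb(population_size - pathway_size, sample_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / total_ways


# Toy example: 20 background genes, 5 in the pathway, 6 input genes,
# 3 of which hit the pathway.
p_value = hypergeom_upper_tail(3, 5, 6, 20)
print(round(p_value, 4))
```

In practice the raw p-values from many pathways would also need a multiple-testing correction before being compared to a threshold.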
  • 42. 6. Summary of Functional Modeling
      Tools for YOU!!
      • Functional modeling tools to link genomics back to biological context
      • Can now provide GO and pathway information for functional genomics
      • InterPro motif analysis may help guide manual annotations & supports comparative analyses
      • Tools available via AgBase & Docker
      Analysis of data sets
      • Citrus greening vector (D. citri) now has GO & pathways information available
      • GO and pathways analyses are complementary (shared insights)
      • During infection, vector transcription and translation responses are tissue-specific
      • Lipid synthesis is down regulated and protein transport is disrupted
      • Strong links to mitochondrial dysfunction
  • 43. Accessing functional annotation resources agbase-docs.readthedocs.io de.cyverse.org hub.docker.com/u/agbase
  • 44. 7. Future Plans & Acknowledgements • Continued testing and deployment of the workflows • When to use InterProScan and when to add GOanna GO? • Prioritizing genes for manual curation • Identification of missing or erroneous gene families • Optimizing pathways information • What format will make this most useful? • How can we improve pathway reconstruction? • Training sessions • Feedback on tools and documentation • Making functional data from this project available • i5k, NAL, AgBase and Citrusgreening.org • Docker and Singularity based pipeline This work was supported by funding from the USDA Agricultural Research Service