Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform

Dissecting plant genomes with the PLAZA comparative genomics platform.
Van Bel M, Proost S, Wischnitzki E, Movahedi S, Scheerlinck C, Van de Peer Y, Vandepoele K.

Plant Physiol. 2012 Feb;158(2):590-600.

With the arrival of low-cost, next-generation sequencing, a multitude of new plant genomes are being publicly released, providing unseen opportunities and challenges for comparative genomics studies. Here, we present PLAZA 2.5, a user-friendly online research environment to explore genomic information from different plants. This new release features updates to previous genome annotations and a substantial number of newly available plant genomes as well as various new interactive tools and visualizations. Currently, PLAZA hosts 25 organisms covering a broad taxonomic range, including 13 eudicots, five monocots, one lycopod, one moss, and five algae. The available data consist of structural and functional gene annotations, homologous gene families, multiple sequence alignments, phylogenetic trees, and colinear regions within and between species. A new Integrative Orthology Viewer, combining information from different orthology prediction methodologies, was developed to efficiently investigate complex orthology relationships. Cross-species expression analysis revealed that the integration of complementary data types extended the scope of complex orthology relationships, especially between more distantly related species. Finally, based on phylogenetic profiling, we propose a set of core gene families within the green plant lineage that will be instrumental to assess the gene space of draft or newly sequenced plant genomes during the assembly or annotation phase.
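
The phylogenetic profiling idea in the abstract can be illustrated with a small sketch. This is not the PLAZA implementation: the family identifiers, species codes, and the `min_fraction` threshold below are hypothetical, and the rule "a core family is one present in at least a given fraction of species" is an assumption made for illustration.

```python
from typing import Dict, List


def core_families(family_counts: Dict[str, Dict[str, int]],
                  species: List[str],
                  min_fraction: float = 1.0) -> List[str]:
    """Return the families present in at least min_fraction of the species.

    family_counts maps a family ID to {species code: number of genes}.
    A species counts as "present" when it has at least one gene in the family.
    """
    core = []
    for family, counts in family_counts.items():
        present = sum(1 for sp in species if counts.get(sp, 0) > 0)
        if present / len(species) >= min_fraction:
            core.append(family)
    return sorted(core)


# Toy phylogenetic profiles (hypothetical IDs and gene counts).
species = ["ath", "osa", "ppa", "cre"]
family_counts = {
    "HOM000001": {"ath": 12, "osa": 9, "ppa": 4, "cre": 2},  # in all species
    "HOM000002": {"ath": 3, "osa": 2},                       # angiosperms only
    "HOM000003": {"ath": 1, "osa": 1, "ppa": 1},             # absent from alga
}

strict_core = core_families(family_counts, species)
relaxed_core = core_families(family_counts, species, min_fraction=0.75)
print(strict_core)
print(relaxed_core)
```

Under the same assumptions, the gene space of a draft genome could be gauged by the fraction of such core families in which it has at least one member.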

  • 23 plant genomes: 11 dicots, 5 monocots; pico-PLAZA: 10 green algae
  • Intuitive & complete view of gene orthology
  • Method to quantify expression conservation across species by comparing coexpression networks of orthologous genes. These quantifications are robust to moderate modifications in the underlying expression data set, and the method corrects for network connectivity and tissue-specific expression when determining significance levels.
  • How many genes included?
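
The expression conservation note above can be made concrete with a short sketch. It assumes the score is the number of shared orthologs divided by the total number in both coexpression clusters, which is how the slide transcript phrases it ("16 shared orthologs / 182 in both coexpression clusters", ECC score = 0.088). Genes are matched through a shared gene family ID, a simplification chosen here; the gene and family names are hypothetical, and the significance test mentioned in the note is not reproduced.

```python
from typing import Dict, Iterable


def ecc_score(cluster_a: Iterable[str],
              cluster_b: Iterable[str],
              family_of: Dict[str, str]) -> float:
    """Shared orthologous families divided by all families in both clusters.

    cluster_a / cluster_b: genes coexpressed with a query gene in each species.
    family_of: maps a gene ID to its (hypothetical) orthologous family ID.
    Genes without a family assignment are ignored.
    """
    fams_a = {family_of[g] for g in cluster_a if g in family_of}
    fams_b = {family_of[g] for g in cluster_b if g in family_of}
    union = fams_a | fams_b
    if not union:
        return 0.0
    return len(fams_a & fams_b) / len(union)


# Toy coexpression clusters for one ortholog pair (hypothetical IDs).
family_of = {
    "AT_g1": "F1", "AT_g2": "F2", "AT_g3": "F3", "AT_g4": "F4",
    "OS_g1": "F1", "OS_g2": "F2", "OS_g3": "F5",
}
score = ecc_score(["AT_g1", "AT_g2", "AT_g3", "AT_g4"],
                  ["OS_g1", "OS_g2", "OS_g3"],
                  family_of)
print(score)  # 2 shared families out of 5 in total

# The ratio quoted in the slides, for comparison.
slide_ratio = 16 / 182
print(round(slide_ratio, 3))
```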

    1. Dissecting plant genomes with the PLAZA 2.5 comparative genomics platform: integrating sequence orthology with expression data to predict functional homologs across plant species. Klaas Vandepoele. Plant Genomes & Biotechnology: From Genes to Networks (CSHL, 1 December 2011). Comparative & Integrative Genomics, VIB – Ghent University, Belgium
    2. Genome sequencing in different plant clades (figure: species tree across PLAZA releases 1.0, 2.0, and 2.5, growing from 9 genomes to 25 genomes)
        • Green algae: Chlorophyceae (C. reinhardtii, V. carteri); Prasinophyceae (O. lucimarinus, Micromonas, O. tauri)
        • Mosses: P. patens
        • Club-mosses: S. moellendorffii
        • Monocots: O. sativa japonica, O. sativa indica, S. bicolor, Z. mays, B. distachyon
        • Eudicots (clade labels in the figure: basal eudicots, rosids, asterids): V. vinifera, L. japonicus, M. truncatula, G. max, P. trichocarpa, M. esculenta, R. communis, F. vesca, C. papaya, M. domestica, T. cacao, A. thaliana, A. lyrata
    3. Exploiting cross-species genome information
        • Centralized infrastructure
        • Detailed gene catalog per species
        • Structural annotation (gene models, UTRs)
        • Functional annotation (experimental, sequence-based)
        • Intuitive & advanced data mining tools for non-expert users: gene function, genome organization, pathway evolution, data manipulation
        • Computational resources
    4. PLAZA, a resource for plant comparative genomics: http://bioinformatics.psb.ugent.be/plaza/
    5. Gene family analysis; Genome analysis
        • 20 tools available!
        • More information? Check Help – Documentation: Data content & Construction; Tutorial & FAQ
        • Proost, Van Bel, … & Vandepoele, Plant Cell 2009
    6. (no text on slide)
    7. Comparative sequence analysis
        • Homology = shared common ancestral origin
        • Inferred based on sequence similarity (BLAST) and on similar (multi-)domain composition & organization
        • So sequence similarity means homology? No, it depends!
        • Figure labels: JGI, TAIR, EMBL; all-against-all sequence similarity search (BLAST); BLASTCLUST, Tribe-MCL, Inparanoid, OrthoMCL, C/KOG
    8. Gene family: similarity heatmap, multiple sequence alignment & phylogenetic trees
        • >780K proteins from 25 species
        • 22K multi-species gene families covering 83% of the total proteome
        • 18K trees incl. 420K annotated tree nodes
    9. Gene family analysis; Genome analysis
    10. Gene colinearity & genome organization
        • Represent chromosomes as sorted gene lists (figure: Chromosome 1, Chromosome 2)
        • Identify all homologous gene pairs between chromosomes (all-against-all BLASTP)
        • Score pairs of homologues in a matrix: Gene Homology Matrix (GHM)
        • i-ADHoRe 3.0 (Proost, Fostier, … & Vandepoele, NAR in press)
    11. Genome-wide colinearity (WGDotplot): Z. mays vs. O. sativa
    12. Multi-species colinearity
    13. Multi-species WGDotplots - applet
    14. Whole-genome Circular Dotplot
        • Reference: O. sativa
        • Inner circle: duplicated regions
        • Outer circle: inter-species colinear regions
    15. Synteny Plot: local genome organization
    16. Gene family analysis; Genome analysis
    17. Workbench data import
        • Create a custom gene set (~experiment) using gene identifiers or BLAST
        • External/internal gene IDs (e.g. AN3, AT5G28640, GRMZM2G180246_T01)
        • BLAST interface can be used to map sequence data from a non-model species to a reference species present in PLAZA
        • A toolbox is available to analyze user-defined gene sets
        • Figure labels: Microarray transcript profiling, EST sequencing, Genes reported in Suppl. data; PLAZA Workbench; WGMapping, Gene Families, Functional annotations, GO enrichment, Sequence retrieval, Tandem/block duplicates, Orthologs, Export data…
    18. GO enrichment analysis for all 25 species!
    19. Detection of orthologous plant genes
        • Meaning…
        • Orthology = genes derived from a common ancestor in different species
        • Functionally conserved homologs = genes in different species having similar functions
        • Due to gene duplication events, complex many-to-many gene orthology is frequently observed
        • Functional homologs in different species share … similar expression? regulation? protein-protein interactions?
    20. Orthologous genes – Table view
    21. Integrative Orthology Viewer - an ensemble of different gene orthology prediction approaches
        • Tree-based orthologs (TROG) inferred using tree reconciliation
        • Orthologous gene families (ORTHO) inferred using OrthoMCL
        • Anchor points refer to gene-based colinearity between species
        • Best hit families (BHIF) inferred from BLAST hits including inparalogs
    22. How to evaluate sequence-based orthology methods?
        • Cross-species analysis of orthologs using Expression Context Conservation (ECC)
        • Expression context conservation quantifies shared orthologs in coexpression networks
        • ECC score = 0.088 (16 shared orthologs / 182 in both coexpression clusters); P-value(conserved) < 0.001
        • Movahedi, Van de Peer & Vandepoele, Plant Physiology 2011
    23. Orthology support & expression conservation for Arabidopsis – rice orthologs (figure; per-region counts and percentages are not recoverable from the transcript)
        • OrthoMCL (60% ECC global); BHIF (58% ECC global); TROG (54% ECC global)
        • Legend: # Ath genes; # Ath – Osa gene pairs; % expression conservation (ECC)
        • >3506 Arabidopsis – rice orthologs missed by OrthoMCL show expression conservation (41% ECC)
    24. Conclusions
        • PLAZA 2.5 provides a versatile toolbox for plant genomics
        • Expression Context Conservation provides a valuable approach to study orthologs and predict functional homologs across species
        • The integration of complementary data types extends the scope of complex orthology relationships
    25. Acknowledgments
        • Plant comparative genomics: Michiel Van Bel, Sebastian Proost, Yves Van de Peer (http://bioinformatics.psb.ugent.be/plaza/)
        • Evolutionary analysis of expression networks: Sara Movahedi (Plant Physiology 2011 paper)
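
The Gene Homology Matrix steps listed on the colinearity slide (chromosomes as sorted gene lists, homologous pairs, scoring pairs in a matrix) can be sketched as follows. This is a toy illustration, not i-ADHoRe: the gene names are hypothetical, homologous pairs are supplied directly rather than derived from BLASTP, and the diagonal scan only finds gap-free runs in the forward orientation, so inverted or gapped colinear segments are missed.

```python
from typing import Iterable, List, Tuple


def gene_homology_matrix(chrom1: List[str],
                         chrom2: List[str],
                         homolog_pairs: Iterable[Tuple[str, str]]) -> List[List[int]]:
    """Build a 0/1 matrix: cell (i, j) is 1 when chrom1[i] and chrom2[j] are homologous.

    chrom1 / chrom2 are gene IDs sorted by chromosomal position.
    """
    pairs = {frozenset(p) for p in homolog_pairs}
    return [[1 if frozenset((g1, g2)) in pairs else 0 for g2 in chrom2]
            for g1 in chrom1]


def diagonal_runs(ghm: List[List[int]], min_len: int = 3) -> List[Tuple[int, int, int]]:
    """Return (row, col, length) for gap-free forward diagonals of at least min_len hits."""
    runs = []
    n = len(ghm)
    m = len(ghm[0]) if ghm else 0
    for i in range(n):
        for j in range(m):
            # Start a run only where the previous diagonal cell is not a hit.
            starts_run = ghm[i][j] and not (i > 0 and j > 0 and ghm[i - 1][j - 1])
            if starts_run:
                length = 0
                while i + length < n and j + length < m and ghm[i + length][j + length]:
                    length += 1
                if length >= min_len:
                    runs.append((i, j, length))
    return runs


# Toy chromosomes and homologous pairs (hypothetical gene IDs).
chrom1 = ["a1", "a2", "a3", "a4", "a5"]
chrom2 = ["b0", "b1", "b2", "b3", "b9"]
pairs = [("a2", "b1"), ("a3", "b2"), ("a4", "b3"), ("a1", "b9")]

ghm = gene_homology_matrix(chrom1, chrom2, pairs)
runs = diagonal_runs(ghm)
print(runs)  # one colinear run: a2-a4 aligned with b1-b3
```

Plotting the non-zero cells of such a matrix for two whole genomes gives the kind of dotplot view named in the WGDotplot slides.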
