PLAZA 3.0 - an access point for plant comparative genomics

An overview of the new species and improved gene function information present in PLAZA 3.0. URL: http://bioinformatics.psb.ugent.be/plaza/

Published in: Science

  1. PLAZA 3.0: an access point for plant comparative genomics. Klaas Vandepoele, 29 September 2014, VIB – Ghent University, Belgium
  2. Overview
      • Plant genomes: status & challenges
      • Comparative genomics using PLAZA: concepts & tools
      • What’s new in PLAZA 3.0?
  3. Plant genome sequencing is booming (Michael & Jackson, 2013)
      • New and faster sequencing technologies
      • Generating a draft genome sequence has become cheap
      • The number of published plant genomes grows exponentially
      • Unlocking biological information is the real challenge
  4. Genome annotation
      • Structural annotation shows where genes are
      • Describes their intron-exon gene structure
      • Functional annotation tells you what genes do
      • Can be downloaded along with the genome sequence
  5. Comparative genomics
      • Comparative genomics is a powerful tool that allows us:
          • to link genomic changes to environmental adaptation
          • to transfer knowledge from model species to other plants
          • to trace structural changes within a genome through time
  6. Comparative genomics has a steep learning curve
      • A thorough knowledge of data processing tools is required
      • Computer clusters and high-memory machines are used
      • New visualizations and methods are necessary to explore genomic features across multiple species
      • Limited access to high-quality comparative genomics information
  7. http://bioinformatics.psb.ugent.be/plaza/
  8. http://bioinformatics.psb.ugent.be/plaza/
  9. Gene families & genome organization
      • Gene family analysis
      • Genome analysis
      • >20 tools available!
  10. Exploiting cross-species genome information
      • Centralized infrastructure
      • Detailed gene catalog per species
      • Structural annotation (gene models, UTRs)
      • Functional annotation (experimental, sequence-based)
      • Intuitive & advanced data mining tools for non-expert users
          • Sequence retrieval
          • Gene functions
          • Genome organization
          • Pathway evolution
          • Data manipulation
  11. [Slide without transcribed text]
  12. [Slide without transcribed text]
  13. [Slide without transcribed text]
  14. [Figure slide] Labels: Text-mining, Orthology-based
  15. Comparative sequence analysis
      • Homology = shared common ancestral origin
      • Inferred based on
          • sequence similarity (BLAST)
          • similar (multi-)domain composition & organization
      • Figure labels: TAIR, JGI, EMBL, BLASTCLUST, Tribe-MCL, Inparanoid, OrthoMCL, C/KOG, All-against-all sequence similarity search (BLAST)
  16. Gene families, multiple sequence alignment & phylogenetic trees
      • 26K multi-gene families covering 90% of the total proteome
      • >1M proteins from 31 species
      • 17K trees incl. 580K annotated tree nodes
  17. [Slide without transcribed text]
  18. Integrative Orthology Viewer
      • Tree-based orthologs (TROG) inferred using tree reconciliation
      • Orthologous gene families (ORTHO) inferred using OrthoMCL
      • Anchor points refer to gene-based colinearity between species
      • Best hit families (BHIF) inferred from BLAST hits including inparalogs
  19. Gene colinearity & genome organization: Gene Homology Matrix (GHM), i-ADHoRe 3.0
      • Represent chromosomes as sorted gene lists
      • Identify all homologous gene pairs between chromosomes (all-against-all BLASTP)
      • Score pairs of homologues in matrix
  20. Genome-wide colinearity (WGDotplot). Dot plot axes: O. sativa, Z. mays
  21. Multi-species colinearity
  22. PLAZA Workbench
      • Create a custom gene set (~experiment) using gene identifiers or BLAST
      • External/internal gene IDs (e.g. AN3, AT5G28640, GRMZM2G180246_T01)
      • BLAST interface can be used to map sequence data from a non-model species to a reference species present in PLAZA
      • A toolbox is available to analyze user-defined gene sets
      • Figure labels: PLAZA Workbench, WGMapping, Functional annotations, Gene Families, GO enrichment, Tandem/block duplicates, Sequence retrieval, Microarray transcript profiling, EST / RNA-sequencing, Genes reported in Suppl. data, iOrthologs, Export data…
  23. [Slide without transcribed text]
  24. GO enrichment analysis
  25. What’s new in PLAZA 3.0? New genomes
      • Dicots (13)
          • Gossypium raimondii (cotton), Eucalyptus grandis (eucalyptus), Solanum lycopersicum (tomato), Solanum tuberosum (potato), Beta vulgaris (sugar beet), Prunus persica (peach), Citrus sinensis (sweet orange), Cucumis melo (melon), Citrullus lanatus (watermelon)
          • Capsella rubella, Brassica rapa and Thellungiella parvula
          • Amborella trichopoda
      • Monocots (3)
          • Musa acuminata (banana), Setaria italica (foxtail millet) and Hordeum vulgare (barley)
  26. What’s new in PLAZA 3.0? Gene function information
      • Free-text gene descriptions
          • Primary data provider + UniProt
          • AnnoMine* text-mining
      • Protein domains
          • InterPro
      • Structured functional annotations
          • Gene Ontology
          • MapMan
          • PlnTFDB and PlantTFDB
      * Sofie Van Landeghem
  27. Extended GO projection
      • Orthology-based
      • Homology-based
      • Transfer of experimentally confirmed GO information to orthologs and homologs
  28. Coverage of gene function information
      • Figure panels: Gene Ontology (Biological Process), Gene descriptions
      • Legend: blue = primary GO; green = GO projection (orthology + homology)
  29. Conclusions
      • PLAZA 3.0 provides a versatile toolbox for plant genomics
      • Integration of complementary data sources describing gene functions
      • Improved algorithms to transfer functional annotation from well-characterized plant genomes to other species
      • Technical improvements
          • database design
          • comparative genomics tools
          • speed
          • visualizations
  30. Acknowledgments
      • Sebastian Proost
      • Michiel Van Bel
      • Dries Vaneechoutte
      • Yves Van de Peer
      • Dirk Inzé
      • plaza_genomics
      • http://bioinformatics.psb.ugent.be/plaza/
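
Slide 19 summarizes the gene homology matrix in three steps: chromosomes as sorted gene lists, homologous gene pairs between chromosomes, and scoring of those pairs in a matrix. The following is a minimal Python sketch of that idea, not the i-ADHoRe 3.0 implementation. The gene names, the `homologs` set (standing in for filtered all-against-all BLASTP hits) and both function names are hypothetical, and the diagonal search is a deliberately simplified stand-in for real colinearity detection, which also tolerates gaps and inversions.

```python
from typing import FrozenSet, List, Set


def gene_homology_matrix(
    chrom_a: List[str],
    chrom_b: List[str],
    homologs: Set[FrozenSet[str]],
) -> List[List[int]]:
    """Build a 0/1 matrix: rows follow gene order on chrom_a, columns on chrom_b.

    A cell is 1 when the two genes form a homologous pair (assumed input).
    """
    return [
        [1 if frozenset((a, b)) in homologs else 0 for b in chrom_b]
        for a in chrom_a
    ]


def longest_diagonal(matrix: List[List[int]]) -> int:
    """Length of the longest gap-free run of 1s on a top-left to bottom-right diagonal."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    run = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(rows):
        for j in range(cols):
            if matrix[i][j]:
                # Extend the run ending at the previous diagonal cell, if any.
                run[i][j] = 1 + (run[i - 1][j - 1] if i and j else 0)
                best = max(best, run[i][j])
    return best


# Toy data (hypothetical): genes listed in chromosomal order.
chrom_a = ["a1", "a2", "a3", "a4"]
chrom_b = ["b1", "b2", "b3"]
homologs = {
    frozenset(("a2", "b1")),
    frozenset(("a3", "b2")),
    frozenset(("a4", "b3")),
}

matrix = gene_homology_matrix(chrom_a, chrom_b, homologs)
for row in matrix:
    print(row)
print("longest colinear run:", longest_diagonal(matrix))
```

In this toy case the three homologous pairs fall on one diagonal, which is the kind of pattern a colinear region produces in the matrix.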

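Slide 27 describes GO projection as the transfer of experimentally confirmed GO information to orthologs and homologs. A rough sketch of the orthology-based case could look like the code below. It is an illustration under stated assumptions, not PLAZA's projection algorithm: the gene names, the `orthologs` mapping and the `project_go` function are hypothetical, and the GO identifiers are used only as opaque labels. The set of experimental evidence codes follows the standard Gene Ontology evidence code classes.

```python
from typing import Dict, List, Set, Tuple

# Gene Ontology evidence codes that denote experimental support.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def project_go(
    annotations: Dict[str, List[Tuple[str, str]]],
    orthologs: Dict[str, List[str]],
) -> Dict[str, Set[str]]:
    """Copy experimentally supported GO terms from a gene to its orthologs.

    annotations: gene -> list of (GO term, evidence code)
    orthologs:   source gene -> orthologous genes in other species (assumed input)
    Returns only the newly projected terms per target gene.
    """
    projected: Dict[str, Set[str]] = {}
    for source, targets in orthologs.items():
        confirmed = {
            term
            for term, code in annotations.get(source, [])
            if code in EXPERIMENTAL_CODES
        }
        for target in targets:
            # Skip terms the target gene already carries.
            own = {term for term, _ in annotations.get(target, [])}
            new_terms = confirmed - own
            if new_terms:
                projected.setdefault(target, set()).update(new_terms)
    return projected


# Toy data (hypothetical genes; GO IDs serve as labels only).
annotations = {
    "modelGene1": [("GO:0009733", "IMP"), ("GO:0006355", "IEA")],
    "cropGene1": [],
}
orthologs = {"modelGene1": ["cropGene1", "cropGene2"]}

projected = project_go(annotations, orthologs)
print(projected)
```

Only the term backed by an experimental code (IMP) is transferred; the electronically inferred one (IEA) is left out, which keeps projected annotations from propagating earlier predictions.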