UseGalaxy.* Architecture


Slides from Simon Gladman for the 14 Jun 2018 EMBL-ABR Webinar about Galaxy Australia.

Simon, Galaxy Systems Architect and tools expert based at Melbourne Bioinformatics, presented on how Galaxy Australia is becoming better integrated and aligned with the two other large public Galaxy services (in the USA and Europe).

A recording of the webinar is available here:

Published in: Science

  1. usegalaxy.*: Global Galaxy for Everyone. Simon Gladman, 2018
  2. Public Galaxy servers - In the beginning..
     Galaxy Main
     ● Run out of many institutions in the USA
       ○ Penn State, Johns Hopkins, Oregon Health and Science Uni.
       ○ TACC, JetStream, CyVerse, Massive
     ● Many users
     ● Constantly large queue
  3. Public Galaxy servers - In Australia..
     ● Galaxy-QLD
       ○ Set up for QLD but globally public
     ● Galaxy-Mel
       ○ Set up for VIC but globally public
     ● Galaxy-Tut
       ○ For workshops
     ● And many others!
  4. Public Galaxy servers - In Australia..
     Galaxy-QLD + Galaxy-Mel + Galaxy-Tut = Galaxy-Australia
  5. Public Galaxy servers - And the list grew..
     ● Different versions
     ● Different tools
     ● Different references
     ● User confusion
     ● Lots of duplication of effort
  6. So we got together..
     ● At Galaxy Australasia Meeting 2017
       ○ Key people from US, Europe, Australia including Galaxy PI.
       ○ Agreed to work together to support publicly available Galaxy servers
       ○ Do as much together as possible
     And so usegalaxy.* was born
  7. What is usegalaxy.*?
     Group of public Galaxy servers
     ● Present a similar experience to users no matter which they use
     ● Guarantee a minimum service
       ○ Tools & versions
       ○ Reference Data
       ○ Reproducibility
       ○ Training materials
     ● Starting with USA, Europe and Australia, more welcome!
     ● Manage with community assets/repositories
     ● Don’t prescribe hardware resources
     Diagram labels: usegalaxy.* servers, Community assets
  8. usegalaxy.* - Tools
     Shared repository of tool lists
     ● Minimum tools
       ○ Genomics and others
     ● Extra tools
       ○ Metagenomics
       ○ Proteomics
       ○ Metabolomics
     ● Curation/maintenance of tools and versions
     ● Automatic upgrades and installation of trusted tools
     ● Still allow local specialisations
  9. usegalaxy.* - Reference Data/Indices
     Diagram labels: CVMFS Tier 0 Reference Data/Indices; CVMFS Tier 1 Data/Indices Cache (shown twice)
     Run by staff
     ● Genomic references
     ● Tool indices
     Tier 1 uses smart caching
     ● On demand
     ● Local specialisation
  10. usegalaxy.* - Look and Feel
      usegalaxy.* servers should:
      ● Run the latest stable Galaxy release
      ● Present the user with a similar tool list layout
      ● Be able to run all of the Galaxy Training Network’s core tutorials
      ● Have the same testing/training datasets available
  11. usegalaxy.* - Future - Tools
      Global repository of Tools in containers
      ● Use CVMFS for smart distribution similar to References
      ● Singularity containers
      ● Galaxy just uses appropriate container
      ● Much easier to manage tool lists and versioning
      Diagram labels: CernVM-FS, Singularity containers
  12. usegalaxy.* - Future - References
      ● Move to a community based model
      ● Improve metadata availability
      ● Improve reference data provenance
      Diagram labels: usegalaxy.org; CVMFS Tier 0 Reference Data/Indices; CVMFS Tier 1 Data/Indices Cache (shown twice)
  13. usegalaxy.* - Proposed Global Architecture
      Diagram labels: Community Managed; Globally Distributed; Reference Data/Indices; Tool Containers
  14. usegalaxy.* - Future - BYOC
      Trialling BYOC now in Australia and Europe
      ● Adding Melbourne/Sydney/Other resources to Galaxy Australia
      ● Adding Czech/Dutch and other resources to Galaxy Europe
      Goal is to allow user/group to add their compute/storage resources
      Dynamic job allocation will know
      ● Where user is from
      ● What resources they can use
  15. usegalaxy.* - Australian Architecture
  16. usegalaxy.* - Ultimate goal - Single Sign-on
      The ultimate goal!
      ● Sign on in Global Galaxy WebServer!
      ● Job is run wherever your compute and data are located!
      Diagram labels: Compute (repeated seven times)
  17. It’s a big effort..
      Gareth Price, Simon Gladman, Derek Benson, Anna Syme, Igor Makunin, Nuwan Goonasekera, Christina Hall, Helen van der Pol, Andrew Lonie
      Björn Grüning, Helena Rasche, Bérénice Batut, Anika Erxleben, Torsten Houwaart, Joachim Wolff, Mehmet Tekman, Rolf Backofen & Other Elixir Members
      Nate Coraor, Enis Afgan, Anton Nekrutenko, James Taylor, John Chilton, Martin Čech, Dave Bouvier & The Galaxy Project Team & Galaxy Project Community
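
The shared tool lists on slide 8 can be pictured as a set operation: each server installs the shared minimum list, any shared extra sections it opts into, and its own local specialisations. The following is only a minimal Python sketch of that idea. The `Tool` type, the section names, the tool names and both functions are hypothetical; none of them come from the slides or from any Galaxy tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """A tool pinned to one version (hypothetical structure)."""
    name: str
    version: str


def required_tools(minimum, extras, opted_in, local):
    """Return the set of tools one server should offer.

    minimum  -- shared minimum list every server installs
    extras   -- dict mapping a shared section name to its tools
    opted_in -- names of the extra sections this server chose
    local    -- local specialisations outside the shared lists
    """
    required = set(minimum) | set(local)
    for section in opted_in:
        required |= set(extras[section])
    return required


def missing_tools(required, installed):
    """Tools that still need installing, in a stable order."""
    return sorted(set(required) - set(installed),
                  key=lambda t: (t.name, t.version))


# Illustrative data only; tool names and versions are made up.
minimum = [Tool("aligner", "1.0"), Tool("variant_caller", "2.1")]
extras = {"proteomics": [Tool("peptide_search", "0.9")]}
local = [Tool("site_specific_report", "0.1")]

required = required_tools(minimum, extras, ["proteomics"], local)
to_install = missing_tools(required, installed=[Tool("aligner", "1.0")])
print([f"{t.name}=={t.version}" for t in to_install])
```

Keeping the shared lists and the local list as separate inputs mirrors the slide's point that curation can be shared while local specialisations remain possible.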
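
Slide 9 says the Tier 1 servers cache reference data on demand. As a purely conceptual illustration, the sketch below models that behaviour with two small Python classes: a Tier 1 cache asks Tier 0 for a file only the first time it is read. This is not how CernVM-FS is implemented; the class names, the in-memory dictionaries and the example path are all assumptions made for the sketch.

```python
class Tier0:
    """Authoritative store of reference data (in-memory stand-in)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def fetch(self, path):
        return self._objects[path]


class Tier1Cache:
    """Serves data from a local cache and pulls from Tier 0 only on a miss."""

    def __init__(self, upstream):
        self._upstream = upstream
        self._cache = {}
        self.upstream_fetches = 0  # counts how often Tier 0 was contacted

    def read(self, path):
        if path not in self._cache:
            # Cache miss: fetch once from Tier 0 and keep a local copy.
            self._cache[path] = self._upstream.fetch(path)
            self.upstream_fetches += 1
        return self._cache[path]


# Hypothetical path and content, for illustration only.
tier0 = Tier0({"genomes/example/seq.fa": b">chr1\nACGT\n"})
regional_cache = Tier1Cache(tier0)

first = regional_cache.read("genomes/example/seq.fa")
second = regional_cache.read("genomes/example/seq.fa")
print(regional_cache.upstream_fetches)
```

Only the first read reaches Tier 0, so a regional server ends up holding just the references its own users actually request.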
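
Slide 14 describes dynamic job allocation that knows where a user is from and which resources they can use. A minimal Python sketch of such a decision follows. The `User` and `Destination` types, the preference order and every name in the example data are assumptions made for illustration; the slides do not specify the rule, and this is not Galaxy's job configuration API.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    region: str
    allowed_resources: set = field(default_factory=set)  # resources the user brought


@dataclass
class Destination:
    name: str
    region: str
    shared: bool = False  # True for a public pool open to everyone


def choose_destination(user, destinations):
    """Pick a destination for a user's job.

    Preference order (an assumption made for this sketch):
    0. a resource the user brought, in the user's own region
    1. any resource the user brought
    2. a shared destination in the user's region
    3. any shared destination
    """
    def rank(dest):
        own = dest.name in user.allowed_resources
        local = dest.region == user.region
        if own:
            return 0 if local else 1
        if dest.shared:
            return 2 if local else 3
        return None  # not usable by this user

    usable = [(rank(d), d) for d in destinations if rank(d) is not None]
    if not usable:
        raise LookupError(f"no destination available for {user.name}")
    return min(usable, key=lambda pair: pair[0])[1]


# Hypothetical users and resources.
destinations = [
    Destination("shared_pool", "Queensland", shared=True),
    Destination("lab_cluster", "Melbourne"),
]
alice = User("alice", "Melbourne", {"lab_cluster"})
bob = User("bob", "Sydney")

print(choose_destination(alice, destinations).name)
print(choose_destination(bob, destinations).name)
```

A user with their own resources is routed to them first, while everyone else falls back to the shared pool, which matches the slide's stated goal of letting users and groups add compute without changing the experience for others.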