Slides from Simon Gladman for the 14 Jun 2018 EMBL-ABR Webinar about Galaxy Australia.
Simon, a Galaxy Systems Architect and tools expert based at Melbourne Bioinformatics, presented on how Galaxy Australia is becoming better integrated and aligned with the two other large public Galaxy services (in the USA and Europe).
A recording of the webinar is available here: https://youtu.be/5PBOoBo_ySM
2. Public Galaxy servers - In the beginning..
usegalaxy.org - Galaxy Main
● Run out of many institutions in the USA
○ Penn State, Johns Hopkins, Oregon Health and Science Uni.
○ TACC, JetStream, CyVerse, Massive
● Many users
● Constantly large queue
3. Public Galaxy servers - In Australia..
● Galaxy-QLD
○ Set up for QLD but globally public
● Galaxy-Mel
○ Set up for VIC but globally public
● Galaxy-Tut
○ For workshops
● And many others!
4. Public Galaxy servers - In Australia..
Galaxy-QLD
+ Galaxy-Mel
+ Galaxy-Tut
=
Galaxy-Australia
usegalaxy.org.au
5. Public Galaxy servers - And the list grew..
● Different versions
● Different tools
● Different references
● User confusion
● Lots of duplication of effort
6. So we got together..
● At Galaxy Australasia Meeting 2017
○ Key people from the US, Europe and Australia, including the Galaxy PI
○ Agreed to work together to support publicly available Galaxy servers
○ Do as much together as possible
And so usegalaxy.* was born
7. What is usegalaxy.*?
Group of public Galaxy servers
● Present a similar experience to users no matter which they use
● Guarantee a minimum service
○ Tools & versions
○ Reference Data
○ Reproducibility
○ Training materials
● Starting with USA, Europe and Australia, more welcome!
● Manage with community assets/repositories
● Don’t prescribe hardware resources
[Diagram labels: usegalaxy.* servers; Community assets]
8. usegalaxy.* - Tools
Shared repository of tool lists
● Minimum tools
○ Genomics and others
● Extra tools
○ Metagenomics
○ Proteomics
○ Metabolomics
● Curation/maintenance of tools and versions
● Automatic upgrades and installation of trusted tools
● Still allow local specialisations
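The shared tool-list idea on this slide can be sketched as set arithmetic. This is only an illustration, assuming a tool is identified by a (name, version) pair; the tool names, versions and function names below are hypothetical placeholders and are not taken from the actual usegalaxy.* repository.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation of one tool: (name, version).
Tool = Tuple[str, str]


def build_install_list(
    minimum: Set[Tool],
    extras: Dict[str, Set[Tool]],
    chosen_extras: List[str],
    local: Set[Tool],
) -> List[Tool]:
    """Combine the shared minimum list, selected extra domains and local tools."""
    selected = set(minimum)
    for domain in chosen_extras:
        selected |= extras[domain]
    selected |= local  # local specialisations are still allowed
    return sorted(selected)


def missing_minimum(installed: Set[Tool], minimum: Set[Tool]) -> List[Tool]:
    """Return the minimum tools a server does not yet provide."""
    return sorted(minimum - installed)


# Placeholder data, for illustration only.
minimum = {("tool_a", "1.0"), ("tool_b", "2.1")}
extras = {
    "metagenomics": {("tool_meta", "0.3")},
    "proteomics": {("tool_prot", "4.0")},
}
local = {("tool_local", "0.1")}

install_list = build_install_list(minimum, extras, ["metagenomics"], local)
gaps = missing_minimum({("tool_a", "1.0")}, minimum)
```

Here a server takes the whole minimum list, opts in to the metagenomics extras and keeps one local tool, while `missing_minimum` reports what a partially provisioned server would still need to install.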
9. usegalaxy.* - Reference Data/Indices
[Diagram: CVM-FS Tier 0 (Reference Data/Indices); two CVM-FS Tier 1 (Data/Indices Cache) servers; usegalaxy.eu, usegalaxy.org.au, usegalaxy.org]
Run by usegalaxy.org staff
● Genomic references
● Tool indices
Tier 1 uses smart caching
● On demand
● Local specialisation
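The on-demand caching behaviour described for Tier 1 can be illustrated with a small in-memory model. This is a sketch only: the class names, paths and file contents are hypothetical, plain dictionaries stand in for the real servers, and nothing here uses the actual CernVM-FS software or its API.

```python
from typing import Dict, Optional


class Tier0:
    """Stand-in for the central reference data store (a dict, not a real server)."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = files
        self.reads = 0  # counts how often the central store is contacted

    def fetch(self, path: str) -> bytes:
        self.reads += 1
        return self.files[path]


class Tier1Cache:
    """Regional cache: fetches from Tier 0 on first request, then serves locally."""

    def __init__(self, upstream: Tier0, local: Optional[Dict[str, bytes]] = None) -> None:
        self.upstream = upstream
        self.local = dict(local or {})  # locally specialised data, never fetched upstream
        self.cache: Dict[str, bytes] = {}

    def get(self, path: str) -> bytes:
        if path in self.local:
            return self.local[path]
        if path not in self.cache:  # on demand: only fetch what is actually requested
            self.cache[path] = self.upstream.fetch(path)
        return self.cache[path]


# Placeholder paths and contents, for illustration only.
tier0 = Tier0({"genomes/example/seq.fa": b">chr1\nACGT\n"})
tier1 = Tier1Cache(tier0, local={"local/custom.idx": b"local index"})

first = tier1.get("genomes/example/seq.fa")   # fetched from Tier 0
second = tier1.get("genomes/example/seq.fa")  # served from the Tier 1 cache
```

After both requests `tier0.reads` is 1, which is the point of the design: the central store is contacted once per file per cache, and locally added data never touches it.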
10. usegalaxy.* - Look and Feel
usegalaxy.* servers should:
● Run the latest stable Galaxy release
● Present the user with similar tool list layout
● Be able to run all of the Galaxy Training Network’s core tutorials
● Have same testing/training datasets available
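One of the criteria above, being able to run the core tutorials, could be checked by comparing each tutorial's required tools with what a server has installed. The following is a minimal sketch under that assumption; the tutorial names, tool names and the function are hypothetical and do not reflect how the Galaxy Training Network actually records tool requirements.

```python
from typing import Dict, List, Set


def missing_tools_per_tutorial(
    installed: Set[str], tutorials: Dict[str, Set[str]]
) -> Dict[str, List[str]]:
    """For each tutorial, list the required tools that the server lacks."""
    return {
        name: sorted(required - installed)
        for name, required in tutorials.items()
    }


# Placeholder data, for illustration only.
installed = {"tool_a", "tool_b"}
tutorials = {
    "tutorial_1": {"tool_a"},
    "tutorial_2": {"tool_a", "tool_c"},
}

report = missing_tools_per_tutorial(installed, tutorials)
runnable = sorted(name for name, missing in report.items() if not missing)
```

A tutorial with an empty list in `report` can be run as is; any other entry names exactly which tools the server would need to add.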
11. usegalaxy.* - Future - Tools
Global repository of Tools in containers
● Use CVMFS for smart distribution similar to References
● Singularity containers
● Galaxy just uses appropriate container
● Much easier to manage tool lists and versioning
[Diagram labels: CernVM-FS; Singularity containers]
12. usegalaxy.* - Future - References
● Move to a community based model
● Improve metadata availability
● Improve reference data provenance
[Diagram: CVM-FS Tier 0 (Reference Data/Indices); two CVM-FS Tier 1 (Data/Indices Cache) servers; usegalaxy.eu, usegalaxy.org.au, usegalaxy.org]
13. usegalaxy.* - Proposed Global Architecture
Community Managed, Globally Distributed
● Reference Data/Indices
● Tool Containers
[Diagram labels: usegalaxy.eu, usegalaxy.org.au, usegalaxy.org]
14. usegalaxy.* - Future - BYOC
Trialling BYOC (Bring Your Own Compute) now in Australia and Europe
● Adding Melbourne/Sydney/Other resources to Galaxy Australia
● Adding Czech/Dutch and other resources to Galaxy Europe
The goal is to allow a user or group to add their own compute/storage resources
Dynamic job allocation will know
● Where user is from
● What resources they can use
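The two inputs named above, where the user is from and which resources they may use, suggest a simple selection rule. The sketch below is one possible reading, not Galaxy's actual job scheduling code: the data classes, the `choose_resource` function, the resource names and the "prefer same region, then most free slots" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Resource:
    name: str
    region: str
    free_slots: int


@dataclass
class User:
    name: str
    region: str
    allowed: Set[str] = field(default_factory=set)  # names of resources the user may use


def choose_resource(user: User, resources: List[Resource], default: str) -> str:
    """Pick a destination for a job, falling back to the shared default pool."""
    candidates = [
        r for r in resources if r.name in user.allowed and r.free_slots > 0
    ]
    if not candidates:
        return default
    # Assumed policy: prefer a resource in the user's region, then the most free slots.
    best = max(candidates, key=lambda r: (r.region == user.region, r.free_slots))
    return best.name


# Placeholder resources and users, for illustration only.
resources = [
    Resource("melbourne_cluster", "melbourne", free_slots=4),
    Resource("sydney_cluster", "sydney", free_slots=10),
]
alice = User("alice", "melbourne", {"melbourne_cluster", "sydney_cluster"})
bob = User("bob", "brisbane")

alice_destination = choose_resource(alice, resources, default="shared_pool")
bob_destination = choose_resource(bob, resources, default="shared_pool")
```

In this example alice's job goes to her local cluster even though the other one has more free slots, and bob, who has not added any resources, stays on the shared pool.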
16. usegalaxy.* - Ultimate goal - Single Sign-on
The ultimate goal!
● Sign on in Global Galaxy WebServer!
● Job is run wherever your compute and data are located!
[Diagram: seven nodes, each labelled "Compute"]
17. It’s a big effort..
Gareth Price
Simon Gladman
Derek Benson
Anna Syme
Igor Makunin
Nuwan Goonasekera
Christina Hall
Helen van der Pol
Andrew Lonie
Björn Grüning
Helena Rasche
Bérénice Batut
Anika Erxleben
Torsten Houwaart
Joachim Wolff
Mehmet Tekman
Rolf Backofen
& Other Elixir Members
Nate Coraor
Enis Afgan
Anton Nekrutenko
James Taylor
John Chilton
Martin Čech
Dave Bouvier &
The Galaxy Project Team &
Galaxy Project Community
Editor's Notes
G’day everyone, my name is Simon Gladman and I’m going to talk to you about something called usegalaxy.*, an initiative of the Galaxy community to provide publicly accessible bioinformatics analysis for everybody.
There’s been a long history of public Galaxy servers. The first one was run by the Galaxy project itself. It runs out of the USA and is usually referred to as Galaxy Main.
It has many users, is backed by a very large set of compute resources and is constantly in high demand.