Desktop as a Service supporting
Environmental ‘omics
Dr David Wallom
University of Oxford
On behalf of the EOSCloud team
Biolinux: A scalable solution
• Comprehensive, free bioinformatics workstation based on
Ubuntu Linux
• 10 years & 8 major releases
• 200+ bioinformatics packages, including big integrative tools:
QIIME, Galaxy Server, PredictProtein, EMBOSS, ...
• Incorporates all software
• Dual Boot, Linux Live, Local Servers, Cloud
Why Cloud?
[Figure: Big e-Infrastructure and the Biologist]
• Tools such as Biolinux are community enablers
• Data sets can be too big or restricted to easily move
– move the compute to the data
– Researcher work patterns are maintained
• More efficient use of shared resources
• Central maintenance of infrastructure
• Lower barrier to entry (compared to traditional HPC
and Grid)
JASMIN Infrastructure at RAL
• 10/12/14
Thanks to Philip Kershaw
EOSCloud
• A NERC Big Data capital project
• Organisationally a tenancy in the Unmanaged Cloud
• Interfaces based on JASMIN IaaS software platform
• ‘Users’ or VMAdmin are registered JASMIN users
– Each receives two VMs
• Biolinux
• Ubuntu Docker hosting environment
– Total responsibility for instantiated system
– Accessible through standard remote desktop tools
• Utilising a single scale of resources would be a waste
– Can we scale the users' virtual services to take into account demand?
Boosting Resource Capabilities
• Users' VMs operate in their native state, ‘Standard’
– Enough capability to access stored data
– Configure applications and workflows
– Free
• A user may boost their running VM to increased capability
– Enough to run installed Biolinux analysis applications on a useful timescale
– Credit consumption only for Boosted instances
• Reference datasets available to users through shared storage
Name       # Cores  Memory (GB)  Cost (Credits/hour)
Standard   1        16           0
Standard+  2        40           1
Big        8        140          4
Max        16       500          8
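
The boost levels above can be sketched as a small lookup, to show how credit consumption might be estimated and how the cheapest sufficient level might be chosen. This is only an illustration: the names `Flavour`, `FLAVOURS`, `boost_cost` and `cheapest_flavour` are hypothetical and not part of EOSCloud or JASMIN, and it assumes credits are charged linearly per hour, which the slides do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flavour:
    """One row of the boost-level table (hypothetical type name)."""
    name: str
    cores: int
    memory_gb: int
    credits_per_hour: int


# Values copied from the table in the slide.
FLAVOURS = [
    Flavour("Standard", 1, 16, 0),
    Flavour("Standard+", 2, 40, 1),
    Flavour("Big", 8, 140, 4),
    Flavour("Max", 16, 500, 8),
]


def boost_cost(flavour_name: str, hours: int) -> int:
    """Credits consumed by running at a level for `hours`.

    Assumption: simple linear billing (rate x hours); the real
    accounting rules are not described in the source.
    """
    by_name = {f.name: f for f in FLAVOURS}
    if flavour_name not in by_name:
        raise ValueError(f"unknown boost level: {flavour_name}")
    return by_name[flavour_name].credits_per_hour * hours


def cheapest_flavour(min_cores: int, min_memory_gb: int) -> Flavour:
    """Lowest-cost level that meets both the core and memory needs."""
    candidates = [
        f for f in FLAVOURS
        if f.cores >= min_cores and f.memory_gb >= min_memory_gb
    ]
    if not candidates:
        raise ValueError("no boost level satisfies the request")
    return min(candidates, key=lambda f: f.credits_per_hour)


if __name__ == "__main__":
    needed = cheapest_flavour(min_cores=4, min_memory_gb=100)
    print(needed.name, boost_cost(needed.name, hours=3))
```

Under these assumptions, a job needing 4 cores and 100 GB of memory would select the Big level, and time spent at Standard never consumes credits.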
Desktop as a Service for research
• Giving researchers an environment they are confident in by
changing the infrastructure around them
• Creating a new user facility for NERC research communities
• Utilise generic cloud capabilities
– Package would operate on any IaaS Cloud
– Currently using native VMware interfaces
– Not using platform-specific capabilities, so no impediment to standards
adoption
• Launch for pilot user communities 31st Mar 2015
– Testing and documentation in final days
– Engaging pilot user communities (e.g. OSD)
• Investigating other key usage models such as teaching or online
learning
THANK YOU & QUESTIONS
