Interoperability and scalability with microservices
Ola Spjuth <ola.spjuth@farmbio.uu.se>
Science for Life Laboratory
Uppsala University
Service-Oriented Architectures (SOA) in the life sciences
• Standardize
– Agree on interfaces, data formats, protocols, etc.
• Decompose and compartmentalize
– Experts (scientists) should provide
services – do one thing and do it well
– Achieve interoperability by exposing
data and tools as Web services
• Integrate
– Users should access and integrate
remote services
[Figure: one scientist exposes a service through an API; another scientist consumes it]
Service-Oriented Architectures (SOA) in the life sciences, ~2005
[Figure: scientists consuming services through APIs, annotated with failure modes: downtime, API changed, not maintained]
Difficult to sustain, unreliable solutions
Cloud Computing
• Cloud computing offers advantages over contemporary e-infrastructures in the life sciences
– On-demand elastic resources and services
– No up-front costs, pay-per-use
• Many businesses (and much software development) are moving into the cloud
– Vibrant ecosystem of frameworks and tools, including for big data
• High potential for science
Virtual Machines and Containers
Virtual machines
• Package entire systems (heavy)
• Completely isolated
• Suitable in cloud environments
Containers
• Share the host OS kernel
• Smaller, faster, portable
• Docker!
Microservices
• Similar to Web services: decompose functionality into smaller, loosely coupled services communicating via APIs
– “Do one thing and do it well”
• Preferably smaller, light-weight and fast to instantiate on demand
• Easy to replace, language-agnostic
– Suitable for loosely coupled teams (which we have in science)
– Portable - easy to deploy and scale
– Maximize agility for developers
• Suitable to deploy as containers in cloud environments
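As an illustration of "do one thing and do it well", the following is a minimal sketch of a microservice, using only the Python standard library. Everything in it is an assumption made for the example and is not taken from the slides: the GC-content task, the `/gc` endpoint, and the names `gc_content`, `GCHandler`, `start_service` and `query`. The service computes a single value and exposes it over HTTP as JSON, so a client written in any language could consume it.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse
from urllib.request import ProxyHandler, build_opener


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


class GCHandler(BaseHTTPRequestHandler):
    """Hypothetical API: GET /gc?seq=ACGT returns a small JSON document."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/gc":
            self.send_error(404, "unknown endpoint")
            return
        seq = parse_qs(url.query).get("seq", [""])[0]
        body = json.dumps({"length": len(seq), "gc": gc_content(seq)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def start_service(port: int = 0) -> ThreadingHTTPServer:
    """Start the service in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), GCHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(server: ThreadingHTTPServer, seq: str) -> dict:
    """Consume the service over HTTP, as any client could."""
    host, port = server.server_address[:2]
    # An empty ProxyHandler bypasses any configured proxy for this local call.
    opener = build_opener(ProxyHandler({}))
    with opener.open(f"http://{host}:{port}/gc?seq={seq}") as response:
        return json.load(response)
```

Calling `server = start_service()` followed by `query(server, "GGCCAT")` returns a dictionary with the sequence length and GC fraction; `server.shutdown()` stops the service. Because the service keeps no state, several copies can run side by side, which is what makes this style easy to replace and scale.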
Scaling microservices
http://martinfowler.com/articles/microservices.html
Shipping containers?
Orchestrating containers
Kubernetes: Orchestrating containers
• Declarative configuration for launching containers
• Start, stop, update, and manage a cluster of machines running containers in a consistent and maintainable way
• Origin: Google
• Suitable for microservices
[Figure: containers scheduled and packed onto nodes]
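To make "declarative" concrete, here is a minimal sketch that builds a Kubernetes Deployment manifest as plain Python data. It assumes a cluster that serves the `apps/v1` Deployment API; the service name, the image `example.org/gc-service:0.1`, the port, and the helper names `deployment_manifest` and `scaled` are hypothetical and chosen only for illustration. The manifest states the desired number of replicas, and Kubernetes is responsible for scheduling containers onto nodes until the cluster matches that state.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int, port: int) -> dict:
    """Describe the desired state: `replicas` copies of one container image."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def scaled(manifest: dict, replicas: int) -> dict:
    """Return a copy of the manifest with a new desired replica count."""
    updated = json.loads(json.dumps(manifest))  # deep copy via JSON round trip
    updated["spec"]["replicas"] = replicas
    return updated
```

Writing `json.dumps(deployment_manifest("gc-service", "example.org/gc-service:0.1", 2, 8080), indent=2)` to a file such as `deployment.json` gives a document that `kubectl apply -f deployment.json` accepts, since Kubernetes reads JSON as well as YAML. Scaling is then a matter of changing one number and applying the manifest again, not of starting containers by hand.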
Virtual Research Environment (VRE)
• Virtual (online) environments for research
– Easy and user-friendly access to computational resources, tools and
data, commonly for a scientific domain
• Multi-tenant VRE – log into shared system
• Private VRE
– Deploy on your favorite cloud provider
PhenoMeNal
• Horizon 2020 project, €8 M, 2015-2018
– “standardized e-infrastructure for the processing, analysis and information-mining of the massive amount of medical molecular phenotyping and genotyping data generated by metabolomics applications.”
• Enable users to provision their own virtual infrastructure (VRE)
– Public cloud, private cloud, local servers
– Easy access to compatible tools exposed as microservices
– Sets up and configures a complete data center in minutes (compute nodes, storage, networks, DNS, firewall, etc.)
– Can achieve high availability, scalability and fault tolerance
• Use modern and established tools and frameworks supported by industry
– Reduce risk and improve sustainability
• Offer an agile and scalable environment to use, and a straightforward platform to extend
http://phenomenal-h2020.eu/
Users should not see this…
Deployment and user access
Launch on reference installation
Launch on public cloud
Private VRE
In-house deployment scenarios
MRC-NIHR Phenome Centre
• Medium-sized IT infrastructure
• Dedicated IT personnel
• Users: ICL staff
Hospital environment
• Dedicated server
• No IT personnel
• User: Clinical researcher
Private VRE
Development: Container lifecycle
[Figure labels: Source code repositories; PhenoMeNal Jenkins (build and test tools, images, infrastructure); PhenoMeNal Container Hub; Docker Hub]
Two proofs of concept so far
Kultima group Pablo Moreno
Implications
• Improve sustainability
– Not dependent on specific data centers
• Improve reliability and security
– Users can run their own service environments (VREs) within isolated
environments
– High-availability and fault tolerance
• Scalability
– Deploy in elastic environments
• Agile development
– Automate “from develop to deploy”
• Agile science
– Simple access to discoverable, scalable tools on elastic compute
resources with no up-front costs
• NB: Many problems of interoperability remain!
– Data
– APIs
– etc.
Ongoing research on VREs
• Data federation
• Compute federation
• Privacy preservation
• Workflows
• Big Data frameworks
• Data management and modeling
Acknowledgements
Pharmb.io
Wesley Schaal
Maris Lapins
Jonathan Alvarsson
Arvid Berg
Samuel Lampa
Marco Capuccini
Martin Dahlö
Valentin Georgiev
Anders Larsson
Polina Georgiev
Staffan Arvidsson
Laeeq Ahmed
AstraZeneca
Lars Carlsson
Ernst Ahlberg-Helgee
University Vienna
David Kreil
Maciej Kańduła
SNIC Cloud
Salman Toor
Andreas Hellander
Caramba.clinic
Kim Kultima
Stephanie Herman
Payam Emami
ToxHQ team


Editor's Notes

  • #4 Idea with SOA (~2005): Achieve interoperability by exposing data and functionality as Web services. Experts (scientists) should set up and host their own Web services. Users should integrate a multitude of distributed services, connect them into workflows (e.g. Taverna), and share (parts of) workflows. What happened? Users could not rely on Web services (downtime, API changes, abandoned services), and the services could not be mirrored. Workflows never gained widespread popularity. Today, stable Web services mainly remain at large data and tool providers (EBI, NCBI, etc.).
  • #6 Drop applications into VMs running Docker in different clouds.