
Smuggling Multi-Cloud Support into Cloud-native Applications using Elastic Container Platforms

Elastic container platforms (such as Kubernetes, Docker Swarm, and Apache Mesos) fit very well with existing cloud-native application architecture approaches. It is therefore surprising that these existing, open-source platforms are not considered more consistently for multi-cloud approaches. Elastic container platforms provide inherent multi-cloud support that can be easily accessed. We propose a control process that is able to scale (and, as a side effect, migrate) elastic container platforms across different public and private cloud service providers. This control loop can be used in the execution phase of self-adaptive auto-scaling MAPE loops (monitoring, analysis, planning, execution). Additionally, we present several lessons learned from our prototype implementation that might be of general interest to researchers and practitioners. For instance, describing only the intended state of an elastic platform and letting a single control process reach this state is far less complex than defining many specific multi-cloud-aware workflows to deploy, migrate, terminate, scale up, and scale down elastic platforms or applications.


  1. Smuggling Multi-Cloud Support into Cloud-native Applications using Elastic Container Platforms. Prof. Dr. rer. nat. Nane Kratzke, Computer Science and Business Information Systems
  2. The next 30 minutes are about ... • What are Cloud-native Applications? • Elastic Container Platforms and why they should be considered for multi-cloud research • A control loop to scale Elastic Container Platforms across Cloud Service Providers • Some data from our evaluation • 7 Lessons Learned and Conclusion
  3. Cloud Application Maturity Model (CAMM): maturity criteria by level. Level 3, Cloud Native: • Application can dynamically migrate across infrastructure providers without interruption of service. • Application can elastically scale out/in appropriately based on stimuli. Level 2, Cloud Resilient: • Services are stateless. • Application is unaware of and unaffected by failure of dependent services. • Application is infrastructure agnostic and can run anywhere. Level 1, Cloud Friendly: • Application is composed of loosely coupled services. • Application services are discoverable by name. • Application deployment units are designed according to cloud patterns (e.g. 12-factor app principles). • Application compute and storage are separated. • Application consumes one or more cloud services: compute, storage, network. Level 0, Cloud Ready: • Application runs on virtualized infrastructure. • Application can be instantiated from an image or script. According to the Open Data Center Alliance Best Practices (Architecting Cloud-Aware Applications), 2014, with add-ons by practitioner Mario-Leander Reimer (QAware). Slide annotations: "Covered by a lot of SOA and cloud deployment approaches." and "This contribution's focus ..."
  4. Research Surveillance of Practitioners. Docker Swarm: Swarm Mode (since Docker 1.12) "copies" the idea of Kubernetes-like control processes but integrates them in just one component. Secure by default (control and data plane). Hides operational complexity. Google: Control processes that continuously drive the current state of container-based applications towards an intended state. Makes Google's experience of running large-scale production workloads available as open source (especially from the Google-internal Borg system). Mesosphere: Apache Mesos based datacenter operating system for fine-grained resource allocation. Frameworks to operate containers and data services. Datacenter focused. Mesos has successfully operated large-scale datacenters for years (Twitter, Netflix, ...). Practitioners ask for simple solutions (elastic platforms) ...
  5. The very basic idea ... Operate application on current provider. Scale cluster into prospective provider. Shut down nodes on current provider. Cluster reschedules lost containers. Migration finished. Avoiding Vendor Lock-In: • Make use of elastic container platforms to operate elastic services that are deployable to any IaaS cloud infrastructure. • These services could then be transferred from one private or public cloud infrastructure to another at runtime. Quint, P.-C., & Kratzke, N. (2016). Overcome Vendor Lock-In by Integrating Already Available Container Technologies - Towards Transferability in Cloud Computing for SMEs. In Proceedings of CLOUD COMPUTING 2016 (7th International Conference on Cloud Computing, GRIDS and Virtualization).
  6. But the idea provides more options ... Simply stop "a transfer" somewhere in between and you get ...
  7. One Control Loop for All. Operate application on current provider. Scale cluster into prospective provider. Shut down nodes on current provider. Cluster reschedules lost containers. Migration finished.
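The idea that deployment, migration, and termination are all handled by the same loop can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `reconcile`, the representation of a cluster as node counts per district, and the policy of changing one node per step are assumptions made for illustration.

```python
from collections import Counter


def reconcile(current, intended):
    """Drive node counts per district towards the intended counts.

    Hypothetical helper: one node is created or deleted per iteration,
    and missing nodes are created before surplus nodes are deleted.
    """
    state = Counter(current)
    steps = []
    while True:
        missing = Counter(intended) - state   # districts that still need nodes
        surplus = state - Counter(intended)   # districts with nodes to remove
        if missing:
            district = next(iter(missing))
            state[district] += 1
            steps.append(f"create node in {district}")
        elif surplus:
            district = next(iter(surplus))
            state[district] -= 1
            steps.append(f"delete node in {district}")
        else:
            return +state, steps  # unary plus drops zero counts


# The same loop deploys, migrates, spreads, and terminates a cluster.
deployed, _ = reconcile({}, {"gce-europe": 10})
migrated, migration_steps = reconcile(deployed, {"aws-europe": 10})
multi_cloud, _ = reconcile(deployed, {"gce-europe": 5, "aws-europe": 5})
terminated, _ = reconcile(migrated, {})
```

Only the intended state differs between the four calls. Stopping a migration halfway corresponds to the `multi_cloud` case, where the cluster spans both providers.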
  8. Control Loop Example to deploy a cluster. Definition of an intended state: { "type": "cluster", "platform": "Swarm", "deployments": [ { "district": "gce-europe", "flavor": "small", "role": "master", "quantity": 1 }, { "district": "gce-europe", "flavor": "small", "role": "worker", "quantity": 9 }, { "district": "aws-europe", "flavor": "small", "role": "worker", "quantity": 0 } ] }
  9. Control Loop Example to deploy a cluster. Derive a prioritized action list (|| executed in parallel, -- executed sequentially): || Create secgroup for gce-europe -- Create master in gce-europe || Create worker in gce-europe || Create worker in gce-europe || Create worker in gce-europe || Create worker in gce-europe || Create worker in gce-europe || Create worker in gce-europe || Create worker in gce-europe || Create worker in gce-europe || Create worker in gce-europe
  10. Control Loop Example to deploy a cluster. Updated resources: - Secgroup for gce-europe - Master node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe. All detail data such as IP addresses, identifiers, etc. are omitted for better readability.
  11. Control Loop Example: Transfer of five worker nodes. Current resources: - Secgroup for gce-europe - Master node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe. Intended state, with the worker quantities changed from 9 to 4 (gce-europe) and from 0 to 5 (aws-europe): { "type": "cluster", "platform": "Swarm", "deployments": [ { "district": "gce-europe", "flavor": "small", "role": "master", "quantity": 1 }, { "district": "gce-europe", "flavor": "small", "role": "worker", "quantity": 4 }, { "district": "aws-europe", "flavor": "small", "role": "worker", "quantity": 5 } ] }. Derived action list (|| executed in parallel, -- executed sequentially): || Create secgroup for aws-europe || Create worker in aws-europe || Create worker in aws-europe || Create worker in aws-europe || Create worker in aws-europe || Create worker in aws-europe -- Delete worker in gce-europe -- Delete worker in gce-europe -- Delete worker in gce-europe -- Delete worker in gce-europe -- Delete worker in gce-europe. Resource list shown alongside the actions: - Secgroup for gce-europe - Secgroup for aws-europe - Master node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in gce-europe - Worker node in aws-europe - Worker node in aws-europe - Worker node in aws-europe - Worker node in aws-europe - Worker node in aws-europe
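The step from an intended state to a prioritized action list, as walked through in the deployment and transfer examples, might be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's planner: the function names `plan_actions` and `cluster`, the `(district, role)` keys for the current resources, and the ordering rules (security groups and workers in parallel, masters and deletions sequentially, creations before deletions) are inferred from the two slide examples only.

```python
def plan_actions(intended, current=None):
    """Derive a prioritized action list from an intended state description.

    `current` maps (district, role) to the number of running nodes; it is a
    hypothetical stand-in for the resource list shown on the slides.
    """
    current = current or {}
    districts_in_use = {d for (d, _), count in current.items() if count > 0}
    secgroups, masters, workers, deletions = [], [], [], []
    for dep in intended["deployments"]:
        district, role = dep["district"], dep["role"]
        delta = dep["quantity"] - current.get((district, role), 0)
        if delta > 0:
            secgroup = ("||", f"Create secgroup for {district}")
            if district not in districts_in_use and secgroup not in secgroups:
                secgroups.append(secgroup)
            if role == "master":
                masters += [("--", f"Create master in {district}")] * delta
            else:
                workers += [("||", f"Create {role} in {district}")] * delta
        elif delta < 0:
            deletions += [("--", f"Delete {role} in {district}")] * -delta
    return secgroups + masters + workers + deletions


def cluster(gce_workers, aws_workers):
    """Build an intended state in the JSON shape used on the slides."""
    return {"type": "cluster", "platform": "Swarm", "deployments": [
        {"district": "gce-europe", "flavor": "small", "role": "master", "quantity": 1},
        {"district": "gce-europe", "flavor": "small", "role": "worker", "quantity": gce_workers},
        {"district": "aws-europe", "flavor": "small", "role": "worker", "quantity": aws_workers},
    ]}


# Initial deployment: nothing is running yet.
deploy_plan = plan_actions(cluster(9, 0))

# Transfer of five workers: only the quantities in the intended state change.
running = {("gce-europe", "master"): 1, ("gce-europe", "worker"): 9}
transfer_plan = plan_actions(cluster(4, 5), running)
for mode, action in transfer_plan:
    print(mode, action)
```

The sketch ignores running nodes that the intended state does not mention; a fuller planner would have to decide whether to delete them.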
  12. Resulting Architecture (Domain Model). Extension point for elastic platforms, currently supported: Kubernetes, Swarm. Extension point for IaaS infrastructures, currently supported: AWS, GCE, Azure, OpenStack.
  13. Evaluation: 5 Experiments (with a cluster of 1 master and 9 workers). Providers: OpenStack, Google Compute Engine (GCE, n1-standard-2), Elastic Compute Cloud (EC2, m3.large). E1: Launch a 10 node cluster. E2: Terminate a 10 node cluster. E3: Transfer one node of the cluster. E4: Transfer 5 nodes of the cluster. E5: Transfer all nodes of the cluster. The same experiments have been done with OpenStack as well. The cluster was Docker Swarm (it operated a Sock Shop reference application and a Redis-based Guestbook). Different elastic container platforms (Kubernetes, Docker Swarm) had no significant impact on the runtimes. Therefore, data is only presented for Docker Swarm.
  14. Evaluation (Single Cloud): Deploying and terminating clusters. Experiment E1. Experiment E2. 10 times longer ???
  15. Evaluation (Multi-Cloud): Transfer GCE ⇠⇢ AWS. Experiment E3. Experiment E4. Experiment E5. Comparable with a shutdown. Node termination times seem to dominate the transfer times massively.
  16. Why these (dramatic) differences? Our analysis showed: 1. The GCE API works synchronously (a node termination call blocks until termination is completed). 2. The AWS API works asynchronously (a node termination call does not block until termination is completed; fire and forget). 3. GCE SDN-related processing times are far longer than AWS SDN-related processing times.
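The effect of a blocking versus a fire-and-forget termination call on measured runtimes can be illustrated with a small simulation. This is a sketch under stated assumptions: no real GCE or AWS SDK is called, the two `terminate_*` functions are hypothetical stand-ins, and the 0.05 second termination time is made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

TERMINATION_SECONDS = 0.05  # made-up duration of one node termination


def terminate_blocking():
    """Simulated synchronous API: returns only after the node is gone."""
    time.sleep(TERMINATION_SECONDS)


def measure(call, nodes):
    """Time how long a sequential series of termination calls takes."""
    start = time.perf_counter()
    for _ in range(nodes):
        call()
    return time.perf_counter() - start


with ThreadPoolExecutor(max_workers=5) as provider:
    def terminate_fire_and_forget():
        """Simulated asynchronous API: termination continues in the background."""
        provider.submit(time.sleep, TERMINATION_SECONDS)

    blocking_time = measure(terminate_blocking, nodes=5)
    fire_and_forget_time = measure(terminate_fire_and_forget, nodes=5)

print(f"blocking: {blocking_time:.3f}s, fire and forget: {fire_and_forget_time:.3f}s")
```

With sequential deletions, the blocking variant accumulates the full termination time of every node, while the fire-and-forget variant only measures how long it takes to issue the calls.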
  18. Conclusion • Elastic container platforms provide often overlooked multi-cloud opportunities. • We successfully demonstrated multi-cloud transfers between AWS, GCE, Azure, and OpenStack using a simple control loop (scaling Kubernetes and Docker Swarm Mode). • The control loop is designed so that it can be integrated into a MAPE loop as the execution phase. • A cybernetic understanding (intended state vs. current state) makes a lot of multi-cloud workflows easier. • On the downside: the solution is limited to container-based applications (CNMM Level 3) and services (but that seems to be becoming a dominant architectural style). • New research opportunities and future research directions: • Making the solution available as open source • P2P-based elastic platforms would make deployments even easier (no worker/master roles) • There is room for improvement (e.g. resource-efficient action planning)
  19. Acknowledgement. This research is funded by the German Federal Ministry of Education and Research (03FH021PX4). I would like to thank Peter Quint, Christian Stüben, and Arne Salveter for their hard work and their contributions to the project Cloud TRANSIT. Picture Reference: • Elastic Straps: Pixabay (CC0 Public Domain, PublicDomainPictures) • Definition: Pixabay (CC0 Public Domain, PDPics) • Class room: Pixabay (CC0 Public Domain, Unsplash) • Railway: Pixabay (CC0 Public Domain, Fotoworkshop4You) • Air Transport: Pixabay (CC0 Public Domain, WikiImages)
  20. About Nane Kratzke • CoSA: http://cosa.fh-luebeck.de/en/contact/people/n-kratzke • Blog: http://www.nkode.io • Twitter: @NaneKratzke • GooglePlus: +NaneKratzke • LinkedIn: https://de.linkedin.com/in/nanekratzke • GitHub: https://github.com/nkratzke • ResearchGate: https://www.researchgate.net/profile/Nane_Kratzke • SlideShare: http://de.slideshare.net/i21aneka
  21. Backup Slides
  22. Elastic Platforms and Multi-cloud Requirements. Several transferability, awareness, and security requirements come along with multi-cloud approaches. Existing elastic container platforms help to fulfill these requirements. Multi-cloud requirements and contributing platform concepts: • Transferability: integration of nodes into one logical cluster; designed for failure; cross-provider deployable. • Data location awareness: Pod concept (Kubernetes); volume orchestrator (Flocker for Docker). • Geolocation, pricing, legislation/policy, and local resources awareness: tagging of nodes with geolocation, pricing, policy, or on-premise information; platform schedulers have selectors (Swarm) / affinities (Kubernetes) / constraints (Mesos/Marathon) to evaluate these tags. • Security requirements: encrypted data / control plane (Swarm); encrypted overlay networks (e.g. Weave for Kubernetes).
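How a scheduler might evaluate node tags against placement constraints can be sketched generically. This is not the Swarm, Kubernetes, or Marathon API: the function `eligible_nodes`, the node names, and the tag keys (`provider`, `region`, `on_premise`) are hypothetical and only illustrate the tagging idea from the table.

```python
def eligible_nodes(nodes, constraints):
    """Return the names of nodes whose tags satisfy every constraint."""
    return [
        node["name"]
        for node in nodes
        if all(node["tags"].get(key) == value for key, value in constraints.items())
    ]


# Hypothetical cluster spanning two public clouds and one on-premise site.
NODES = [
    {"name": "node-1", "tags": {"provider": "gce", "region": "europe", "on_premise": False}},
    {"name": "node-2", "tags": {"provider": "aws", "region": "europe", "on_premise": False}},
    {"name": "node-3", "tags": {"provider": "openstack", "region": "europe", "on_premise": True}},
    {"name": "node-4", "tags": {"provider": "aws", "region": "us", "on_premise": False}},
]

# A data-protection policy could pin a service to European nodes.
print(eligible_nodes(NODES, {"region": "europe"}))
# A service using local resources could be pinned to on-premise nodes.
print(eligible_nodes(NODES, {"on_premise": True}))
```

Because the constraints refer to tags rather than to providers, the same placement rule keeps working when nodes are added in another cloud.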
  23. Cloud-native Application. Often heard from practitioners: "A cloud-native application is an application intentionally designed for the cloud." True, but helpful? What? Be IDEAL: • Isolated State • Distributed • Elastic • Automated management • Loosely coupled. Why? There is a need for ... • Speed (delivery) • Safety (fault tolerance, design for failure) • Scalability • Client diversity. How? Integrate ... • (Micro)service-oriented architectures ((M)SOA) • Use API-based collaboration • Consider cloud-focused pattern catalogues • Use self-service agile platforms. References: C. Fehling, F. Leymann, R. Retter, W. Schupeck, and P. Arbitter, Cloud Computing Patterns: Fundamentals to Design, Build, and Manage Cloud Applications. Springer, 2014. M. Stine, Migrating to Cloud-Native Application Architectures. O'Reilly, 2015. A. Balalaie, A. Heydarnoori, and P. Jamshidi, "Migrating to Cloud-Native Architectures Using Microservices", CloudWay 2015, Taormina, Italy. S. Newman, Building Microservices. O'Reilly, 2015.
  24. Cloud-native Application Definition. [KQ2017a] Kratzke, N., & Quint, P.-C. (2017). Understanding Cloud-native Applications after 10 Years of Cloud Computing - A Systematic Mapping Study. Journal of Systems and Software, 126 (April).
  25. We need some guidance ... ClouNS: Cloud-native Application Reference Model. [KP2016] Kratzke, N., & Peinl, R. (2016). ClouNS - a Cloud-Native Application Reference Model for Enterprise Architects. In 2016 IEEE 20th International Enterprise Distributed Object Computing Workshop (EDOCW) (pp. 1–10).
  26. Did you know? Cloud standards improved over the last 10 years. However, cloud standardization coverage decreased (in relation to all available services). [Chart: relation of services considered by CIMI, OCCI, CDMI, OVF, OCI, TOSCA to services not considered, 2006 to 2016, on a 0% to 100% axis; data labels per year: 2, 2, 2, 4, 6, 7, 7, 7, 7, 11, 11 and 1, 1, 2, 4, 7, 10, 14, 21, 26, 42, 44.] Analyzed using over 2300 official release notes of Amazon Web Services (AWS). Data for other providers such as Google, Azure, Rackspace, etc. is not presented. The basic conclusions for these providers are the same. [KQP+2016] Kratzke, N., Quint, P.-C., Palme, D., & Reimers, D. (2016). Project Cloud TRANSIT - Or to Simplify Cloud-native Application Provisioning for SMEs by Integrating Already Available Container Technologies. In V. Kantere & B. Koch (Eds.), European Project Space on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges.
  27. Research Methodology. Main focus of this contribution. CNA == Cloud-native Application
  28. Evaluation: Virtual Machine Type Selection. We searched for the most similar machine types of different public cloud service providers. The similarity indicator maps processing, memory, network, and disk I/O performance to just one similarity value (1 means identical, 0 means no similarity at all). [KQ2015] Kratzke, N., & Quint, P.-C. (2015). About Automatic Benchmarking of IaaS Cloud Service Providers for a World of Container Clusters. Journal of Cloud Computing Research, 1(1), 16–34.
  29. This reference model guides our research. • Developing a description language for cloud-native applications. • Developing a standardized way of deploying a clustered container runtime environment for cloud-native applications (deployment and operation conforming to CNMM Level 3). • Making use of commodity services of public cloud service providers only (IaaS).
  30. Research Surveillance of Practitioners. Practitioners often prefer layer-based reference models ... (figure labels: MEKUNS, Cloud Landscape Model). Jason Lavigne, "Don't let aPaaS you by - What is aPaaS and why Microsoft is excited about it", see https://atjasonunderscorelavigne.wordpress.com/2014/01/27/dont-let-apaas-you-by/ (accessed 4th August 2016). Johann den Haan, "Categorizing and Comparing the Cloud Landscape", see http://www.theenterprisearchitect.eu/blog/categorize-compare-cloud-vendors/ (accessed 4th August 2016). Josef Adersberger, Andreas Zitzelsberger, Mario-Leander Reimer, "Der Cloud-Native-Stack: Mesos, Kubernetes und Spring Cloud", see http://www.qaware.de/fileadmin/user_upload/QAware-Cloud-Native-Artikelserie-Java_Magazin-1.pdf (accessed 4th August 2016).
