Containerizing legacy applications


  1. 1. Containerising legacy applications with dynamic file-based configurations and secrets. Andrew Kirkpatrick, DevOps Days Toronto, May 2019
  2. 2. Question 1: How many of you have what you consider to be “legacy” applications at your place of work?
  3. 3. Question 2: For those who do have legacy applications, how many of you have no previous maintainers (committers) of that code still employed at your place of work?
  4. 4. Question 3: For those who have legacy applications you cannot (easily) maintain, how many of you could not (easily) redeploy those applications on brand new infrastructure in a disaster recovery scenario?
  5. 5. This presentation is not about ‣ How wonderful Kubernetes/Nomad/DCOS is (or isn’t…) ‣ What “DevOps” culture should look or feel like ‣ How you can proxy traffic halfway round the moon and produce pretty confusing distributed tracing graphs ‣ How you can observe even more metrics that you’ll probably never look at
  6. 6. This is more about the harsh reality that there is always legacy code…
  7. 7. So why would anyone want to migrate their application(s) into containers without rewriting code?
  8. 8. ‣ Too busy working on newer (micro)services ‣ Avoid touching the nightmare codebase that the long-gone Developer(s) that worked on it left behind ‣ Business wants to fast-track the cost-saving benefits of containerisation
  9. 9. Would you be able to even if you could? ‣ Availability of hireable Developers? ‣ Prohibitive cost of remaining “experts” ‣ Un-decipherable yet crucial business logic ‣ Maintenance/documentation of older libraries/packages/frameworks/tooling [Chart: Javascript, Java, PHP, Ruby, Node, C#, Python and C, by year from 2013 to 2019; axis from 0 to 70]
  10. 10. Assumptions ‣ You can’t change the code in your application(s) ‣ Familiarity ‣ Time ‣ Permissions/licensing ‣ You don’t want to change the code in your application(s) ‣ Bugs ‣ Liability ‣ You want to take the path of least resistance
  11. 11. Steps ‣ Let's find a legacy application ‣ Run your application in a container ‣ (re)deploy your application using templated static configuration ‣ Update your application on-the-fly using dynamic configuration ‣ Avoid things that might catch you out
  12. 12. Let's find a legacy application…
  13. 13. Old(er) application candidate ‣ Must have been designed pre-containerisation ‣ Must rely on filesystem-based configuration, which it may update itself ‣ Likely makes assumptions about running on a single server (thus a single filesystem, etc.) ‣ Has few or no examples of a quickstart using a container orchestrator
  14. 14. Let's pick an old(er) application What about WordPress? Primarily relies on wp-config.php
  15. 15. Let's pick an old(er) application Alright then, perhaps something less obvious but just as ubiquitous? phpBB? Primarily relies on config.php
  16. 16. Let's pick an old(er) application Need to go a bit more obscure… how about another forum, MyBB? Primarily relies on config.php and settings.php
  17. 17. Found the right era of application ‣ “Once you’ve uploaded your files you will need to set the permissions on certain files and directories. ‣ Before granting certain files and directories chmod 777, you may want to try chmod 755 or chmod 775.”
  18. 18. Older applications live on Codebases like these are still being actively used and maintained despite their longevity ‣ MyBB was first released on 9th December 2005 (v1.0.0) ‣ Last released on 27th February 2019 (v1.8.20) Still plenty of issues being reported and resolved 7 days ago, 13 days ago, 15 days ago…
  19. 19. MyBB ‣ Majority of the file-based configuration for MyBB resides in 2 files ‣ inc/config.php ‣ inc/settings.php ‣ Note that these are not JSON/YAML or even INI/XML files, but language-specific configuration files
  20. 20. config.php Environment configuration such as database connectivity, logging, IP blacklisting, etc. Essential to the bootstrapping of the application
  21. 21. settings.php Application-specific settings pre-database configuration load, such as formatting, routing, basic settings and some string replacements. Example of configuration that is traditionally updated by the application itself
  22. 22. Demo 0 Run MyBB using the PHP built-in web server
  23. 23. Demo 0 key points ‣ Only works on my laptop with a filesystem-based database ‣ Local development server: php -S localhost:9000 -t application ‣ SQLite database for simplicity
  24. 24. Run your application in a container
  25. 25. Run your application in a container First, how do you run your application in a container that resembles your existing environment? ‣ Bake a base container image re-using existing provisioning ‣ Modifying your build pipeline to create container images ‣ Additional environment configuration considerations (external libraries, vendor integrations, etc.)
  26. 26. What is your “base”? How bespoke is your application runtime environment? ‣ How are you currently provisioning your VM/bare metal fleet? (hopefully using infrastructure automation? 🤞) ‣ Do you compile your own packages/libraries/drivers and/or maintain your own Debian/RPM/etc. repository? ‣ How many other configuration changes are you making to hosts?
  27. 27. Container tutorial assumptions Problem with most containerisation tutorials is that they assume that your application will run with near-vanilla configuration in a vanilla environment
  28. 28. Container tutorial assumptions … and to keep the images small, they strip a lot of “extras” MySQL Improved Extension Memcached Extension
  29. 29. Container tutorial assumptions ‣ You may end up with a rather messy Dockerfile ‣ Also AUFS has a 42 layer limit (or used to) ‣ Easier to use multi-stage builds, or “bakes”
  30. 30. Why not just use Dockerfiles? Developers will often ask…
 “why not just use a Dockerfile to install the bare minimum?” or “can’t we just use a community image?” ‣ “the bare minimum” could be hundreds of packages or binaries, custom configured or compiled ‣ If using a different distribution than the rest of your fleet, do you really know if it’s secure?
  31. 31. Why not just use Dockerfiles? CVE-2019-5021 “Alpine Linux Docker Images Shipped for 3 Years with Root Accounts Unlocked Alpine Linux Docker images available via the Docker Hub contained a critical flaw allowing attackers to authenticate on systems using the root user and no password.” “This CVE does not impact Alpine distros that are not delivered as Docker images.”
  32. 32. Why not just use Dockerfiles? ‣ If using a community image, are you actually vetting the contents for vulnerabilities or malware?
 “Backdoored images downloaded 5 million times finally removed from Docker Hub” ‣ Also your community image may mysteriously vanish for reasons beyond your control (legal, disagreements or even abandonment)
 "My image with 10M+ pulls has just gone (completely removed) from Docker Hub"
  33. 33. Agentless provisioning ‣ All major configuration management tools now have an option to run agentless ‣ This allows you to create VM-esque images (e.g. AMIs) for containers ‣ Same principle as baking images for ASG/MIG/VMSS instance groups
  34. 34. Agentless provisioning You will need to run your familiar provisioning tool (ideally) agentless and masterless ‣ Puppet Bolt ‣ Chef Solo (Chef Zero) ‣ Salt Masterless ‣ Ansible 😏
  35. 35. Differences between base VM and container images ‣ *nix distributions may not necessarily package the same binaries in their base VM image and base container image ‣ Your VM provisioning may stumble on OS detection assumptions ‣ CentOS 7.6 via Docker Image: 148 packages installed ‣ CentOS 7.6 via Vagrant Box: 318 packages installed
  36. 36. Baking container images ‣ HashiCorp Packer with Puppet/Chef/Ansible/SaltStack/etc. ‣ OpenShift Source-To-Image (s2i) ‣ Ansible Bender (formerly Ansible Container) and Buildah ‣ Dockerfile ADD/COPY and RUN ‣ with --squash
  37. 37. Modifying your build pipeline What does the build phase of your deployment pipeline involve? ‣ Version Control pull into active directory ‣ Upgrade via OS package manager (deb, rpm, etc.) ‣ Symlink swap of pre-built artifact on long-running hosts ‣ AutoScalingGroup/ManagedInstanceGroup rolling updates or blue/green
  38. 38. Build-phase examples Doesn’t matter if you are… ‣ Fetching build artifacts from a repository ‣ Building build artifacts there-and-then ‣ Installing dependencies from a registry … as long as it ends up in the image
  39. 39. Additional environment configuration There are usually other packages/libraries/binaries that will have their own configuration also. How do you configure these? ‣ Bake into image (different images per environment) ‣ Pull from remote source at container start 🤢 ‣ Inject as files using container orchestrator ‣ Set environment variables
  40. 40. Example: boto “Boto looks for credentials … through a list of possible locations and stop as soon as it finds credentials” ‣ Passing credentials as parameters in the boto.client() method ‣ Passing credentials as parameters when creating a Session object ‣ Environment variables ‣ Shared credential file (~/.aws/credentials) ‣ AWS config file (~/.aws/config) ‣ Assume Role provider ‣ Boto config file (/etc/boto.cfg and ~/.boto) ‣ Instance metadata service on an Amazon EC2 instance that has an IAM role configured.
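The lookup order quoted on this slide is a "first match wins" provider chain. The following is a minimal sketch of that idea in plain Python; it deliberately does not call boto, and the `Credentials` type, the two provider functions and `resolve` are hypothetical names covering only two of the listed sources.

```python
import configparser
from typing import Callable, Mapping, NamedTuple, Optional, Sequence


class Credentials(NamedTuple):
    access_key: str
    secret_key: str
    source: str


def from_environment(env: Mapping[str, str]) -> Optional[Credentials]:
    """Environment variables, as typically set by a container orchestrator."""
    key = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return Credentials(key, secret, "environment")
    return None


def from_shared_file(path: str, profile: str = "default") -> Optional[Credentials]:
    """INI-style shared credential file, e.g. one injected into the container."""
    parser = configparser.ConfigParser()
    if not parser.read(path) or profile not in parser:
        return None
    section = parser[profile]
    key = section.get("aws_access_key_id")
    secret = section.get("aws_secret_access_key")
    if key and secret:
        return Credentials(key, secret, f"file:{path}")
    return None


def resolve(providers: Sequence[Callable[[], Optional[Credentials]]]) -> Optional[Credentials]:
    """Try each provider in order and stop at the first one that returns credentials."""
    for provider in providers:
        found = provider()
        if found is not None:
            return found
    return None
```

The practical consequence for containers is that an environment variable left over in an image or deployment can silently shadow a credentials file you inject later, because the chain never reaches the file.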
  41. 41. Demo 1 Build and run MyBB container image
  42. 42. Demo 1 key points ‣ Single container with port forwarding ‣ Still uses SQLite database ‣ Configuration and database baked into the image ‣ Any changes made to settings.php by MyBB itself will be lost when the container stops
  43. 43. Docker Compose ‣ “Compose is a tool for defining and running multi-container Docker applications.” ‣ “Traditionally been focused on development and testing workflows” ‣ Limited to a single host (without Swarm)
  44. 44. Demo 2 Run MyBB using Docker Compose
  45. 45. Demo 2 key points ‣ Running MySQL as separate container ‣ Volume mounting configuration directory into MyBB container ‣ Symlinking configuration files into application include directory ‣ Limited to single instance of MyBB
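The symlinking step can be sketched as a small function run from the container entrypoint before the web server starts. This is an illustration only, not the demo's actual script; `link_config` and the directory arguments are hypothetical, and the file names you pass in would be the ones your application expects (for MyBB, config.php and settings.php).

```python
from pathlib import Path
from typing import List, Sequence


def link_config(config_dir: Path, include_dir: Path, names: Sequence[str]) -> List[Path]:
    """Symlink each mounted configuration file into the application's include directory."""
    linked = []
    for name in names:
        source = config_dir / name
        target = include_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"expected mounted config file: {source}")
        # Replace whatever the image shipped (a real file or a stale link) with a fresh link.
        if target.is_symlink() or target.exists():
            target.unlink()
        target.symlink_to(source)
        linked.append(target)
    return linked
```

Because the application only ever sees the path inside its include directory, the code itself stays untouched while the real files live on the mounted volume.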
  46. 46. Container Orchestrators ‣ Docker Swarm
 Very basic ‣ CNCF Kubernetes
 De facto standard ‣ Mesosphere Marathon
 Flexible but complicated, separate from Mesos itself ‣ HashiCorp Nomad
 Basic but simpler and can schedule non-container work (plus Consul/Vault integration) ‣ Uber Peloton
 Have fun installing it
  47. 47. Docker Swarm Basically Docker Compose for multiple machines ‣ “A swarm is a group of machines that are running Docker and joined into a cluster. ” ‣ “You continue to run the Docker commands you’re used to, but now they are executed on a cluster by a swarm manager”
  48. 48. Demo 3 Run MyBB using Docker Swarm
  49. 49. Demo 3 key points ‣ Inject configuration files via the Swarm manager ‣ MyBB now load balanced across multiple containers ‣ Specify CPU and memory limits ‣ However, once again any changes made to settings.php by MyBB itself will be lost when the container(s) stop
  50. 50. Demo 2 key points [Diagram labels: Demo 2, Docker Compose; Demo 3, Docker Compose]
  51. 51. (re)deploy your application using templated static configuration
  52. 52. Kubernetes Just in case someone has been frozen like Captain America… “Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery.”
  53. 53. Container quickstarts ‣ Almost every container quickstart guide demonstrates how to run containers with either no configuration or environment variables ‣ So how would you inject file-based configuration into a container?
  54. 54. Stateful configuration ‣ Bake specific configuration files into the image ‣ Different images required per environment/tenant ‣ Security vulnerability if image is compromised ‣ Potentially dangerous?
  55. 55. Stateless configuration ‣ Orchestrator defines what configuration should be injected ‣ Developer and/or deployer only has access to configuration and secrets they need ‣ Assumes that applications are configured once on startup and only reconfigured if/when necessary ‣ Facilitates blue/green and canary deployments
  56. 56. ConfigMap config.php
  57. 57. Demo 4 Run MyBB using Kubernetes
  58. 58. Demo 4 key points ‣ Same as Docker Swarm… for now ‣ Replicated containers for MyBB on single node Kubernetes ‣ Replaced Docker Configs with Kubernetes ConfigMaps
  59. 59. Demo 4 key points [Diagram labels: Demo 4, Kubernetes; Demo 3, Docker Compose]
  60. 60. Templating ‣ Docker Compose files, Kubernetes Manifests, Nomad Job Specifications and Marathon application definitions are just YAML/JSON ‣ Any templating language (Python/Go/etc.) should work ‣ Endless choices these days…
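Since a manifest is just JSON or YAML, even the standard library is enough to template one. The sketch below renders a Kubernetes ConfigMap carrying a templated config.php; `render_configmap`, the ConfigMap name and the three PHP keys are illustrative assumptions, not a complete MyBB configuration. The database password is left out on purpose, because a ConfigMap is not the place for secrets.

```python
import json

# Illustrative fragment only; a real config.php contains more settings.
CONFIG_PHP_TEMPLATE = """<?php
$config['database']['type'] = '{db_type}';
$config['database']['hostname'] = '{db_host}';
$config['database']['database'] = '{db_name}';
"""


def render_configmap(name: str, values: dict) -> str:
    """Render a ConfigMap manifest (as JSON) that holds a templated config.php."""
    manifest = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {"config.php": CONFIG_PHP_TEMPLATE.format(**values)},
    }
    return json.dumps(manifest, indent=2)
```

Calling `render_configmap("mybb-config", {"db_type": "mysqli", "db_host": "mysql", "db_name": "mybb"})` yields one manifest per environment from the same template, which is the job the dedicated tools take over once the number of manifests grows.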
  61. 61. Templating ‣ Kubernetes ‣ Helm ‣ ksonnet ‣ Kapitan ‣ Draft ‣ Metaparticle ‣ Nomad ‣ Levant ‣ Generic ‣ jsonnet ‣ Render ‣ etc.
  62. 62. Helm ‣ Go-based templating for Kubernetes manifests ‣ Usually works on a server-client model ‣ Tiller (server) manages releases in the cluster itself ‣ Best suited for off-the-shelf software
  63. 63. Helm Can be used just for rendering templates
 (e.g. Spinnaker)
  64. 64. Kustomize ‣ “Template-free” ‣ Default manifest(s) with YAML “overlays” for environment-specific “variant” values
  65. 65. Jsonnet/Ksonnet ‣ JSON-based templating DSL ‣ “work on ksonnet will end and the GitHub repositories will be archived”
  66. 66. Persisting configuration changes How do we share runtime changes made to configuration files between containers? ‣ Easiest way is a shared volume ‣ Relies on ‣ Application generating its own configuration ‣ Pre-populating the volume with configuration
  67. 67. Kubernetes Volumes ‣ Storage that is mounted per container in a Pod ‣ Requires a PersistentVolumeClaim ‣ Many different storage types supported ‣ If you already use NFS/iSCSI, keep doing so
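Pre-populating the volume can be sketched as a one-off seeding step that copies default files in but never overwrites anything already there, so changes the application has written survive a redeploy. `seed_volume` and both directory arguments are hypothetical names for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def seed_volume(defaults_dir: Path, volume_dir: Path) -> List[str]:
    """Copy default config files into the shared volume without overwriting existing ones."""
    volume_dir.mkdir(parents=True, exist_ok=True)
    seeded = []
    for default in sorted(defaults_dir.iterdir()):
        target = volume_dir / default.name
        if default.is_file() and not target.exists():
            shutil.copy2(default, target)
            seeded.append(target.name)
    return seeded
```

This check-then-copy is not safe if several replicas start at once, so it is better run as a single step before the application containers start than inside each of them.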
  68. 68. Portworx, StorageOS ‣ Useful for orchestrators with no native persistent storage support (such as Nomad)
  69. 69. Rook ‣ Storage orchestration (orchestrate all the things) ‣ Abstract away storage implementation details ‣ Multiple storage providers* but plays well with Ceph ‣ Ceph offers object, block and file distributed storage *These abstractions have caveats, see Avoid things that might catch you out section at the end
  70. 70. Demo 5 Deploy MyBB using Helm and Kubernetes Shared Volume
  71. 71. Demo 5 key points [Diagram labels: Demo 4, Kubernetes; Demo 5 Helm]
  72. 72. Demo 5 key points [Diagram labels: Demo 4, Kubernetes; Demo 5 Helm]
  73. 73. Persistent Volume for settings.php
  74. 74. Update your application on-the-fly using dynamic configuration
  75. 75. ConfigMap updates tl;dr They don’t update consistently ‣ “When a ConfigMap already being consumed in a volume is updated, projected keys are eventually updated as well. Kubelet is checking whether the mounted ConfigMap is fresh on every periodic sync. ‣ However, it is using its local ttl-based cache for getting the current value of the ConfigMap. ‣ As a result, the total delay from the moment when the ConfigMap is updated to the moment when new keys are projected to the pod can be as long as kubelet sync period + ttl of ConfigMaps cache in kubelet.”
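As a worked example of the quoted delay, the worst case is simply the sum of the two intervals. The 60-second figures below are placeholder values chosen for illustration, not defaults taken from the Kubernetes documentation; substitute whatever is configured on your own kubelets.

```python
from datetime import timedelta


def worst_case_configmap_delay(sync_period: timedelta, cache_ttl: timedelta) -> timedelta:
    """Upper bound on the time before an updated ConfigMap key is projected into a pod."""
    return sync_period + cache_ttl


# Placeholder values, assumed purely for illustration.
delay = worst_case_configmap_delay(timedelta(seconds=60), timedelta(seconds=60))
print(f"worst case: {delay.total_seconds():.0f}s")
```

The point is that different pods can pick up the change at different moments within that window, which is why the updates look inconsistent.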
  76. 76. ConfigMap updates … or you Indiana Jones it 😅
 (that feature request has been open since March 2016)
  77. 77. Dynamic configuration Templating static configuration makes for easier deployments, but means every time you want to update configuration you will need to redeploy. But what if you want to ‣ Update configuration independent of deployments? ‣ Frequently change configuration values/feature toggles and need quick rollbacks? ‣ Use automated short-lived credential rotations? (for databases, etc.)
  78. 78. Consul Template ‣ Consul Template is an agent that will (re)generate templated files based on changes detected in Consul and/or Vault ‣ Primary use-cases are Key-Value Store updates or Service Discovery registration changes ‣ Built into Nomad Template Stanza, but easy to run otherwise ‣ Integration with Vault for secrets injection
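The general pattern behind such an agent is "fetch values, render, rewrite the file only if the output changed". The sketch below is a stand-in for that pattern, not Consul Template's implementation: `render_if_changed` is a hypothetical name, `fetch` stands for whatever key/value lookup you use, and the template is a plain Python format string rather than Consul Template's own syntax.

```python
import os
import tempfile
from typing import Callable, Dict, Optional


def render_if_changed(
    fetch: Callable[[], Dict[str, str]],
    template: str,
    destination: str,
    last_rendered: Optional[str],
) -> Optional[str]:
    """Render the template from fresh values; rewrite the file only when the output changes.

    Returns the new contents if the file was rewritten, otherwise None.
    """
    rendered = template.format(**fetch())
    if rendered == last_rendered:
        return None
    # Write to a temporary file and rename it, so readers never see a half-written config.
    # mkstemp creates the file with owner-only permissions; adjust if another user must read it.
    directory = os.path.dirname(os.path.abspath(destination))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(rendered)
    os.replace(temp_path, destination)
    return rendered
```

A caller would run this in a loop, keep the returned contents as `last_rendered`, and reload or restart the application only when the function reports a rewrite.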
  79. 79. confd ‣ Wider compatibility with data sources beyond just Consul ‣ Primarily etcd ‣ DynamoDB ‣ Redis ‣ Zookeeper ‣ AWS SSM Parameter Store ‣ etc.
  80. 80. Demo 6 Deploy with Consul Template exec mode entrypoint
  81. 81. Demo 6 key points [Diagram labels: Demo 6, Consul Template; Demo 5 Helm]
  82. 82. Demo 6 key points Vault “Key/Value” KV Secrets Engine Consul “Key/Value” KV Store
  83. 83. Secrets management ‣ Kubernetes Secrets ‣ Torus: Beta (free for now) cloud service ‣ Confidant by Lyft ‣ Secrethub: Cloud paid service ‣ Credstash: AWS-specific, integrates with KMS and DynamoDB ‣ AWS SSM Parameter Store ‣ Keywhiz by Square ‣ Vault by HashiCorp using-parameter-store-and-iam-roles-for-tasks/
  84. 84. Kubernetes Secrets ‣ Are not encrypted, but Base64 encoded in etcd ‣ Is your etcd instance secured? ‣ “With etcd and Kubernetes the setup is all or nothing, there’s no authorisation used, so be very careful” (pre 2.1) ‣ If it’s known to be less-than-secure, what alternatives are there?
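To make the first point concrete: Base64 is an encoding, not encryption, so anyone who can read the stored value can reverse it without a key. A minimal sketch using only the standard library (the helper names are hypothetical):

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Encode a value the way it appears in a Secret manifest's data field."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Recover the plaintext; note that no key or password is involved."""
    return base64.b64decode(encoded).decode("utf-8")
```

Round-tripping any password through these two functions returns the original text, which is why read access to the underlying store is equivalent to read access to the secrets.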
  85. 85. HashiCorp Vault “Vault secures, stores, and tightly controls access to tokens, passwords, certificates, API keys, and other secrets in modern computing. Vault handles leasing, key revocation, key rolling, auditing, and provides secrets as a service through a unified API.”
  86. 86. Database Secrets Engine ‣ “The database secrets engine generates database credentials dynamically based on configured roles. ‣ Since every service is accessing the database with unique credentials, it makes auditing much easier when questionable data access is discovered. ‣ Vault makes use of its own internal revocation system to ensure that users become invalid within a reasonable time of the lease expiring.”
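Short-lived leases mean whatever consumes the credentials has to fetch a fresh pair before the old one expires. The sketch below shows only that client-side bookkeeping; it does not talk to Vault, and `Lease`, `LeasedCredentials`, the `issue` callback and the renewal margin are all hypothetical names and assumptions.

```python
import time
from typing import Callable, NamedTuple, Optional


class Lease(NamedTuple):
    username: str
    password: str
    expires_at: float  # in the same units as the clock passed to LeasedCredentials


class LeasedCredentials:
    """Hand out the current credentials, requesting new ones shortly before expiry."""

    def __init__(
        self,
        issue: Callable[[], Lease],
        renew_margin: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._issue = issue
        self._margin = renew_margin
        self._clock = clock
        self._lease: Optional[Lease] = None

    def current(self) -> Lease:
        # Re-issue when there is no lease yet or it is within the margin of expiring.
        if self._lease is None or self._clock() >= self._lease.expires_at - self._margin:
            self._lease = self._issue()
        return self._lease
```

For a legacy application that only reads a file at startup, the equivalent step is regenerating the configuration file with the new credentials and reloading or restarting the process.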
  87. 87. Demo 7 Vault database credential rotation and Consul External Services Monitor
  88. 88. Demo 7 key points Password Rotation with short-term credential leases Vault Database Secrets Engine
  89. 89. Demo 7 key points Consul Nodes and Services (including those registered with External Service Monitor)
  90. 90. Demo 7 key points Demo 6, Consul KV and Vault KV Secrets Engine Demo 7, Consul Service Discovery and Vault Database Secrets Engine
  91. 91. Avoid things that might catch you out
  92. 92. Container environment detection Is your application making assumptions about the environment it runs in?
  93. 93. Forking ‣ Does your application create child processes assuming a normal PID namespace? ‣ ENTRYPOINT will be PID 1 by default ‣ If it forks a child that dies before any grandchildren exit, zombie processes can accumulate ‣ “A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.”
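What a minimal PID 1 has to do can be sketched in a few lines: start the real process, forward termination signals to it explicitly, and reap every child that exits, including orphans it has adopted. This is an illustration of the idea under Unix assumptions (`os.fork`), with `run_as_init` as a hypothetical name; in practice a purpose-built init such as tini, or the `--init` flag of `docker run`, does this job.

```python
import os
import signal
from typing import Sequence


def run_as_init(command: Sequence[str]) -> int:
    """Run command as a child, forward SIGTERM/SIGINT to it, and reap exited children."""
    child_pid = os.fork()
    if child_pid == 0:
        try:
            os.execvp(command[0], list(command))
        finally:
            os._exit(127)  # only reached if exec failed

    def forward(signum, _frame):
        # PID 1 gets no default signal actions, so pass the signal on explicitly.
        try:
            os.kill(child_pid, signum)
        except ProcessLookupError:
            pass

    signal.signal(signal.SIGTERM, forward)
    signal.signal(signal.SIGINT, forward)

    while True:
        pid, status = os.wait()  # also collects adopted orphans, preventing zombies
        if pid != child_pid:
            continue
        if os.WIFSIGNALED(status):
            return 128 + os.WTERMSIG(status)
        return os.WEXITSTATUS(status)
```

Used as an entrypoint, the return value would be passed to `sys.exit` so the container reports the application's own exit status.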
  94. 94. Restart splay ‣ If your process/container is restarting to reload configuration (or otherwise) ‣ What splay time do you have (0s, 5s, 30s, 5m…) ‣ Given restart delay combined with splay, what is your 0% readiness risk? ‣ Can you buffer incoming requests and replay if 0% ready? (via gateway, traffic manager, etc.)
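The readiness risk can be reasoned about with a small model: give each replica a random restart delay within the splay window, then count how many are restarting at the same moment. The function names and the uniform random delay are assumptions made for illustration.

```python
import random
from typing import List


def splay_offsets(replicas: int, splay_seconds: float, rng: random.Random) -> List[float]:
    """Pick an independent random restart delay in [0, splay_seconds] for each replica."""
    return [rng.uniform(0.0, splay_seconds) for _ in range(replicas)]


def max_simultaneously_down(offsets: List[float], restart_seconds: float) -> int:
    """Largest number of replicas that are restarting at the same moment."""
    events = []
    for start in offsets:
        events.append((start, 1))                     # replica goes down
        events.append((start + restart_seconds, -1))  # replica is ready again
    down = worst = 0
    # Sorting tuples puts -1 before +1 at equal times, so back-to-back restarts do not overlap.
    for _, change in sorted(events):
        down += change
        worst = max(worst, down)
    return worst
```

If the result equals the number of replicas, there is a moment with 0% readiness; with a splay of 0s that is always the case, and widening the splay relative to the restart time makes it progressively less likely.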
  95. 95. Storage inconsistencies ‣ Are file locks being released upon container termination correctly? ‣ Just because something says it behaves like x, doesn’t mean it actually will, e.g. limitations of Ceph RGW NFS ‣ Links, including symlinks, are not supported ‣ NFS ACLs are not supported ‣ Directories may not be moved/renamed ‣ Only full, sequential write i/o is supported ‣ Many typical i/o operations such as editing files in place will necessarily fail as they perform non-sequential stores
  96. 96. Pick the simplest orchestrator possible ‣ Don’t use Kubernetes just because everyone else is ‣ Understand the security implications of default networking ‣ If you can roll it out with just Swarm, Fargate or even ECS…
  97. 97. Recap ‣ Let's find a legacy application ‣ Run your application in a container ‣ (re)deploy your application using templated static configuration ‣ Update your application on-the-fly using dynamic configuration ‣ Avoid things that might catch you out
  98. 98. Acknowledgements ‣ Thanks to my former team of Matthew Wright, Nikolai Orenstrakh and Stefan Kolesnikowicz for listening to my crazy ideas which led to this talk ‣ Thanks to the ExploreTech Toronto meetup for helping me with my first public speaking opportunity a couple years ago ‣ Thanks to the London (UK) and Toronto (CA) tech communities for both being so welcoming and supportive ‣ Finally, special thanks to the DevOps Days Toronto team for the opportunity to speak today and the support they provided
  99. 99. That’s a wrap! • Slides will be posted online • ContainerisingLegacyApplicationsTalk • • andrewkirkpatrick/