
Dockercon EU 2015

My talk from Dockercon 2015

Published in: Technology

  1. 1. Running Docker in Production Successfully. John Fiedler, Sr. Director of Engineering @ SalesforceIQ
  2. 2. About me
      ● I work for SalesforceIQ, formerly RelateIQ
      ● I’ve used Docker for over 2 years
      ● I’ve done a couple of talks on Docker
        o http://blog.heavybit.com/blog/2015/3/23/dockermeetup
        o https://engineering.twitter.com/university/videos/chef-versus-docker-at-relateiq
        o https://www.youtube.com/watch?v=z9yNq-IjCcM
      ● I co-authored this book:
        o http://bleedingedgepress.com/docker-in-the-trenches/
  3. 3. Docker Book ● 50% off for everyone! ● Click here! https://gum.co/lQGH/dockerconeu ● Only $11.50 ● 200 pages
  4. 4. Agenda: Docker Journey with SalesforceIQ; Lessons Learned; PaaS/CaaS
  5. 5. Docker Journey with SalesforceIQ Two years in production...
  6. 6. What is production?
      ● Production != test or dev
      ● Isolation, Security, Performance, Monitoring, Logging…
      ● Scale, templates, automation…
      What is successful?
      ● >99% uptime or low # of outages?
      ● Fast code deployment?
      ● 0 Security Incidents?
  7. 7. 100% of our web infrastructure running with Docker Boom
  8. 8. SalesforceIQ journey into production (timeline diagram running from 2013 Q4 through 2014 to 2015+ and "Now"). Milestones listed on the timeline: Dev Environment; Continuous Deployment in Teamcity; Web Zero Downtime Deployments; Full Stack Container; Azkaban; DockerMe; Integrations; Batch Jobs; Mesos; Kafka; Dev/Ops CLI; Craft CMS; Main Website; Beanstalk; Devenv 2.0; PaaS
  9. 9. What we’ve put in containers: Database; CI/CD Server; Dev or Ops Environment; Web Server; Api Server; Batch Jobs; Integrations (diagram labels: Rate of Change, Dependencies)
  10. 10. What we’ve put in containers: Database; CI/CD Server; Dev or Ops Environment; Web Server; Api Server; Batch Jobs; Integrations (diagram labels: Stateful, Long-Life; Stateless, Short-Life)
  11. 11. Zoom in a little (diagram). Components: Persistent Storage; Middleware / Integrations / Internal Tools / Scripts / Jobs; Web; Batch & Stream processing; Monitoring; Logging; Security; Dev Environment; Ops Environment; CI / CD. Stages: Create, Deploy, Run, Operate. Legend (Dockerized): Fully, Somewhat, No
  12. 12. Lessons Learned: a lot...
  13. 13. Lots of tidbits
      ● Docker is prod ready but many surrounding solutions are not (alpha and beta)
        o Caution with the new toys is required
      ● Don’t go straight towards a PaaS if you're just starting out
        o Kubernetes, Mesos, CoreOS, Swarm, ECS
      ● Keep it simple
        o Know what works and what doesn’t
      ● Old tools still work great, and I’ll show you how
        o Know how to scale what you're doing
      ● You're going to have to roll your own at some point (orchestration)
        o As of version 1.5.11, HAProxy does not support zero downtime restarts or reloads of configuration.
      ● Learn from others; tons of people are in production now
        o Read the whole internet
      ● You can secure running containers
        o Twistlock, Conjur, Banyanops
      ● Get creative
        o Docker is golden and mobile
  14. 14. You can docker with Chef, Ansible, SaltStack...
      • You can use the tools you have today if you're not dockerized already
      • What…
      • But those are the tools I’m already using...
      • Yes, they still work, and work great
  15. 15. Our current prod web server
      ● Worked with all our existing tools!
        ○ Chef, Monitoring, Logging
      ● Security didn’t change
        ○ Security keys
        ○ Firewall
      ● Super easy to scale
        ○ Could pack with Packer to create AMI
        ○ Shell script was super easy
      ● Zero downtime
      ● Rollbacks
      Diagram labels: Web Container v1; Web Container v2; Hipache/Redis Container; Amazon AMI setup with Chef; Cron job to run shell script to orchestrate containers
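The zero-downtime swap described on this slide can be sketched roughly as follows. This is not the script from the talk (that was a shell script and is not shown in the slides); it is a minimal Python sketch that only builds the ordered list of commands. The function name `plan_swap`, the container names, ports and backend URLs are all hypothetical. It assumes Hipache's convention of keeping a host's backends in a Redis list under `frontend:<hostname>`.

```python
from typing import List, Tuple

# A step is a human-readable label plus the command that would be run.
Step = Tuple[str, List[str]]


def plan_swap(hostname: str, old_backend: str, new_backend: str,
              new_image: str, new_name: str, old_name: str,
              host_port: int, container_port: int) -> List[Step]:
    """Return the ordered commands for replacing one web container with another.

    Nothing is executed here; the caller decides how to run each step.
    All names and ports are illustrative assumptions.
    """
    # Assumption: Hipache reads a host's backends from this Redis list.
    frontend_key = f"frontend:{hostname}"
    return [
        ("start the new container next to the old one",
         ["docker", "run", "-d", "--name", new_name,
          "-p", f"{host_port}:{container_port}", new_image]),
        ("add the new backend to the proxy",
         ["redis-cli", "rpush", frontend_key, new_backend]),
        ("remove the old backend from the proxy",
         ["redis-cli", "lrem", frontend_key, "0", old_backend]),
        ("stop the old container, keeping it around for a rollback",
         ["docker", "stop", old_name]),
    ]


if __name__ == "__main__":
    for label, command in plan_swap(
            hostname="www.example.com",
            old_backend="http://10.0.0.5:8001",
            new_backend="http://10.0.0.5:8002",
            new_image="web:v2", new_name="web_v2", old_name="web_v1",
            host_port=8002, container_port=8080):
        print(f"{label}: {' '.join(command)}")
```

The ordering is the point: the new backend is registered before the old one is removed, so the proxy always has at least one backend. A real script would presumably also wait for the new container to pass a health check before the second step, and a rollback is the same plan with the old and new arguments exchanged.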
  16. 16. Demo It’s time
  17. 17. #1 thing we found!!!!
  18. 18. You WILL have disk/file system issues...
  19. 19. File system...
      ● Volumes not unmounting
      ● Long deletion times on device mapper: --storage-opt dm.blkdiscard=false
      ● Kernel version matters!
      ● Great visual deep dive: http://merrigrove.blogspot.com/2015/10/visualizing-docker-containers-and-images.html?m=1
      What we used over time
      1. Started with AUFS - hit the 42-layer limit
      2. Then moved to device mapper
        a. Device/Volume not found
        b. NNOOOOOOOOOO
      3. Back to using AUFS again after bug fixes and removal of the 42-layer limit
        a. Continue to fight layer issues, mount issues
      4. Back to device mapper with Docker 1.7 dynamic binaries!
      What we’ve landed on
      ● Ubuntu = AUFS
      ● Amazon Linux = Device mapper
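One way to keep the "what we’ve landed on" choice in one place is a small helper that produces the Docker daemon storage flags per distribution. This is only a sketch: the distro keys and the function name `daemon_storage_flags` are hypothetical, and the `dm.blkdiscard=false` option is applied to device mapper because that is the option the slide mentions for long deletion times.

```python
from typing import Dict, List

# Mapping taken from the slide: Ubuntu = AUFS, Amazon Linux = device mapper.
# The dictionary keys are illustrative names, not values read from the OS.
STORAGE_DRIVER_BY_DISTRO: Dict[str, str] = {
    "ubuntu": "aufs",
    "amazon-linux": "devicemapper",
}


def daemon_storage_flags(distro: str) -> List[str]:
    """Return the storage-related flags to pass to the Docker daemon."""
    driver = STORAGE_DRIVER_BY_DISTRO[distro]
    flags = ["--storage-driver", driver]
    if driver == "devicemapper":
        # Option quoted on the slide for slow deletions on device mapper.
        flags += ["--storage-opt", "dm.blkdiscard=false"]
    return flags


if __name__ == "__main__":
    for name in STORAGE_DRIVER_BY_DISTRO:
        print(name, " ".join(daemon_storage_flags(name)))
```

An unknown distro raises `KeyError` on purpose, so a new host type has to be given an explicit storage decision rather than silently inheriting a default.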
  20. 20. Get a good registry
      Great options
      • Hub.docker.com
      • Quay.io
      • Trusted registry
      • Google
      • Azure
      • AWS
      • S3.. no registry… save/load
      1. We started with a private registry
        a. went insane with buggy releases, failed pulls/pushes
      2. Went to quay.io
        a. happy but slow, and costs $$
      3. Back to private registry 0.9 release… now stable
      4. Scaled it and working great
      5. Now working on upgrading to Docker Registry 2.1
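The "S3.. no registry… save/load" option amounts to moving image archives around instead of pulling from a registry. A minimal sketch of the two commands involved is below; the helper names and the archive naming scheme are hypothetical, and copying the archive to and from S3 is deliberately left out.

```python
from typing import List


def archive_name(image: str) -> str:
    """Derive a file name from an image reference (illustrative scheme only)."""
    return image.replace("/", "_").replace(":", "_") + ".tar"


def save_command(image: str) -> List[str]:
    """Command that writes the image and its layers to a tar archive."""
    return ["docker", "save", "-o", archive_name(image), image]


def load_command(image: str) -> List[str]:
    """Command that restores the image from the archive on another host."""
    return ["docker", "load", "-i", archive_name(image)]


if __name__ == "__main__":
    print(" ".join(save_command("myorg/web:v2")))
    print(" ".join(load_command("myorg/web:v2")))
```

Each archive produced this way contains every layer of the image, so this approach trades the layer sharing a registry gives for having no registry service to operate.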
  21. 21. Scaling our registry
      • 100% AWS
      • Beanstalk: ELB, Auto scaling Group, Docker web service
      • Redis Cache: Elasticache. Had issues when a node failed
      • S3 Backend: Had huge issues on layer corruption
      Diagram labels: ELB; Docker Registry (Beanstalk, autoscale); Cache (Elasticache, Redis); S3 (storage, unlimited, cheap)
  22. 22. Isolation is your friend: low container to host ratio
      • Compute: spiky processing… no problem
      • Storage: out of disk… no problem
      • Networking: shared bandwidth… no problem
      • Ram: swapping issue… no problem
      • Security Groups: least privilege… no problem
      Diagram labels: Web Container v1; Web Container v2; Hipache/Redis Container; Amazon AMI setup with Chef; Cron job to run shell script to orchestrate containers
  23. 23. CI/CD with Docker
      • The biggest ROI with Docker
      • Teamcity
      • Used to use Docker in Docker: https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/
      • Agents used to run in a docker container; now built with chef and packer
      • Autoscaling with Docker?
      Diagram labels: Github.com; Dockerfile; Teamcity; Agent (three); Registry; Server
  24. 24. Many PaaS/CaaS utilize sidekicks
      • Amazon ECS: https://github.com/aws/amazon-ecs-agent
      • Amazon Beanstalk: https://github.com/aws/aws-eb-python-dockerfiles
      • Netflix Prana
      • Smartstack
      • Docker Ambassador: http://www.slideshare.net/Docker/slideshare-burns
      • CoreOS - Sidekick
      • Rancher
      • Logging
      Diagram labels: Container (three); Container (sidekick); Rest Api; Service Discovery; Health checks; Orchestration; Container Host
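The sidekick idea, a helper container that health-checks its neighbour and keeps service discovery up to date, can be sketched in a few lines. This is a generic illustration rather than the behaviour of any of the tools listed: the function `sidekick_tick` is hypothetical, and a plain dictionary stands in for whatever service discovery backend is actually used.

```python
from typing import Callable, Dict


def sidekick_tick(service: str, address: str,
                  is_healthy: Callable[[], bool],
                  registry: Dict[str, str]) -> bool:
    """Run one pass of a sidekick loop.

    Registers the service when its health check passes and removes it
    when the check fails or raises. Returns the observed health.
    """
    try:
        healthy = bool(is_healthy())
    except Exception:
        # A crashing health check is treated the same as an unhealthy service.
        healthy = False

    if healthy:
        registry[service] = address
    else:
        registry.pop(service, None)
    return healthy


if __name__ == "__main__":
    discovery: Dict[str, str] = {}  # stand-in for a real discovery backend
    sidekick_tick("web", "10.0.0.5:8080", lambda: True, discovery)
    print(discovery)
    sidekick_tick("web", "10.0.0.5:8080", lambda: False, discovery)
    print(discovery)
```

A real sidekick would call this on a timer. Keeping the logic outside the application container is what lets the same helper be reused across services.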
  25. 25. PaaS/CaaS How you’ll scale a single service
  26. 26. Beanstalk architecture
      • Run 50+ services on Beanstalk today
      • Automagically built web container per branch of code
      • Corp site/Help site
      • 100% automated!!
      • Great for Web services SOA
      • You will have disk issues
      Diagram labels: Beanstalk - Cloud formation; EC2 Server; Autoscaling; Isolation; Security Groups; Environment Variables; Storage; Easy to spin up; DNS service discovery; Load balancer; SSL Termination; ELB; Container; RDS
  27. 27. Demo Beanstalk
  28. 28. One year ago
      • CoreOS... so cool
      • Mesos… cool with scale
      • Beanstalk… with docker support
      • Swarm… beta
      • Deis… oooo saas
      • ECS… ok now we're getting somewhere
      • Kubernetes… where did that come from… looks cool too
      Now…..
      • Kubernetes on top of DCOS, on top of Mesos, on top of CoreOS… facepalm
  29. 29. PaaS/CaaS Overview (comparison diagram). Platforms: CoreOS; DCOS; Kubernetes; ECS. Capabilities: Orchestration; Scheduler; Resource Allocation; Service Discovery; More than Containers; Health Check. Also listed: Storage clustering...; Live Migration...; Affinity rules...
  30. 30. Being successful with a PaaS/CaaS: Our DCOS Architecture
      • Built an edge router
      • Built a Brain router
      • Infra CLI
      • This will run all of our stateless services
      Diagram labels: DNS; ELB; SSL Termination; Edge Router (two); Mesos Public Slave; Mesos Private Slave; Auto Scaling; Health Checks; Intelligence; Service Discovery; Public <> Private DNS; Can be Internal as well; Service (two); Storage; DB1; DB2; DB3; Mesos Master; Marathon; Health Check; API; Change Event Bus; InfraIQ
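The slides do not show how the edge router consumes the change event bus, so the following is only a guess at the general shape: a routing table that is updated as change events arrive. The class `RoutingTable` and the event fields `service`, `backend` and `healthy` are hypothetical and are not Marathon's actual event payload.

```python
from typing import Dict, List, Set


class RoutingTable:
    """Tracks which backends an edge router may send traffic to, per service."""

    def __init__(self) -> None:
        self._backends: Dict[str, Set[str]] = {}

    def apply(self, event: dict) -> None:
        """Apply one change event (hypothetical shape, see lead-in)."""
        service = event["service"]
        backend = event["backend"]
        if event["healthy"]:
            self._backends.setdefault(service, set()).add(backend)
        else:
            # Unknown services or backends are ignored rather than raising.
            self._backends.get(service, set()).discard(backend)

    def backends(self, service: str) -> List[str]:
        """Return the current backends for a service in a stable order."""
        return sorted(self._backends.get(service, set()))


if __name__ == "__main__":
    table = RoutingTable()
    table.apply({"service": "api", "backend": "10.0.1.4:31001", "healthy": True})
    table.apply({"service": "api", "backend": "10.0.1.7:31002", "healthy": True})
    table.apply({"service": "api", "backend": "10.0.1.4:31001", "healthy": False})
    print(table.backends("api"))
```

Driving the router from events instead of polling means a failed task can be dropped from rotation as soon as the scheduler reports it.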
  31. 31. Demo InfraIQ
  32. 32. Summary
      • Starting out? Just use the same tools you have
      • You’ll need to roll up your sleeves
      • Security is not hard but you need to think about it
      • Many vendors are entering container space
      • Build towards a PaaS
      • Many solutions to PaaS
      • Know what you're trying to solve
      • Have fun!
  33. 33. Thank you! John Fiedler, @johnfiedler, johnfiedler@gmail.com
