
The pain and gains running Docker in live @Pipedrive


  1. 1. The pain and gains running Docker in live @Pipedrive. Renno Reinurm, 17.01.17
  2. 2. ● Pipedrive helps small businesses control the complex selling process
        ● Founded in 2010
        ● 30,000 paying customers worldwide
        ● 200+ employees
        ● Offices in Tallinn, Tartu, and New York, NY
  3. 3. Pipedrive helps small businesses control the complex selling process
  4. 4. Why use Docker?
        ● Growing pains with Chef
        ● New language + new tools = entry barrier
        ● You write recipes so seldom that you forget how it’s done
        ● But it runs fine in test!
  5. 5. Our early Docker platform started with an evaluation of running Docker inside a Vagrant box. Instead, we began using a custom-built docker-machine. Lately we moved to Docker for Mac (Docker4Mac).
  6. 6. First use case for containers: provisioning on-demand test environments per branch. It was implemented only as the execution environment for the test coverage suite, and it took a lot of custom hacks to make it work.
  7. 7. Docker infrastructure v1
        ● The first Docker builds used the Codeship Docker CI beta
        ● The first use of Tutum (Docker Cloud) as the orchestration service
  8. 8. Yeah, we were using Docker, but:
        ● CI processes with Codeship were slow; the Docker build itself took ~15 minutes
        ● Deployment to the Docker Tutum cluster took another ~10 minutes
        ● Sometimes it was so slow we wondered whether it was still working
        ● Stability issues: we experienced “data loss” and “service downtime”
  9. 9. The Birth of Docker Infrastructure v2.0
        Requirements:
        ● Improve the speed of CI processes
        ● Improve the reliability of the Docker infrastructure
  10. 10. Docker Infrastructure v2.0
          ● Jenkins for automating processes
            ○ Docker image builds
            ○ Container deployment
          ● Docker Swarm as the container scheduler
          ● Shipyard for troubleshooting
  11. 11. Pain 1: You shall not take more than 5 minutes to build/test/deploy a Docker container. Based on: xkcd.com
  12. 12. Improved Docker builds
          First iteration:
              FROM node
              ENV SERVICE_NAME=statistics
              ENV SERVICE_DESC="Statistics"
              ENV SERVICE_TAGS=statistics
              ENV SERVICE_CHECK_HTTP=/health
              ENV SERVICE_CHECK_INTERVAL=10s
              ENV SERVICE_CHECK_TIMEOUT=5s
              EXPOSE 8000
              WORKDIR /src
              # Any source change invalidates this layer and re-runs npm install
              COPY . /src/
              RUN npm install
              CMD ["node", "."]
          Improved:
              # Pinned, smaller base image
              FROM node:6-alpine
              # One ENV instruction instead of six
              ENV SERVICE_NAME=statistics \
                  SERVICE_DESC="Statistics" \
                  SERVICE_TAGS=statistics \
                  SERVICE_CHECK_HTTP=/health-statistics \
                  SERVICE_CHECK_INTERVAL=10s \
                  SERVICE_CHECK_TIMEOUT=5s
              EXPOSE 8000
              WORKDIR /src
              USER node
              CMD ["node", "."]
              # Frequently changing files come last so the layers above stay cached
              COPY libraries/ /src/
              COPY src/ /src/
  13. 13. https://youtu.be/X_q2l8hotAc?t=365
  14. 14. Deployment process optimizations
          NB! https://docs.docker.com/engine/userguide/storagedriver/selectadriver/
          Replacing the Devicemapper storage driver with AUFS reduced deployment time 10x.
          There are still improvements possible:
          ● Handle Linux signals
          ● Parallel rolling updates
          https://teespring.com/sigkill
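
The "Handle Linux signals" item can be illustrated for a Node.js service like the one in the Dockerfile slide. `docker stop` sends SIGTERM and falls back to SIGKILL after a grace period (10 seconds by default), so a process that ignores SIGTERM makes every rolling update wait for the kill. The following is a minimal sketch, not the talk's actual implementation; `installGracefulShutdown` and its options are hypothetical names.

```javascript
// Minimal sketch of graceful shutdown for a Node.js HTTP service.
// `installGracefulShutdown`, `timeoutMs` and `exit` are illustrative names,
// not part of any library.
function installGracefulShutdown(server, { timeoutMs = 10000, exit = process.exit } = {}) {
  let shuttingDown = false;

  function shutdown(signal) {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`received ${signal}, shutting down`);

    // Force a non-zero exit if open connections do not drain in time.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();

    // Stop accepting new connections; exit once in-flight requests finish.
    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });
  }

  process.on('SIGTERM', () => shutdown('SIGTERM'));
  process.on('SIGINT', () => shutdown('SIGINT'));
  return shutdown;
}

// Usage (assumed wiring, not started here):
//   const server = require('http').createServer(handler).listen(8000);
//   installGracefulShutdown(server);
```

The exec-form `CMD ["node", "."]` used in the slides matters here: it runs node directly rather than under a shell, so the signal reaches the handler.
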
  15. 15. Pain 2: Consumers shall connect only to healthy services
  16. 16. Beware of service discovery corruption
          ● Always enable health checks
          ● Use unique health checks or validate the output
          SERVICE_CHECK_HTTP=/health vs SERVICE_CHECK_HTTP=/statistics-health
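
A rough illustration of "use unique health checks or validate output": with a generic `/health` path, any container that happens to answer on a registered address can pass the check for a different service. The sketch below assumes a Node.js service and a JSON response body. The body format and the names `healthHandler`, `isHealthy` and `startServer` are illustrative assumptions, not something shown in the talk.

```javascript
const http = require('http');

// Service name as set in the Dockerfile ENV; the fallback is only for this sketch.
const SERVICE_NAME = process.env.SERVICE_NAME || 'statistics';

// Service side: answer only on a service-specific path and name the service in the body.
function healthHandler(req, res) {
  if (req.url === `/${SERVICE_NAME}-health`) {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ service: SERVICE_NAME, status: 'ok' }));
  } else {
    res.writeHead(404);
    res.end();
  }
}

// Checker side: a 200 alone is not enough, the body must name the expected service.
function isHealthy(statusCode, body, expectedService) {
  if (statusCode !== 200) return false;
  try {
    const parsed = JSON.parse(body);
    return parsed.service === expectedService && parsed.status === 'ok';
  } catch (err) {
    return false; // non-JSON or unexpected body counts as unhealthy
  }
}

// Not called here; a real service would mount healthHandler next to its own routes.
function startServer(port) {
  return http.createServer(healthHandler).listen(port);
}
```

With this shape, a check aimed at the statistics service fails when another service answers on the same address, because either the path returns 404 or the body names the wrong service.
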
  17. 17. Pain 3: Everyday maintenance of Jenkins jobs
  18. 18. Pain 4: A container shall handle 10 000 connections and constant high load.
  19. 19. https://youtu.be/PivpCKEiQOQ We deployed a Killer-Container to the cluster and rescheduled it every time it managed to crash the Docker host.
  20. 20. Issues
          ● Linux kernel 3.13
          ● Fluentd logging agent
          ● Graylog logging driver
          ● Kernel sysctl parameters
          ● Swap usage
          ● PEBKAC
            ○ "net.ipv4.ip_forward" => 0
          ● WARNING: No memory limit support
          ● WARNING: No swap limit support
          ● WARNING: No kernel memory limit support
          ● WARNING: No oom kill disable support
          ● WARNING: No cpu cfs quota support
          ● WARNING: No cpu cfs period support
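
Several of these issues can be caught before a host joins the cluster. The following is a small preflight sketch under stated assumptions: the caller has already captured the text printed by `docker info` (some engine versions write the warnings to stderr, so both streams may be needed), and the host exposes the sysctl value at `/proc/sys/net/ipv4/ip_forward`. The function names are hypothetical.

```javascript
const fs = require('fs');

// Extract the messages of all "WARNING: ..." lines from captured `docker info` text.
function dockerInfoWarnings(infoText) {
  return infoText
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('WARNING:'))
    .map((line) => line.slice('WARNING:'.length).trim());
}

// Interpret the content of the ip_forward proc file: "1" means forwarding is on.
function ipForwardEnabled(procValue) {
  return procValue.trim() === '1';
}

// Read the live value on a Linux host (not called in this sketch).
function readIpForward(path = '/proc/sys/net/ipv4/ip_forward') {
  return ipForwardEnabled(fs.readFileSync(path, 'utf8'));
}

// Example with made-up input:
//   dockerInfoWarnings('Containers: 3\nWARNING: No swap limit support')
//   ipForwardEnabled('0\n')   // false: the PEBKAC case from the slide
```

A deploy pipeline could refuse to schedule onto a host where the warning list is non-empty or forwarding is off. For the swap limit warning specifically, the Docker documentation points to enabling cgroup swap accounting with the kernel boot parameters `cgroup_enable=memory swapaccount=1`.
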
  21. 21. Service risk mitigation
          ● Number of nodes in cluster
          ● Spreading policies
          ● Multiple instances
          ● Memory limitations
          ● Healing policies
            ○ Autorestart
            ○ Reschedule
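
To make "spreading policies" and "multiple instances" concrete, here is a toy placement function. It is an illustration of the idea only, not Docker Swarm's scheduler: it prefers healthy nodes that run the fewest instances of the same service and breaks ties by total container count, so one host failure is less likely to take down every instance. The node shape and the name `pickNode` are assumptions made for this sketch.

```javascript
// nodes: [{ name: string, healthy: boolean, containers: [{ service: string }] }]
// Returns the node on which the next instance of `service` should be placed.
function pickNode(nodes, service) {
  const candidates = nodes.filter((node) => node.healthy);
  if (candidates.length === 0) {
    throw new Error('no healthy nodes available');
  }

  const score = (node) => ({
    sameService: node.containers.filter((c) => c.service === service).length,
    total: node.containers.length,
  });

  return candidates.reduce((best, node) => {
    const a = score(node);
    const b = score(best);
    // First criterion: spread instances of the same service across nodes.
    if (a.sameService !== b.sameService) {
      return a.sameService < b.sameService ? node : best;
    }
    // Second criterion: prefer the less loaded node; on a tie keep the earlier one.
    return a.total < b.total ? node : best;
  });
}
```

Rescheduling after a node failure then amounts to marking the node unhealthy and calling the same function again for each container it was running.
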
  22. 22. Gains
          ● Applications evolved to be generic enough to run in multiple regions and environments
          ● Delivery time from idea to live: from 2 weeks to 1 day
          ● Servers vs. services: they can be managed asynchronously
  23. 23. Statistics
          ● ~70 in-house built Dockerized services
          ● ~90 Docker images
          ● ~500 containers running
          ● 3200 container deploys since October
  24. 24. Remember: every day @Pipedrive
          ● 1 new container is born to stay
          ● 30 container deployments
  25. 25. Recommendations for going live with Docker
          ● You still need to take care of the OS
          ● Read GitHub issues
          ● Read from the source
          ● Keep it up to date
          ● (Performance) Test it
  26. 26. Thank you! Give me your feedback @rreinurm
