2. Agenda
- SAW Product and our DevOps Challenges
- Docker as a solution to our challenges
- Docker CI Pipeline
- Local Dev Docker Deployment
- SND: Single Node Docker Deployment
- AWS & Public Cloud
- SNB: Single Node Docker Based Build
8. SAW DevOps challenges given our architecture
- Our architecture is advanced and complex (many components)
- In the past (before Docker and today's automated deployment with Ansible) we performed many manual steps
- Dev Env: the challenge is to hide the infra complexity from developers, making them independent of shared infra
- CI Build faces big challenges at its scale
10. We began to use Docker ~2.5 years ago
Today we use Docker as a solution in 4 areas:
- Local Developer Docker Deployment
- Single Node Docker Deployment internal farms
- AWS Docker Deployment
- CI: Docker Based Build
11. Docker Solution #1 (of 4):
Local Developer Docker Deployment
- First, we used Docker for local deployment on developer machines, so that developers avoid “infra noise” while they work
- This made a huge difference in our R&D efficiency
12. Docker Solution #2 (of 4):
Single Node Docker Deployment internal farms
- Second, we created a farm of Single Node Docker deployments that we use for many e2e use cases:
  - Deployment on feature branches
  - Deployment for bug hunts and regressions
  - Deployment for PMs, Discover etc.
13. Docker Solution #3 (of 4):
AWS Docker Deployment
- Third we began to use Docker deployment for Public Clouds
- Started with AWS
- Used today for internal users only
14. Docker Solution #4 (of 4):
CI: Docker Based Build
- We’ve implemented Docker for CI builds.
- We provision dedicated Docker infrastructure services for each build
- We maintain a unified infrastructure across development, build & deployment environments.
16. Our Docker Images…
- We have 16 Infra & App images, deployed as 32 container instances:
- Infra images such as: redis, idol, elasticsearch, mongo, postgres etc.
- App images such as: tomcat, gateway, lcm, platform, saw, ui etc.
- In addition, we have Base images such as JDK, Consul-template etc.
- And last, we have Utilities images such as Provision, Selenium etc.
17. Pipeline to create Docker images – our flow
Triggers → Build → Integration Test → Push to registry
18. Triggers
We have different triggers that can cause this flow to start:
SCM change:
– Change in a Dockerfile (e.g. a PostgreSQL version upgrade)
– New container added to the build (e.g. the PPO container)
– Change to the Vagrant flow
Other build:
– SAW build
– Docker base image build
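The SCM triggers above amount to a path filter over the files changed in a commit. A minimal Python sketch of that idea follows. The patterns (`*/Dockerfile`, `containers/*`, `Vagrantfile`) and the function name `should_trigger` are hypothetical, since the repository layout is not described in these slides.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns; the actual repository layout is an assumption.
TRIGGER_PATTERNS = [
    "*/Dockerfile",   # change in a Dockerfile (e.g. a PostgreSQL version upgrade)
    "containers/*",   # a new container added to the build
    "Vagrantfile",    # change to the Vagrant flow
]

def should_trigger(changed_paths):
    """Return True if any changed path matches one of the trigger patterns."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in TRIGGER_PATTERNS
    )
```

For example, `should_trigger(["containers/postgres/Dockerfile"])` is true, while a commit that only touches `docs/readme.md` would not start the flow.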
19. Build
Docker build scripts are written in Gradle using the Docker API
The build lifecycle:
- Build Docker images from Dockerfiles
- Create and run a container
- Run unit tests for the container
(e.g. test the connection to the Tomcat port on the tomcat container)
- Push the image to the repository (to dev at this stage)
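The container unit test mentioned above (checking that the Tomcat port answers) can be sketched as a plain TCP connection check. This is an illustration in Python, not the Gradle-based test the team actually uses; the function name `port_is_open` is hypothetical.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Assuming the container publishes Tomcat's default HTTP port, a check would look like `port_is_open("localhost", 8080)`.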
20. Integration test
Running the integration tests:
- Call vagrant up on VirtualBox (validation for developers)
  - Pull Docker images from the registry
  - Run the whole farm on that VM
  - Run tests
- Call vagrant up on the Manage host (validation for SND)
  - Pull Docker images from the registry
  - Run the whole farm on that VM
  - Run tests
21. Push
Push Images to registry
- Call gradle build to push images
- HP prod registry
- AWS registry
- Storage in S3
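Pushing the same image to several registries comes down to producing one fully qualified reference per target. A minimal Python sketch is below. The registry host names are placeholders, since the real HP prod and AWS registry addresses are not given here, and `image_refs` is a hypothetical helper.

```python
# Placeholder registry hosts; the real addresses are not stated in the slides.
REGISTRIES = {
    "prod": "registry.prod.example.com",
    "aws": "registry.aws.example.com",
}

def image_refs(name, tag, registries=REGISTRIES):
    """Build one fully qualified image reference per target registry."""
    return [f"{host}/{name}:{tag}" for host in registries.values()]
```

Each returned reference is what a tag-and-push step would operate on, whichever tool performs the push.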
22. Docker CI Pipeline – Summary
[Diagram. Labels: Infra Build and Push; Maas Build and Push; Dev Registry; Triggered Integration test (Vagrant provision); Infra Push; Maas Push; Prod Registry; AWS Registry]
24. AWS Deployment for SAW
- As we said, we have 16 images, deployed as 32 container instances (HA)
- Provisioning the infrastructure of a farm takes ~15 min.
- We provision each new farm as a VPC using Terraform.
- Deploying SAW on this farm with Ansible takes ~1h, and we keep improving it
- Auto registration of farms in public DNS
25. Deployment process in AWS Flow
[Flow diagram. Labels: Provision container; Jenkins run; Terraform create VPC and all AWS resources; Manage host; Ansible playbook: orchestrate containers (pull and run); Registrator; Use S3 storage end point; Copy Ansible resources; Instance with Docker service for Infra and SAW; Run Ansible playbooks (Paas, Infra, Nfs, Maas); VPC]
26. Terraform
Deploy AWS farm resources
- VPC
- Subnets
- Route tables
- Instances
- Security Group
- Route 53 DNS
- Registry S3 storage end point
Jenkins run → Terraform → Copy Ansible → Run Ansible → VPC
27. Ansible playbooks
Deployment and orchestration of:
Maas Dockers using 4 playbooks…
– PAAS: deploy all PAAS containers on all Docker servers
  – Consul, registrator, logstash-agent, monitor-agent
– INFRA: deploy infra containers on the instances with the relevant tags
  – Databases, ….
– NFS: create the NFS cluster and mount it on the relevant instances
– MAAS: deploy MAAS containers
  – Create initialized data, test tenants.
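The four playbooks can be driven as an ordered list of `ansible-playbook` invocations. The Python sketch below only builds the command lines. The playbook file names and the inventory path are hypothetical, and running them in the listed order (PAAS, INFRA, NFS, MAAS) is an assumption based on how the slide lists them.

```python
# Hypothetical playbook file names, in the order the slide lists them.
PLAYBOOKS = ["paas.yml", "infra.yml", "nfs.yml", "maas.yml"]

def playbook_commands(inventory, playbooks=PLAYBOOKS):
    """Return one ansible-playbook command (as an argument list) per playbook, in order."""
    return [["ansible-playbook", "-i", inventory, playbook] for playbook in playbooks]
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` so that a failing playbook stops the sequence before the next one starts.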
28. Deployment process in AWS Flow – Finally we have a VPC ready
30. CI Build Facts
- We have 30 build servers (32 CPU, 128GB RAM, 500GB storage)
- Our CI build takes 1 hour
- We’re running over 100 builds a day
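A back-of-the-envelope reading of these figures, written as Python arithmetic. It assumes one build per server at a time and builds spread evenly over 24 hours; both are simplifying assumptions, and "over 100 builds" is treated as a lower bound of 100.

```python
SERVERS = 30
BUILDS_PER_DAY = 100   # "over 100" on the slide, so this is a lower bound
BUILD_HOURS = 1.0      # duration of one CI build

# Total build time needed per day versus total server time available per day.
build_hours_per_day = BUILDS_PER_DAY * BUILD_HOURS         # 100.0
server_hours_per_day = SERVERS * 24                        # 720
utilization = build_hours_per_day / server_hours_per_day   # about 0.139
builds_per_server = BUILDS_PER_DAY / SERVERS               # about 3.33
```

Under these assumptions the farm is busy roughly 14% of the day on average; real load is likely concentrated in working hours, so peak utilization would be higher.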
31. Motivation for Single Node Build
- Provide isolated environment for each build
- Reduce build time
- Improve build stability
- Simplify troubleshooting and reduce maintenance effort
32. Docker based Build CI Flow:
Git Push → Vagrant up → Compilation → Start Server → Integration Tests → Upload to Nexus
33. SNB: Build Server Configuration
[Diagram: Build Server with Vertica, Platform, Gateway, Nginx, IDOL, MongoDB, PostgreSQL, Openfire, Redis, RabbitMQ, SMTP Server, HAProxy, Kibana, ElasticSearch, Logstash, Consul, Registrator, cAdvisor, Integration Test]
- Each build server is dedicated to a single build.
- The build server runs all compilation and runtime processes.
- Infrastructure processes run in Docker containers on the same server.
- Server load is regulated by the number of running threads.
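Regulating load by the number of running threads can be illustrated with a bounded thread pool. This Python sketch is a generic example of the technique, not the build system's actual mechanism; `run_bounded` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def run_bounded(tasks, max_threads):
    """Run callables with at most max_threads executing at once.

    Results are returned in the same order as the submitted tasks.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Lowering `max_threads` trades build time for lower CPU and memory pressure on the server, which is the knob the last bullet describes.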