5. 5
The Goal – Restated as Solvable
Build a continuous delivery pipeline for the Trulia Mobile API that
is usable by all stakeholders.
6. 6
The Problem(s) – Dev Version
- my code works on the shared dev host, but not on prod
- no real visibility into what is happening in prod
- troubleshooting is difficult
- the Ops team is not helpful
7. 7
The Problem(s) – QA Version
- code tested in QA doesn’t work in prod
- inability to test multiple builds at the same time
- no shared language to bridge the Dev/Ops teams
- the Ops team is not helpful
8. 8
The Problem(s) – Ops Version
- Dev/Stage environments are inconsistent
- Prod environment is unreproducible
- Files are copied around in prod
- Incoming requests are difficult to parse
9. 9
The Problems – Stated as Solvable
- Need to build a common language (culture)
- Need to build a reproducible platform in all environments (tech)
- Need to provide automation and visibility tools (tech/culture)
11. 11
Docker
- Build a reproducible/immutable(ish) platform
- Control Application dependencies
- Automated build capabilities
- Low overhead compared to virtualization
- Stateless application
12. 12
Step 1 / Base Image
- Packer instead of Dockerfiles
- Puppet to build container
- Build on Jenkins
- Vagrant option available
- Tagged with latest
- Pushed to our Docker registry
13. 13
Step 2 / Develop Locally
- create separate run directories per environment
- modules per environment
- consul_shared
14. 14
Local Terraform
- Sets up the docker container
- Sources variables
- calls the shared keys
- uses the run_location
22. 22
Communication
QA to Dev - “tcd-mobileapi(container) build 12 failed to pass smoke tests; can you please look at class foo?”
QA to Ops - “tcd-mobileapi(container) build 12 is having trouble connecting to the user database”
Ops to Dev - “after we rolled out tcd-mobileapi(container) build 12 we noticed the app_v1_userlookup(KPI) time doubled”
23. 23
Pipeline - Package Software
- Spin up a build container
- Mount the current directory
- Pull in dependencies
- Build a .deb with FPM
- Push to aptly
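The packaging steps above could be scripted roughly as follows. This is a minimal sketch, not the actual Jenkins job from the talk: the image name `build-image`, the install prefix, and the aptly repo name are hypothetical, and the dependency step is left out. The functions only assemble command lines; hand them to `subprocess.run` to execute.

```python
def fpm_in_container_cmd(name, version, workdir, image="build-image"):
    """Assemble a `docker run` command that packages workdir as a .deb with FPM.

    The image name and the /opt/<name> prefix are assumptions for illustration.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir}:/src",      # mount the current directory into the build container
        "-w", "/src",                 # run FPM from the mounted directory
        image,
        "fpm", "-s", "dir", "-t", "deb",
        "-n", name, "-v", str(version),
        "--prefix", f"/opt/{name}",
        ".",
    ]


def aptly_add_cmd(repo, deb_path):
    """Assemble the aptly command that adds a built package to a local repo."""
    return ["aptly", "repo", "add", repo, deb_path]
```

Using the build number as the package version keeps the package, the container, and the log fields tied to the same identifier.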
27. 27
Pipeline – Run in QATCD
- Spin up container in our QATCD Nomad cluster
- Run terraform to update all of the configurations in consul
- Set up credentials using Vault
- container is now available
http://tcd-mobileapi-10.qatcd.example.com
28. 28
Pipeline – Deploy Test
- health checks are crucial
  - needed for monitoring
  - needed for LB
  - needed for consul
  - get hit about 20 times/second
- engineer came up with the idea of deploy tests
  - only hit occasionally
  - more detailed
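The split between the two kinds of check can be sketched as below. This is an illustration under assumptions, not the team's implementation: the `app` dictionary, the `process_up` key, and the dependency probes are hypothetical. The health check stays cheap because it is polled constantly, while the deploy test is allowed to exercise real dependencies because it runs only occasionally.

```python
def health_check(app):
    """Cheap check, safe to poll many times per second."""
    return bool(app["process_up"])


def deploy_test(app, dependencies):
    """Detailed check meant to run occasionally, for example right after a deploy.

    `dependencies` maps a name to a zero-argument probe function.
    Returns a list of failure messages; an empty list means the test passed.
    """
    failures = []
    if not health_check(app):
        failures.append("process is not up")
    for name, probe in dependencies.items():
        try:
            if not probe():
                failures.append(f"{name}: probe returned false")
        except Exception as exc:  # a failing probe should be reported, not crash the test
            failures.append(f"{name}: {exc}")
    return failures
```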
29. 29
Pipeline – Smoke test
- Calls another Jenkins server
- Managed by the QA team
- Detailed application level test
30. 30
Pipeline - Repointer
- allows for static hostnames for applications or external testers
- does some checking
31. 31
Pipeline – Next Steps
1) Preprod environment
- Push configuration LIVE
- Run a single container with the newer version
- Other tests run
- Triggered by entering the build number in a Jenkins form and pushing a button
2) Release to Production
- Put a build number in a Jenkins form
- Only allowed if the build is on preprod
- Containers are rolled out with sleep and concurrency set
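The production release rule and the rollout with sleep and concurrency can be sketched like this. It is a simplified model under assumptions: the `deploy` callback, the function names, and the way the preprod build is passed in are all hypothetical, and a real rollout would go through the scheduler rather than a Python loop.

```python
import time


def rolling_batches(containers, concurrency):
    """Split containers into batches of at most `concurrency` items."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return [containers[i:i + concurrency] for i in range(0, len(containers), concurrency)]


def release(build, preprod_build, containers, concurrency, sleep_seconds,
            deploy, sleep=time.sleep):
    """Roll `build` out batch by batch, pausing between batches.

    The release is refused unless the same build is currently on preprod.
    """
    if build != preprod_build:
        raise PermissionError(f"build {build} is not on preprod")
    batches = rolling_batches(containers, concurrency)
    for index, batch in enumerate(batches):
        for container in batch:
            deploy(container, build)
        if index < len(batches) - 1:  # no need to wait after the final batch
            sleep(sleep_seconds)
    return batches
```

Passing `sleep` as a parameter makes the pause easy to replace in tests.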
39. 39
Vault / Consul Template
- Easily generate config files from key/value store
- Feature flags are easily implemented
- Store and filter Database credentials
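The idea of generating config files from a key/value store, feature flags included, can be illustrated with a small stand-in. Consul Template itself uses Go template syntax; Python's `string.Template` is swapped in here only to show the shape of the technique. The key paths, config fields, and the `feature_new_search` flag are hypothetical.

```python
from string import Template

# Hypothetical config file template; $names are filled from the key/value store.
CONFIG_TEMPLATE = Template(
    "db_host = $db_host\n"
    "db_user = $db_user\n"
    "new_search_enabled = $feature_new_search\n"
)


def render_config(kv):
    """Render the config from a flat mapping of key paths to values.

    Only the last path segment is used as the template variable name.
    A missing key raises KeyError, so an incomplete store fails loudly.
    """
    values = {key.rsplit("/", 1)[-1]: value for key, value in kv.items()}
    return CONFIG_TEMPLATE.substitute(values)
```

Flipping a feature flag then means changing one value in the store and re-rendering, with no code change or rebuild.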
40. 40
Logging
- Big challenge
- All Apache/Nginx logs include APPNAME/BUILD_NUMBER information and are in JSON format
- Application logs are in JSON format and often include unique IDs
- Stacktraces are fingerprinted
- Logstash picks up from the Nomad alloc dirs
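A rough sketch of a JSON log record with a fingerprinted stack trace follows. The slides do not describe how fingerprints are computed; the normalization below (masking addresses and numbers, then hashing) is one common approach and is an assumption, as are the field names.

```python
import hashlib
import json
import re


def fingerprint(stacktrace):
    """Hash a stack trace after masking details that vary between occurrences."""
    normalized = re.sub(r"0x[0-9a-fA-F]+", "ADDR", stacktrace)  # memory addresses
    normalized = re.sub(r"\d+", "N", normalized)                # line numbers, ids
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:12]


def log_record(appname, build_number, message, stacktrace=None, request_id=None):
    """Build one JSON log line carrying the app name and build number."""
    record = {"appname": appname, "build_number": build_number, "message": message}
    if request_id is not None:
        record["request_id"] = request_id
    if stacktrace is not None:
        record["stacktrace"] = stacktrace
        record["fingerprint"] = fingerprint(stacktrace)
    return json.dumps(record, sort_keys=True)
```

Masking numbers means the same error raised from slightly shifted line numbers across builds still groups under one fingerprint, at the cost of occasionally merging distinct errors.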
44. 44
Stats / KPIs
- Data is pulled from the logs and sent to statsd→influxdb with a Grafana front end
- Host and container level stats are picked up via cAdvisor
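Turning a JSON log line into a statsd timing metric might look like the sketch below. The log field names (`appname`, `kpi`, `duration_ms`) are hypothetical; the output uses the standard statsd timer line format `name:value|ms`.

```python
import json


def statsd_timing(log_line, metric_field="kpi", value_field="duration_ms"):
    """Convert one JSON log line into a statsd timer line, or None if not applicable."""
    record = json.loads(log_line)
    if "appname" not in record or metric_field not in record or value_field not in record:
        return None
    metric = f'{record["appname"]}.{record[metric_field]}'
    return f'{metric}:{record[value_field]}|ms'
```

The resulting line would typically be sent to the statsd daemon as a UDP datagram; the host and port depend on the deployment.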
47. 47
Troubleshooting
- Devs have exec access to all containers through Vault SSH
- This is audited
- After completion of any activities the container is terminated
48. 48
No silver bullets...
- Unit tests are slow
- Initial learning curve
- Docker on anything other than Linux is painful
- Apps need to be modified
- Less control for devs compared to old method
49. 49
Improvements
- Better troubleshooting tools
- Shared docker host for apps with heavy upstream dependencies
- More local services to make development easier
- Better training/support for desktop Docker issues
- More code libraries to handle common app issues
50. 50
Thanks
Kevin - AppDynamics
Sonal Joshi – Trulia Sr. Automation Engineer
Vincent Lam – Trulia Sr. Application Developer