Successful DevOps implementation for small teams: a true story

Definition
“A set of practices intended to reduce the time between committing a change to a system and the change being placed into normal production, while ensuring high quality”
Bass, Len; Weber, Ingo; Zhu, Liming. DevOps: A Software Architect's Perspective. ISBN 978-0134049847.

It’s not a new thing
Already established in the industry: tons of job offers confirm that
● automation, automation, automation
● containers eeeeverywheeereee
● The Cloud (i.e. someone else’s computer)

Simple test:
“are you DevOps level over 9000?”
● your answer to “how many servers do you have?” is “I have to check...”
● you do multiple production deployments each day
● your dev team can create a new (micro)service, along with all supporting components, without filing a ticket with the ops team
● you can terminate any random instance in your infrastructure and the environment will self-heal
● ... but let’s not even start on security-related topics

DON’TS
How not to do “DevOps”
● Post a job description for “DevOps Engineer” and hire a few
● Put them “on call”
● Push developers away from interacting directly with the environment

Effect?
Apart from low velocity and low quality, you will get requests like these:
● “Hey, can you send me logs from my service?”
● “Heey, can you purge Redis for me on staging?”
● “Heeey, I clicked deploy on Jenkins and it’s stuck, HALP”
● “Heeeeeeeeeeeeeey….”

DO’S
● Do enable developers
● Streamline the deployment process
● Streamline infrastructure management
● Guide, advise, discuss
● Hide complexity, but not too much
● Treat yourself as a service provider: deliver products, not tickets

It’s OK to hire a DevOps engineer
Brings experience and specialized focus
● Communication skills are super important here
● Tech requirements: good *nix skills, good Google skills and a sixth sense for sniffing out bad practices
● Probably the first person to handle security in your new startup

Starting point
❌
● Production environment: two servers, a dozen microservices
● Everything spun up manually through the AWS Console
● Deployment meant ssh’ing to a server, downloading the new Docker image, then stop+start (incurring downtime)
● Monitoring? Just CloudWatch logs
✅
● Spring Boot + Spring Cloud (Netflix)
● Dockerized, built on Jenkins
● Configured via environment variables
● Stateless
● Use of AWS
● Use of managed services
● Most important thing: a competent development team, eager to innovate 🚀

1. Kubernetes
Fixing error-prone deployments
● batteries-included approach
● documentation
○ courses, FAQs, examples
● popular
● reasonably sane
○ apart from the millicores concept and a few others ;-)
● Lots of progress in the past ~2 years
○ stable
○ reliable
○ lots of know-how
○ lots of lessons learned
○ powerful CLI
Kubernetes cont’d
Tons of tools on top of it
● Helm
● Spinnaker
● Jenkins integrations
● Operators for complex deployments
● Monitoring stack
● Cloud offerings (GKE, EKS, Azure)

YAAAML 😱
● 200-400 lines of YAML to describe a service...
● Secrets management...
● Even with Helm, deployment is a complex command
● Taints, tolerations, affinity, heap vs total memory, exposing ports, scraping metrics ... and all of it has to stay consistent across a multitude of services
● Tooling versioning

Re: hide complexity, but not too much
● A Jenkins deployment job is nice and all, up until it stops working
● How can you expect proficiency with Kubernetes / kubectl if all developers ever do is push a Run button?
● Enable them by making it easy to use CLI tools
○ Package Helm, helm-secrets and helm-diff together with binaries, configs and a ./setup.sh script for easy installation
○ Create one template for all services, supporting the most common configurations
○ Add yet another abstraction layer for the most common tasks
Demo: qp
~200 lines of Bash script as an abstraction layer on top of Helm

DevOps == Collaboration
● Example: monitor performance of all microservices
○ Example stack: Prometheus via Prometheus Operator
○ Add ServiceMonitor objects to each deployment
● New application<->platform contract emerged: just expose Prometheus metrics on port N and you will see your service graphs in Grafana
○ Developers are responsible for adjusting their services to honor the new contract and for building domain-specific dashboards
● Good tools helped here: Kubernetes made it easy to deploy the stack, Spring made it easy to expose metrics
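
To make the contract concrete, here is a rough sketch of the payload a service returns when Prometheus scrapes it, using the Prometheus text exposition format. In the talk's setup Spring produces this output; the Python function, metric name and label below are hypothetical and only illustrate the shape of the data. Escaping of special characters in label values is left out for brevity.

```python
from typing import Dict


def render_counter(name: str, help_text: str,
                   samples: Dict[str, float], label: str = "endpoint") -> str:
    """Render one counter metric in the Prometheus text exposition format.

    `samples` maps a label value (here: an endpoint path) to the current
    counter value. Label values are assumed to need no escaping.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for label_value, value in sorted(samples.items()):
        lines.append(f'{name}{{{label}="{label_value}"}} {value}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_counter("http_requests_total", "Total HTTP requests.",
                         {"/orders": 3, "/health": 120}), end="")
```

Anything that serves this text over HTTP on the agreed port satisfies the platform side of the contract, regardless of the framework behind it.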

2. Infrastructure as Code
Terraform + Atlantis
● Git-versioned infrastructure
● Migrate/move or import existing resources
● Set up Atlantis for audited and peer-reviewed infrastructure changes
● Use the same tools to detect state drift (changes that were made outside of the Atlantis flow)
● Optionally remove user permissions so that changes must go through Pull Requests
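
One possible way to check for drift, sketched under assumptions: `terraform plan -detailed-exitcode` exits with 0 when the plan is empty, 2 when it contains changes and 1 on error, so a scheduled job can run it against the main branch and alert on 2. The function names and the `workdir` argument are hypothetical, and this is not necessarily how the author's setup does it.

```python
import subprocess


def interpret_plan_exit_code(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status.

    0 = plan succeeded and is empty, 2 = plan succeeded and contains
    changes, anything else is treated as an error.
    """
    if code == 0:
        return "in-sync"
    if code == 2:
        return "drift"
    return "error"


def check_drift(workdir: str) -> str:
    """Run a plan in `workdir` (a hypothetical, already initialised
    checkout of the Terraform repo) and report whether the real
    infrastructure still matches the code."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(result.returncode)
```

Note that a non-empty plan also appears when code was merged but not yet applied, so "drift" here means any difference between the repository and reality.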

Terraform
Declarative infrastructure management
● Define AWS resources
○ Readable syntax
○ Combine multiple resources into a reusable module
● Plan
○ Compare definition with current state
○ Display detailed changeset
● Apply
○ Make changes to infrastructure
○ Record state
● Team workflow supported
○ State in AWS S3
○ Locks in DynamoDB
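
The plan/apply cycle above can be illustrated with a toy model. This is not Terraform's implementation: it is a simplified sketch in which both the definition and the recorded state are plain dictionaries keyed by resource address, and the resource names are made up for the example.

```python
from typing import Any, Dict, List, Tuple

Resources = Dict[str, Dict[str, Any]]


def plan(desired: Resources, state: Resources) -> List[Tuple[str, str]]:
    """Compare the definition with the recorded state and list the changes."""
    changes = []
    for address in sorted(desired.keys() | state.keys()):
        if address not in state:
            changes.append(("create", address))
        elif address not in desired:
            changes.append(("destroy", address))
        elif desired[address] != state[address]:
            changes.append(("update", address))
    return changes


def apply(desired: Resources, state: Resources) -> Resources:
    """Pretend to change the infrastructure, then record the new state.

    The previous state is ignored in this toy model; a real tool would use
    it to decide which API calls to make.
    """
    return {address: dict(attrs) for address, attrs in desired.items()}


if __name__ == "__main__":
    desired = {"aws_s3_bucket.logs": {"acl": "private"},
               "aws_instance.api": {"type": "t3.large"}}
    state = {"aws_instance.api": {"type": "t3.small"},
             "aws_sqs_queue.old": {}}
    for action, address in plan(desired, state):
        print(action, address)
    state = apply(desired, state)
    print("remaining changes:", plan(desired, state))
```

The useful property is that a plan computed right after an apply is empty, which is what makes the changeset shown in review trustworthy.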

Atlantis
Pull Requests for infrastructure
1. GitHub hook on each Pull Request to the Terraform repo
2. Additional layer of locking so no other PR can touch the same parts of infrastructure
3. Autoplan: show plan preview in PR comments
4. Review & approve the Pull Request
5. Apply changes
6. Remove locks and merge
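
The locking in steps 2 and 6 can be sketched as a small registry that maps each part of the infrastructure to the Pull Request currently holding it. This is a toy model under assumptions, not Atlantis code: the class name, the project names and the PR numbers are hypothetical.

```python
from typing import Dict, Optional


class ProjectLocks:
    """Toy model of per-project locks held by Pull Requests."""

    def __init__(self) -> None:
        self._holders: Dict[str, int] = {}

    def try_lock(self, project: str, pr: int) -> bool:
        """Lock `project` for `pr`. Succeeds if the project is free or
        already held by the same PR, fails if another PR holds it."""
        holder = self._holders.setdefault(project, pr)
        return holder == pr

    def holder(self, project: str) -> Optional[int]:
        """Return the PR number holding `project`, or None if it is free."""
        return self._holders.get(project)

    def release_all(self, pr: int) -> None:
        """Drop every lock held by `pr`, e.g. once it is merged."""
        self._holders = {p: h for p, h in self._holders.items() if h != pr}


if __name__ == "__main__":
    locks = ProjectLocks()
    print(locks.try_lock("network", 12))   # first PR takes the lock
    print(locks.try_lock("network", 15))   # second PR is refused
    locks.release_all(12)                  # PR 12 merged
    print(locks.try_lock("network", 15))   # now the second PR can plan
```

Holding the lock from plan until merge is what guarantees that the plan reviewed in the PR is still the plan that gets applied.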

Demo?
If time permits ;-)
If time doesn’t permit: shout-out to my friend Szymon W., who wrote a nice blog post about introducing Terraform and Atlantis across a whole company:
https://lab.getbase.com/terraform-base/

From my own experience
Cosmose:
● One “DevOps engineer”, seven contributors to the Terraform repo in a month, eleven now
● > 10 production deployments per day
● 3x more microservices since I joined (~6 months)
● Infrastructure autoscaled 10x one time, when a dev wanted to “speed up his processing task” ;-)
Base / Zendesk Sell:
● Around 8 Ops and 42 (!) contributors to the Terraform repo
● 30-50 deployments to prod daily
● High level of ownership in dev teams, including expertise in running databases (e.g. Elasticsearch, MySQL) and building their own infrastructure stacks (QA Kubernetes)

Thanks!
Jakub P. Głazik
zytek@nuxi.pl
github.com/zytek
Questions are more than welcome
