Self-Service Infrastructure is Useless if No One Knows How to Use It
Kevin Lynch
klynch@squarespace.com
● User submits a ticket
● Wait hours/days/weeks
● ...
● Profit?
● Waste of time
● Error prone
● Lack of user confidence
What is self-service infrastructure?
● Automated
● Repeatable
● Fast
● Understandable
How do we get there?
● Containers are cool!
● Amazing documentation
● Great fundamentals
○ Ephemeral Pods
○ Scalable Deployments
○ These things called StatefulSets
“Containers are cool!”
“How do I use this thing?”
“Is it reliable?”
Documentation!
Documentation!
Documentation!
● Everyone learns differently
○ Wiki docs
○ FAQs
○ Live Training Sessions
○ Recorded Sessions
○ Self-paced Labs
“Help! I’m trying to use super alpha feature X and it’s not working!”
“Can I put my database in Kubernetes?”
● Persistence in Kubernetes is hard, so maybe don’t do it yet?
● Only promise what you can support
○ Don’t just say no. Be clear why.
○ Builds trust with users
Clearly define expectations
“Okay! How do I create a VM for my database?”
Traditional Provisioning Process
● Provision
● Find Resources (CPU, RAM, Disk)
● Assign IP
● Configure VLAN
● Configure Firewall
● Update Ansible Inventory
● DNS Updates
● PXE Boot
● Install OS
● Configure OS
● Install App Dependencies
● Install App!
● Configure Monitoring
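To make the cost of this process concrete, here is a minimal sketch of the workflow as an ordered pipeline. The step names are copied from the slide (after the initial Provision box); everything else is an assumption for illustration: the names `PROVISIONING_STEPS`, `ProvisioningError`, `provision`, and the `run_step` hook are hypothetical, and the sketch assumes the steps run strictly in the order shown.

```python
from typing import Callable, List

# Step names are taken from the slide; running them in this order is an assumption.
PROVISIONING_STEPS: List[str] = [
    "Find resources (CPU, RAM, disk)",
    "Assign IP",
    "Configure VLAN",
    "Configure firewall",
    "Update Ansible inventory",
    "DNS updates",
    "PXE boot",
    "Install OS",
    "Configure OS",
    "Install app dependencies",
    "Install app",
    "Configure monitoring",
]


class ProvisioningError(Exception):
    """Raised when a step fails; records the failed step and what finished."""

    def __init__(self, step: str, completed: List[str]):
        super().__init__(f"provisioning failed at step: {step}")
        self.step = step
        self.completed = completed


def provision(hostname: str, run_step: Callable[[str, str], bool]) -> List[str]:
    """Run every step in order for one host, stopping at the first failure.

    run_step(hostname, step) is a hypothetical hook that performs the real
    work and returns True on success.
    """
    completed: List[str] = []
    for step in PROVISIONING_STEPS:
        if not run_step(hostname, step):
            raise ProvisioningError(step, completed)
        completed.append(step)
    return completed


if __name__ == "__main__":
    # Dry run: every step "succeeds" and is only printed.
    def dry_run(hostname: str, step: str) -> bool:
        print(f"[{hostname}] {step}")
        return True

    provision("postgres-0", dry_run)
```

Each of the twelve steps is a place where a manual process can stall or go wrong, which is the argument for automating the whole chain behind a single request.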
● Let product engineers focus on the product!
● Hide complexity when possible
● We’re the infrastructure experts
Provide sane defaults
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  replicas: 2
  template:
    spec:
      containers:
      - image: postgres:latest
        resources:
          requests:
            cpu: 8
            memory: 16Gi
Provide familiar interfaces
● Answers:
○ What is it?
○ How many?
○ How big?
Simple operations
● kubectl scale statefulset postgres --replicas 4
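One way to combine sane defaults with the three questions above (what is it, how many, how big) is a small helper that expands the user's answers into a full manifest. This is only a sketch under stated assumptions: `DEFAULTS` and `stateful_set` are hypothetical names, the default values are illustrative rather than Squarespace's, and the manifest assumes a headless Service with the same name already exists. The output is JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json
from typing import Any, Dict

# Hypothetical platform defaults; the values are illustrative only.
DEFAULTS: Dict[str, Any] = {"replicas": 1, "cpu": "1", "memory": "1Gi"}


def stateful_set(name: str, image: str, **overrides: Any) -> Dict[str, Any]:
    """Build an apps/v1 StatefulSet manifest from a few answers plus defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Only expose the settings the platform team is prepared to support.
        raise ValueError(f"unsupported settings: {sorted(unknown)}")
    cfg = {**DEFAULTS, **overrides}
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumes a headless Service with this name
            "replicas": cfg["replicas"],
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "resources": {
                                "requests": {
                                    "cpu": str(cfg["cpu"]),
                                    "memory": cfg["memory"],
                                }
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = stateful_set(
        "postgres", "postgres:latest", replicas=2, cpu=8, memory="16Gi"
    )
    print(json.dumps(manifest, indent=2))
```

The user answers three questions; the selector, labels, and service name are filled in by the helper, so the complexity stays with the infrastructure team.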
apiVersion: vm.squarespace.net/v1
kind: VirtualMachineStatefulSet
metadata:
  name: postgres
spec:
  type: vmware
  replicas: 2
  resources:
    requests:
      cpu: 8
      memory: 16Gi
Provide familiar interfaces
● Assumes users:
○ Understand it?
○ Like the interface?
● Get feedback!
Simple operations
● vmctl scale statefulset postgres --replicas 4
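The slides do not show how a scale request is turned into VM operations, so the following is only a sketch of one plausible approach: compute which VMs to create or delete when the replica count changes. It borrows the ordinal naming and ordering that Kubernetes StatefulSets use for pods (name-0, name-1, ...; scale up in ascending order, scale down from the highest ordinal). The functions `vm_names` and `scale_plan` are hypothetical and are not part of `vmctl`.

```python
from typing import List, Tuple


def vm_names(name: str, replicas: int) -> List[str]:
    """Ordinal names in the style StatefulSets use for pods: name-0, name-1, ..."""
    return [f"{name}-{i}" for i in range(replicas)]


def scale_plan(name: str, current: int, desired: int) -> Tuple[List[str], List[str]]:
    """Return (to_create, to_delete) for moving from current to desired replicas.

    New VMs are listed in ascending ordinal order; surplus VMs are listed
    highest ordinal first, mirroring StatefulSet ordering.
    """
    if current < 0 or desired < 0:
        raise ValueError("replica counts must be non-negative")
    to_create = vm_names(name, desired)[current:]
    to_delete = vm_names(name, current)[desired:][::-1]
    return to_create, to_delete


if __name__ == "__main__":
    # Scaling postgres from 2 to 4 replicas, as in the command above.
    print(scale_plan("postgres", 2, 4))  # → (['postgres-2', 'postgres-3'], [])
    # Scaling back down removes the newest VMs first.
    print(scale_plan("postgres", 4, 2))  # → ([], ['postgres-3', 'postgres-2'])
```

Keeping the same naming and ordering rules as StatefulSets means users who already understand one interface can predict the behavior of the other.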
“Give me 10 more web servers!”
“And a database”
“Oh and replicate it…”
QUESTIONS?
Thank you!
squarespace.com/careers
Kevin Lynch
klynch@squarespace.com
@KevML
Self Service Infrastructure Is Useless if No One Knows How to Use It (SRE Dublin Meetup April 2018)

Editor's Notes

  • #3 Waste of time. Lack of user confidence. Customers don’t trust us to get the job done.
  • #6 Automation is only half the battle. Users need to be confident in the results. The common case is frictionless.
  • #9 Declarative infrastructure. Can commit in a code repository.
  • #11 New technology can be scary!
  • #14 Be transparent when you break those expectations.
  • #16 Currently takes about 30 minutes to provision a VM. Typical workflow for provisioning a VM at Squarespace. Install system dependencies (LDAP, NTP, CollectD, Sensu…). Install app dependencies (Java, Consul, Mongo-S…). Some optimizations to be made here.
  • #17 AWS users without systems backgrounds eventually become systems engineers. Google Cloud isn’t much better.
  • #18 What does all this mean? Provide consistent interfaces!
  • #20 Assumes: Users like Kubernetes. Maybe it’s too complex? Maybe it’s fine. What works for us may not work for you.
  • #21 Assumes: Users like Kubernetes. Maybe it’s too complex? Maybe it’s fine. What works for us may not work for you.