Vault in Production at Apptio
Lee Briggs
Snr Infrastructure Engineer
© 2016 Apptio, All rights reserved (v2.5)
$(whoami)
 Based in London
 Work for Apptio
 GitHub: https://github.com/jaxxstorm
 Twitter: https://twitter.com/briggsl
 Blog: https://www.leebriggs.co.uk
Apptio Infrastructure
Some Apptio numbers
 Almost 6000 unique “VMs”
 15 global “datacenters”
 Physical and AWS VPCs
 Hundreds of MySQL databases
 Over 3.5 petabytes of raw storage
 Over 178 TB of memory
 Over 170,000 CPU cores
“The (initial) problem”
How do we provide audited access to lots of MySQL instances?
Vault
 Vault provides:
 Audit logging
 MySQL Credential management
 High availability
 A secure way to store credentials
Vault
 What we needed to figure out
 How to deploy vault in 15 datacenters
 Automated, easily configurable
 How to connect several hundred databases to those vaults
 High availability
 Sane backups
 Make it easier than passing around passwords or looking in app config files
The journey
Step 1: Deploy Vault
 We already had consul in all DCs
 Spread across racks in DC
 Across AZs in AWS
 Connected using WAN federation
 We use Puppet for configuration management
 The puppet module takes care of download/install
 Connect to consul – HA backend
 This also provides us with TLS
 We deployed vault onto all consul servers
Step 2: Initialise Vault
 Automating this isn’t trivial
 Plaintext keys are bad
 By default, vault outputs plaintext unseal keys
 Solution: Use the GPG support
 We already used GPG to store encrypted files in git
 Using puppet + eyaml
 Also using git-crypt
 This way, the keys are protected by each user’s GPG private key
 We used the API to init vault in each DC
 We provide 7 GPG keys, and need 3 users to unseal a vault
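A minimal sketch of the init call described in Step 2, assuming Vault's HTTP API at PUT /v1/sys/init with the secret_shares, secret_threshold and pgp_keys fields. The Vault address, the placeholder keys and the helper name build_init_request are hypothetical and not from the talk. With pgp_keys set, Vault returns each unseal key encrypted to the matching public key instead of in plaintext.

```python
import base64
import json
import urllib.request


def build_init_request(vault_addr, pgp_public_keys, threshold=3):
    """Build (but do not send) a PUT /v1/sys/init request.

    pgp_public_keys: list of binary (non-armored) GPG public keys, one per
    key holder. Vault expects each of them base64-encoded.
    """
    payload = {
        "secret_shares": len(pgp_public_keys),  # 7 key holders in the talk
        "secret_threshold": threshold,          # 3 of them needed to unseal
        "pgp_keys": [
            base64.b64encode(key).decode("ascii") for key in pgp_public_keys
        ],
    }
    return urllib.request.Request(
        url=vault_addr.rstrip("/") + "/v1/sys/init",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical usage: seven placeholder byte strings stand in for real keys.
keys = [b"placeholder-public-key-%d" % i for i in range(7)]
req = build_init_request("https://vault.dc1.example.com:8200", keys)
init_body = json.loads(req.data)
```

Sending the request (for example with urllib.request.urlopen) is left out on purpose, since init is a one-time operation per datacenter.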
Step 3: Unseal the Vault
 At this stage, we have around 60 instances of vault to unseal...
 Doing this “manually” is obviously not tenable
 Automating this is dangerous...
Unseal
 https://github.com/jaxxstorm/unseal
 Add your vault servers to a config file
 Add your encrypted unseal key
 You can also put the plaintext key, but don’t!
 Prompts for your GPG keyring password
 If you’re running GPG agent, this is a security risk...
 Unseals all vaults
 Each unseal command runs in a goroutine
 Can send unseal command to 75 vaults in around 15s!
Unseal Demo
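The unseal tool is written in Go and uses goroutines. The sketch below is not its code; it only illustrates the same fan-out idea with Python threads, assuming Vault's PUT /v1/sys/unseal endpoint. The function names, the worker count and the injectable send callable are hypothetical, and decrypting the key share with GPG is left out.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def send_unseal(vault_addr, key):
    """Send one unseal key share to one Vault server (PUT /v1/sys/unseal)."""
    req = urllib.request.Request(
        url=vault_addr.rstrip("/") + "/v1/sys/unseal",
        data=json.dumps({"key": key}).encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def unseal_all(vault_addrs, key, send=send_unseal, workers=16):
    """Submit the same key share to every server concurrently.

    Returns {address: response dict, or the exception that was raised}, so
    one unreachable server does not stop the others.
    """
    def attempt(addr):
        try:
            return addr, send(addr, key)
        except Exception as exc:  # record the failure and keep going
            return addr, exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(attempt, vault_addrs))
```

Each of the three required users would run this with their own decrypted key share; a vault only unseals once the threshold is reached.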
Step 4: Configure the vault
 We now need to add some configuration for all DCs
 Answers
 https://github.com/UKHomeOffice/vaultctl
 https://www.hashicorp.com/blog/codifying-vault-policies-and-configuration/
 Allows you to define the vault config in yaml
 Can then run vaultctl to configure your vault server as you require
 Enable LDAP with config
 Enable audit logging
 Enable MySQL backend
 We run this in a loop for all DCs
 Only need to hit a single vault server in each DC
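vaultctl applies this configuration from YAML. As a rough illustration of what "run in a loop for all DCs" amounts to, the sketch below lists the equivalent raw API calls instead, assuming Vault's /v1/sys/auth/<path> and /v1/sys/audit/<path> endpoints. It is not vaultctl's format; the audit log path, the datacenter names and both function names are hypothetical.

```python
def config_plan(audit_log_path="/var/log/vault/audit.log"):
    """Return the API calls that enable LDAP auth and file audit logging."""
    return [
        ("POST", "/v1/sys/auth/ldap", {"type": "ldap"}),
        ("PUT", "/v1/sys/audit/file",
         {"type": "file", "options": {"file_path": audit_log_path}}),
    ]


def plan_for_all_dcs(vault_by_dc):
    """Expand the plan for one Vault address per datacenter.

    vault_by_dc maps a datacenter name to a single Vault server in it; one
    server per DC is enough because the state lives in the shared backend.
    """
    calls = []
    for dc, addr in sorted(vault_by_dc.items()):
        for method, path, body in config_plan():
            calls.append((dc, method, addr.rstrip("/") + path, body))
    return calls
```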
Step 5: Add MySQL configuration
 We provision VMs using an internal tool, “selfserve”
 When VM is provisioned for DB
 Puppet runs, installs mysql
 Puppet adds a “vault” user with grants
 We then add roles to each DB config – readonly and full
 Selfserve makes an API call to that region’s vault, adding it as a backend
 Selfserve has its own token which has write permissions to the mysql backend using policy
 We mount all databases with path mysql/<hostname>
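A sketch of the calls a provisioning tool might make for one new database, assuming the legacy mysql secret backend (mounted via /v1/sys/mounts/<path>, configured via config/connection and roles/<name>). Selfserve is internal, so everything here is illustrative: the SQL grants, the password argument and the function name are assumptions, and only the mysql/<hostname> mount path, the “vault” user and the two role names come from the talk.

```python
# {{name}} and {{password}} are placeholders filled in by Vault at read time.
READONLY_SQL = (
    "CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}';"
    "GRANT SELECT ON *.* TO '{{name}}'@'%';"
)
FULL_SQL = (
    "CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}';"
    "GRANT ALL PRIVILEGES ON *.* TO '{{name}}'@'%';"
)


def mysql_backend_calls(hostname, vault_db_password, port=3306):
    """Return (path, body) pairs to POST, in order, for one database host."""
    mount = "mysql/" + hostname
    # Connection string for the "vault" MySQL user that Puppet created.
    dsn = "vault:%s@tcp(%s:%d)/" % (vault_db_password, hostname, port)
    return [
        ("/v1/sys/mounts/" + mount, {"type": "mysql"}),
        ("/v1/" + mount + "/config/connection", {"connection_url": dsn}),
        ("/v1/" + mount + "/roles/readonly", {"sql": READONLY_SQL}),
        ("/v1/" + mount + "/roles/full", {"sql": FULL_SQL}),
    ]
```

Each call would be sent with the provisioning tool's own token in the X-Vault-Token header.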
Step 6: Make logins easy
 Configure ldap auth with policies for customers mapped to LDAP groups
 Some people can get write access, some only get read access
 However, authing with ldap and then having to do vault read was difficult for users to remember
 Have to vault auth
 Then vault read <creds>
 Having to look this up when on-call isn’t fun if you don’t do it regularly
Breakglass
 A simple golang command line tool to automate the login process
 Prompts for your AD password, and you specify the mysql host you need
 It finds the correct vault endpoint using DNS forwarding, and then automatically drops you into a mysql shell
 Inspired by vault ssh
 It’s not currently open source, but hoping to have that done by end of Q3.
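Breakglass is not open source, so the following is only a sketch of the two-step flow it automates, not its implementation. It assumes Vault's LDAP login endpoint (/v1/auth/ldap/login/<username>) and the creds path of a mysql backend mounted at mysql/<hostname>. The function names and the injectable call parameter are hypothetical; the DNS lookup and launching the mysql shell are omitted.

```python
import json
import urllib.request


def http_json(method, url, body=None, token=None):
    """Small JSON-over-HTTP helper; sends X-Vault-Token when a token is given."""
    headers = {"X-Vault-Token": token} if token else {}
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def mysql_creds(vault_addr, username, password, db_host, role="readonly",
                call=http_json):
    """Log in with LDAP, then read short-lived MySQL credentials."""
    base = vault_addr.rstrip("/")
    # Step 1: the equivalent of "vault auth" against the LDAP backend.
    login = call("POST", "%s/v1/auth/ldap/login/%s" % (base, username),
                 body={"password": password})
    token = login["auth"]["client_token"]
    # Step 2: the equivalent of "vault read mysql/<hostname>/creds/<role>".
    creds = call("GET", "%s/v1/mysql/%s/creds/%s" % (base, db_host, role),
                 token=token)
    return creds["data"]["username"], creds["data"]["password"]
```

The returned pair could then be handed to a mysql client, which is the part that saves the on-call engineer from remembering the commands.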
Breakglass Demo
More Considerations
ACLs
 If you’re using consul as your backend, turn on ACLs!
 You should also block access to port 8500/8501 where possible
 Consul can be used extensively to pivot to RCE:
 http://www.kernelpicnic.net/2017/05/29/Pivoting-from-blind-SSRF-to-RCE-with-Hashicorp-Consul.html
 If you store your secrets in consul, don’t let someone delete them
 By default, the consul web api allows access to delete and modify any key
 This requires an investment in implementing tokens
 You can use vault to manage these!
Backups
 When we init vault, we use the key prefix “vault/$datacenter”
 Our DCs are completely distinct; we never share secrets between DCs
 We use consul snapshot to take backups
 Take them once per hour
 We copy them to another DC
 We test restores weekly
 Start vault on a different port
 Connect it to the existing consul with the “vault/$datacenter” prefix
 All done via ansible
 Have users unseal – users run when they come online
 Verify integrity
 Shut down
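A small sketch of the hourly backup step, assuming the consul CLI's "consul snapshot save <file>" command is on the PATH. The backup directory, the file naming scheme and both function names are hypothetical; copying the file to another DC and the weekly restore test are not shown.

```python
import datetime
import subprocess


def snapshot_command(datacenter, when, backup_dir="/var/backups/consul"):
    """Return the `consul snapshot save` command for one hourly backup."""
    # One file per datacenter per hour, e.g. dc1-20170620T1400.snap
    name = "%s-%s.snap" % (datacenter, when.strftime("%Y%m%dT%H00"))
    return ["consul", "snapshot", "save", backup_dir + "/" + name]


def take_snapshot(datacenter, run=subprocess.check_call):
    """Take a snapshot now; raises if the consul CLI exits non-zero."""
    now = datetime.datetime.now(datetime.timezone.utc)
    cmd = snapshot_command(datacenter, now)
    run(cmd)
    return cmd[-1]  # path of the snapshot file that was written
```

With ACLs turned on, the command also needs a token that is allowed to take snapshots.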
Lessons Learned
 Pick 1 thing and “vault it”
 Trying to secure all your secrets in vault straight away can be overwhelming
 We now store the majority of our secrets in vault after lessons learned from MySQL
 Have a good story for configuration, backups and unsealing
 Consul + Vault has a great HA story
 As long as you use consul’s service discovery of course
 “Automated” secret management has trade-offs
 Be aware of them
 Abstract away the user pain where possible
 Golang is great for cmdline tools!
 These packages use viper + cobra
 https://github.com/spf13/cobra
THANK YOU

London Hug 20/6 - Vault production

  • 1. Vault in Production at Apptio. Lee Briggs, Snr Infrastructure Engineer
  • 2. $(whoami)
      • Based in London
      • Work for Apptio
      • Github: https://github.com/jaxxstorm
      • Twitter: https://twitter.com/briggsl
      • Blog: https://www.leebriggs.co.uk
      © 2016 Apptio, All rights reserved (v2.5)
  • 4. Some Apptio numbers
      • Almost 6000 unique “VMs”
      • 15 global “datacenters”
      • Physical and AWS VPCs
      • Hundreds of MySQL databases
      • Over 3.5 petabytes of raw storage
      • Over 178 TB of memory
      • Over 170,000 CPU cores
  • 5. “The (initial) problem”: How do we provide audited access to lots of MySQL instances?
  • 6. Vault
      • Vault provides:
      • Audit logging
      • MySQL credential management
      • High availability
      • A secure way to store credentials
  • 7. Vault
      • What we needed to figure out
      • How to deploy vault in 15 datacenters
      • Automated, easily configurable
      • How to connect several hundred databases to those vaults
      • High availability
      • Sane backups
      • Make it easier than passing around passwords or looking in app config files
  • 9. Step 1: Deploy Vault
      • We already had consul in all DCs
      • Spread across racks in DC
      • Across AZs in AWS
      • Connected using WAN federation
      • We use Puppet for configuration management
      • The puppet module takes care of download/install
      • Connect to consul – HA backend
      • This also provides us with TLS
      • We deployed vault onto all consul servers
  • 10. Step 2: Initialise Vault
      • Automating this isn’t trivial
      • Plaintext keys are bad
      • By default, vault outputs plaintext unseal keys
      • Solution: Use the GPG support
      • We already used GPG to store encrypted files in git
      • Using puppet + eyaml
      • Also using git-crypt
      • This way, the keys are protected by each user’s GPG private key
      • We used the API to init vault in each DC
      • We provide 7 GPG keys, and need 3 users to unseal a vault
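
The 7-key, 3-user setup above can be sketched as the request body sent to Vault's `sys/init` endpoint. This is a minimal sketch, not the code used at Apptio: the `initRequest` type, the `newInitRequest` helper and the placeholder key strings are hypothetical. The JSON field names (`secret_shares`, `secret_threshold`, `pgp_keys`) are the ones Vault's init API accepts, and Vault expects one base64-encoded public key per share.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// initRequest mirrors the JSON body accepted by Vault's sys/init endpoint.
type initRequest struct {
	SecretShares    int      `json:"secret_shares"`
	SecretThreshold int      `json:"secret_threshold"`
	PGPKeys         []string `json:"pgp_keys"`
}

// newInitRequest (hypothetical helper) creates one share per PGP key and
// rejects thresholds that could never be met.
func newInitRequest(pgpKeys []string, threshold int) (initRequest, error) {
	if threshold < 1 || threshold > len(pgpKeys) {
		return initRequest{}, fmt.Errorf("threshold %d invalid for %d keys", threshold, len(pgpKeys))
	}
	return initRequest{
		SecretShares:    len(pgpKeys),
		SecretThreshold: threshold,
		PGPKeys:         pgpKeys,
	}, nil
}

func main() {
	// Placeholders: real values would be base64-encoded GPG public keys.
	keys := make([]string, 7)
	for i := range keys {
		keys[i] = fmt.Sprintf("BASE64_PUBLIC_KEY_%d", i+1)
	}

	req, err := newInitRequest(keys, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// This body would be sent with PUT to <vault address>/v1/sys/init.
	fmt.Println(string(body))
}
```

With PGP keys supplied, each unseal key in the response is encrypted to the matching public key, so no plaintext share is returned.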
  • 11. Step 3: Unseal the Vault
      • At this stage, we have around 60 instances of vault to unseal.
      • Doing this “manually” is obviously not tenable
      • Automating this is dangerous.
  • 12. Unseal
      • https://github.com/jaxxstorm/unseal
      • Add your vault servers to a config file
      • Add your encrypted unseal key
      • You can also put the plaintext key, but don’t!
      • Prompts for your GPG keyring password
      • If you’re running GPG agent, this is a security risk.
      • Unseals all vaults
      • Each unseal command runs in a goroutine
      • Can send unseal command to 75 vaults in around 15s!
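
The "one goroutine per unseal command" idea can be sketched as a simple fan-out. This is an illustrative sketch, not the code from the unseal repository: `unsealFunc`, `unsealAll`, `errUnreachable` and the example addresses are hypothetical, and the network call is stubbed out. A real implementation would send the decrypted key share as `{"key": "..."}` to each server's `/v1/sys/unseal` endpoint.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errUnreachable is a stand-in error used by the stub below.
var errUnreachable = errors.New("vault server unreachable")

// unsealFunc submits one unseal key share to one Vault server.
type unsealFunc func(addr, key string) error

// unsealAll calls unseal for every address concurrently and returns the
// failures keyed by address. An empty map means every call succeeded.
func unsealAll(addrs []string, key string, unseal unsealFunc) map[string]error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs = make(map[string]error)
	)
	for _, addr := range addrs {
		wg.Add(1)
		go func(a string) {
			defer wg.Done()
			if err := unseal(a, key); err != nil {
				mu.Lock()
				errs[a] = err
				mu.Unlock()
			}
		}(addr)
	}
	wg.Wait()
	return errs
}

func main() {
	addrs := []string{
		"https://vault-1.example:8200",
		"https://vault-2.example:8200",
	}
	// Stub: pretend the second server cannot be reached.
	stub := func(addr, key string) error {
		if addr == "https://vault-2.example:8200" {
			return errUnreachable
		}
		return nil
	}
	for addr, err := range unsealAll(addrs, "decrypted-key-share", stub) {
		fmt.Printf("%s: %v\n", addr, err)
	}
}
```

Because each call runs independently, total time is bounded by the slowest server rather than the sum of all calls, which is what makes unsealing dozens of vaults in seconds plausible.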
  • 14. Step 4: Configure the vault
      • We need to now add some configuration for all DCs
      • Answers
      • https://github.com/UKHomeOffice/vaultctl
      • https://www.hashicorp.com/blog/codifying-vault-policies-and-configuration/
      • Allows you to define the vault config in yaml
      • Can then run vaultctl to configure your vault server as you require
      • Enable LDAP with config
      • Enable audit logging
      • Enable MySQL backend
      • We run this in a loop for all DCs
      • Only need to hit a single vault server in each DC
  • 15. Step 5: Add MySQL configuration
      • We provision VMs using internal tool “selfserve”
      • When VM is provisioned for DB
      • Puppet runs, installs mysql
      • Puppet adds a “vault” user with grants
      • We then add roles to each DB config – readonly and full
      • Selfserve makes an API call to that region’s vault, adding it as a backend
      • Selfserve has its own token which has write permissions to the mysql backend using policy
      • We mount all databases with path mysql/<hostname>
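
The per-host mount layout can be sketched as the set of Vault API paths a provisioning tool would write to. This is an assumption-laden sketch, since selfserve is internal: `backendPaths` and `pathsFor` are hypothetical names, and the example hostname is made up. The `sys/mounts/<path>`, `config/connection` and `roles/<name>` paths follow the layout of Vault's legacy MySQL secrets backend, and the role names `readonly` and `full` come from the slide.

```go
package main

import (
	"fmt"
	"path"
)

// backendPaths holds the Vault API paths used to register one database.
type backendPaths struct {
	Mount      string            // where the backend is mounted
	Connection string            // where the connection settings are written
	Roles      map[string]string // role name -> role definition path
}

// pathsFor (hypothetical helper) derives the paths for a database host,
// mounting each host at mysql/<hostname>.
func pathsFor(hostname string) backendPaths {
	mount := path.Join("mysql", hostname)
	return backendPaths{
		Mount:      path.Join("/v1/sys/mounts", mount),
		Connection: path.Join("/v1", mount, "config/connection"),
		Roles: map[string]string{
			"readonly": path.Join("/v1", mount, "roles", "readonly"),
			"full":     path.Join("/v1", mount, "roles", "full"),
		},
	}
}

func main() {
	p := pathsFor("db01.example.com")
	fmt.Println("mount:     ", p.Mount)
	fmt.Println("connection:", p.Connection)
	fmt.Println("readonly:  ", p.Roles["readonly"])
	fmt.Println("full:      ", p.Roles["full"])
}
```

One mount per host keeps each database's connection settings and roles isolated, so a policy can grant access to a single `mysql/<hostname>` path.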
  • 16. Step 6: Make logins easy
      • Configure ldap auth with policies for customers mapped to LDAP groups
      • Some people can get write access, some only get read access
      • However, authing with ldap and then having to do vault write was difficult for users to remember
      • Have to vault auth
      • Then vault read <creds>
      • Having to look this up when on-call isn’t fun if you don’t do it regularly
  • 17. Breakglass
      • A simple golang command line tool to automate the login process
      • Prompts for your AD password, and you specify the mysql host you need
      • It finds the correct vault endpoint using DNS forwarding, and then automatically drops you into a mysql shell
      • Inspired by vault ssh
      • It’s not currently open source, but hoping to have that done by end of Q3.
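
The two manual steps a login tool would automate (authenticate against LDAP, then read credentials) can be sketched as two HTTP requests. Breakglass is not open source, so this is not its implementation: `ldapLoginRequest`, `credsRequest`, the addresses and the token value are hypothetical, and the sketch assumes the LDAP auth method is mounted at its default `ldap` path and the database at `mysql/<hostname>`. Nothing is sent over the network; the code only builds the requests.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// ldapLoginRequest builds the request that exchanges LDAP credentials for a
// Vault token (returned in the response's auth.client_token field).
func ldapLoginRequest(vaultAddr, username, password string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"password": password})
	if err != nil {
		return nil, err
	}
	endpoint := vaultAddr + "/v1/auth/ldap/login/" + url.PathEscape(username)
	return http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
}

// credsRequest builds the request that reads generated MySQL credentials for
// one role on one database host, authenticated with the Vault token.
func credsRequest(vaultAddr, token, dbHost, role string) (*http.Request, error) {
	endpoint := vaultAddr + "/v1/mysql/" + url.PathEscape(dbHost) + "/creds/" + url.PathEscape(role)
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-Vault-Token", token)
	return req, nil
}

func main() {
	const vaultAddr = "https://vault.example:8200" // hypothetical address

	login, err := ldapLoginRequest(vaultAddr, "alice", "not-a-real-password")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(login.Method, login.URL.Path)

	creds, err := credsRequest(vaultAddr, "token-from-login", "db01.example.com", "readonly")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(creds.Method, creds.URL.Path)
}
```

A wrapper tool would send the first request, pull the token out of the response, send the second, and hand the returned username and password to the mysql client.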
  • 20. ACLs
      • If you’re using consul as your backend, turn on ACLs!
      • You should also block access to port 8500/8501 where possible
      • Consul can be used extensively to pivot to RCE:
      • http://www.kernelpicnic.net/2017/05/29/Pivoting-from-blind-SSRF-to-RCE-with-Hashicorp-Consul.html
      • If you store your secrets in consul, don’t let someone delete them
      • By default, the consul web api allows access to delete and modify any key
      • This requires an investment in implementing tokens
      • You can use vault to manage these!
  • 21. Backups
      • When we init vault, we use the key prefix “vault/$datacenter”
      • Our DCs are completely distinct, we never share secrets between DCs
      • We use consul snapshot to take backups
      • Take them once per hour
      • We copy them to another DC
      • We test restores weekly
      • Start vault on a different port
      • Connect it to the existing consul with the “vault/$datacenter” prefix
      • All done via ansible
      • Have users unseal – users run when they come online
      • Verify integrity
      • Shutdown
  • 22. Lessons Learned
      • Pick 1 thing and “vault it”
      • Trying to secure all your secrets in vault straight away can be overwhelming
      • We now store the majority of our secrets in vault after lessons learned from MySQL
      • Have a good story for configuration, backups and unsealing
      • Consul + Vault has a great HA story
      • As long as you use consul’s service discovery of course
      • “Automated” secret management has trade-offs
      • Be aware of them
      • Abstract away the user pain where possible
      • Golang is great for cmdline tools!
      • These packages use viper + cobra
      • https://github.com/spf13/cobra