Charity Majors
@mipsytipsy
There and back again: a Chef tale
How we drank the Kool-Aid, sobered up, and
learned to cook responsibly.
Mobile apps platform
500k+ apps
AWS
MongoDB, Cassandra, MySQL, Redis
ruby & rails => golang
Our mission:
• Support relentless growth
• Ship products fast
• Solve mobile apps naively at scale
Active monthly Parse installations
API requests per second
our mission
your mission
Chef the Base System!!
• bootstrapping nodes with knife-ec2
• configuring system packages
• managing deb versions
• ec2 hostname tags from chef node names
• route53 DNS records from hostname tags
• cron jobs, batch jobs
Chef the Services!!
• haproxy configs
• generate yaml files
• generate host lists
• manage config files for Parse services
• monitoring and graphing based off roles
Chef the Databases!!
• creating/managing mongo replica sets
• provisioning & assembling RAID devices
• assigning cassandra initial tokens
• backups, snapshotting & restores
• community cookbooks for mysql, redis
Chef the Deploys!!
• deploy Parse services?
….??????
wait …
1) Things we did with chef badly
2) Things that chef was not the right tool for
mistakes were made …
• Overloading roles with too much work
• Confusion between role vs instantiation of service
• Using definitions instead of providers
• Using lots of data bags
• One attribute per config entry instead of a hash of all
entries
• Using knife search extensively
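The attribute bullet above is about Chef attributes, but the idea carries over to any templating setup. A minimal sketch in Python (not Chef code; the `haproxy_settings` name and its keys are made up for illustration) of rendering a config from one hash of all entries, so that adding a setting means adding a key rather than adding a new attribute and a new template line:

```python
def render_config(settings: dict) -> str:
    """Render every entry of one settings hash as a 'key value' line."""
    # Sorting keeps the output stable, so diffs only show real changes.
    lines = [f"{key} {value}" for key, value in sorted(settings.items())]
    return "\n".join(lines) + "\n"


# Hypothetical settings hash; the keys are illustrative only.
haproxy_settings = {"maxconn": 4096, "retries": 3}
print(render_config(haproxy_settings), end="")
```

The renderer never needs to know which entries exist, which is the point of passing a hash instead of one attribute per entry.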
mistakes were made …
• Forking + modifying community cookbooks
• Importing community cookbooks with too many
custom dependencies
• Not using repo-per-cookbook / Berkshelf
• Not investing the time into vagrant, unit tests, staging
environment, versioning
• Where is my source of truth?!
but these are all solvable
problems.
what isn’t?
sometimes, chef just
ain’t enough.
Problem areas
• Provisioning from scratch
• Service registration & discovery
• Managing software & configs
• Databases
Provisioning
bootstrapping from vanilla AMIs
launching instances with knife-ec2
Solution: bake AMI with chef, use ASGs
Service discovery
realtime search needs realtime data
Solution: zookeeper, consul, etcd, etc
avoid snowflake hosts
use distributed locking for cron jobs
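One way to picture distributed locking for cron jobs: every host schedules the same job, and whichever host wins the lock runs it while the others skip that run. The sketch below is an assumption-laden illustration, not the tooling from the talk. `InMemoryLock` and `run_cron_job` are hypothetical names, and the in-process lock only stands in for a real distributed lock (for example one held in ZooKeeper).

```python
import threading
from typing import Callable


class InMemoryLock:
    """Hypothetical stand-in for a distributed lock service."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        # Non-blocking: a host that loses the race skips this run.
        return self._lock.acquire(blocking=False)

    def release(self) -> None:
        self._lock.release()


def run_cron_job(lock: InMemoryLock, job: Callable[[], None]) -> bool:
    """Run the job only if this host wins the lock; report whether it ran."""
    if not lock.try_acquire():
        return False
    try:
        job()
        return True
    finally:
        # Always release, even if the job raises.
        lock.release()
```

Because no single host is special, the job keeps running when any one host disappears, which fits the "avoid snowflake hosts" advice.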
Managing software & configs
• System software (debs, rpms)
• Developer-owned services
• Internal operations software
Managing software & configs
System software
Managing software & configs
Developer-owned services
• Do not tie code deploys to system changes
• Perform the minimal set of changes
• Configs *are* software. Version together.
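"Version together" can be read as: a release is identified by its code and its config at once, so changing either one produces a new release. A minimal sketch under that assumption (the `release_id` helper and the identifier format are invented for illustration, not taken from the talk):

```python
import hashlib
import json


def release_id(code_version: str, config: dict) -> str:
    """Derive one identifier from the code version plus the config content."""
    # Canonical JSON so key order does not change the digest.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return f"{code_version}+cfg.{digest}"
```

With this scheme a config-only change still rolls out as a new, traceable release instead of an untracked edit on a host.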
Managing software & configs
Internal operations software
• Treat software engineering like software
engineering
• Treat systems-y packages like systems
packages
• Package and version “util” scripts
• Manage package versions with Chef
Databases at scale
Databases
DBA operations
Not really what chef is best at.
Imperative commands
Automatic remediation
Coordinating actions across nodes
Databases
DBA operations
• Create, tear down replica sets or nodes
• Verify backups
• Rolling version upgrade
• Elect new primary / switch masters
• Enable/disable query killer
• Change schemas or indexes
• Compaction, rotation
• Version replica set state
• Etc
Databases
DBA operations
If you don’t have to do a ton of DBA
ops, Chef can manage databases.
Don’t over-engineer in advance of
your actual needs.
Databases
Separation of configuration and state
Base system => chef
Detect and publish state changes => chef, zk
Generate monitoring configs => chef
Imperative commands => db tooling
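The "detect and publish state changes" row can be sketched as a compare-then-write step: observe the current state, and publish it only when it differs from what was last published. This is an illustration under stated assumptions, not the implementation from the talk. `StateStore` is a hypothetical in-memory stand-in for a coordination store such as ZooKeeper, and the key and state fields are made up.

```python
from typing import Optional


class StateStore:
    """Hypothetical stand-in for a coordination store."""

    def __init__(self) -> None:
        self._data: dict = {}

    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

    def put(self, key: str, value: dict) -> None:
        # Store a copy so later mutation by the caller is not published.
        self._data[key] = dict(value)


def publish_if_changed(store: StateStore, key: str, observed: dict) -> bool:
    """Publish the observed state only when it differs from the stored one."""
    if store.get(key) == observed:
        return False
    store.put(key, observed)
    return True
```

Consumers then react to published changes, while imperative actions (failovers, upgrades) stay in dedicated db tooling.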
Databases at scale
We chef for:
• Building base AMIs
• Generating monitoring configs
• Storing encrypted secrets
• Cron jobs (with zk lock)
• Inferring and publishing db state changes
Things we still suck at
• Single source of truth (git / chef-server)
• Isolated staging environment
• Full continuous testing for cookbooks
Things we don’t chef
• Realtime data
• Internal software packaging & management
• Database administration at scale