At Ninefold we've spent 3+ years with Chef. We've built a PaaS with Chef and we manage our internal systems with it.
In this presentation we explore the design decisions we needed to make in order to build the platform. It highlights the things we've learned along the way that weren't exactly obvious when we started.
Building a PaaS using Chef
1. Building a PaaS using Chef
Shaun Domingo - @sdomsta
Head of Tech and Operations @ Ninefold
2. An IaaS cloud provider in Sydney for 4+ years,
branched out to PaaS
Great, powerful infrastructure
Wanted to extend the platform with the power of DevOps
We turned to Chef for help
Ninefold: who are we?
3. Why Chef? Long story, but…
Back in 2012:
Chef: adoption
• Dell and Rackspace announced that they had built their OpenStack provisioners on top of Chef.
• Chef has a steeper learning curve for the basics but becomes easier as you go deeper (common opinion).
• “The single biggest drawback to Chef is that it has a steeper learning curve than Puppet. Further, most of the existing tutorials focus on the deeper concepts in Chef rather than giving novices quick gratification.”
Puppet: adoption
• Puppet is easier to get started with than Chef, with simpler constructs and starting-point documentation.
• “It was easy to get started with Puppet, but things became more complicated with time.”
Chef: technical
• Data bags: Chef's data bags are incredibly useful. They are a very nice way to feed lists of users to multiple cookbooks. Think of them as global variables for your Chef server. One of the benefits of data bags is that they allow you to separate your corporate information from your cookbooks, which makes it easier for Chef users to open-source their cookbooks. Chef also has encrypted data bags.
• Search function: an extremely intuitive way to tie together dependencies between nodes, such as between a Nagios server and its clients.
• Chef uses Ruby as the configuration language.
Puppet: technical
• extlookup: Puppet has extlookup, but it is quite a bit more limited than data bags.
• Puppet requires a third-party solution to encrypt data such as passwords.
• Search: Puppet has a way to "export" resources, but the common feeling is that it is complicated to understand and use.
• Puppet uses a custom DSL as the configuration language.
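To make the data bag and search points above concrete, here is a minimal sketch of what a Nagios server recipe might do. `search` and `data_bag_item` are helpers Chef provides inside recipes; they are stubbed here with toy versions so the snippet runs on its own. The node names, IP addresses, the `users` bag and the `alice` item are all made up for illustration.

```ruby
# Stand-ins so the sketch runs outside chef-client. In a real recipe Chef
# itself provides `search` and `data_bag_item`; all data below is invented.
NODES = [
  { 'name' => 'web1', 'roles' => ['nagios_client'], 'ipaddress' => '10.0.0.11' },
  { 'name' => 'web2', 'roles' => ['nagios_client'], 'ipaddress' => '10.0.0.12' },
  { 'name' => 'mon1', 'roles' => ['nagios_server'], 'ipaddress' => '10.0.0.5' }
].freeze

DATA_BAGS = {
  'users' => { 'alice' => { 'id' => 'alice', 'shell' => '/bin/bash' } }
}.freeze

# Toy search: only understands "role:NAME" queries against the :node index.
def search(index, query)
  raise ArgumentError, 'toy search only knows the :node index' unless index == :node

  key, value = query.split(':', 2)
  raise ArgumentError, 'toy search only knows role: queries' unless key == 'role'

  NODES.select { |n| n['roles'].include?(value) }
end

# Toy data bag lookup: fetch one item from one bag.
def data_bag_item(bag, item)
  DATA_BAGS.fetch(bag).fetch(item)
end

# What the Nagios server recipe would do: discover its clients via search
# and read shared user data from a data bag, with no per-node wiring.
clients = search(:node, 'role:nagios_client')
monitored_hosts = clients.map { |n| "#{n['name']} #{n['ipaddress']}" }
admin = data_bag_item('users', 'alice')

puts monitored_hosts
puts "admin: #{admin['id']} (#{admin['shell']})"
```

The point of the pattern is that adding a third client needs no change to the server recipe: the next run simply finds one more node.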
4. Community cookbooks + chef.io (originally
Opscode) growth $$$
Developer focus
Search, Data bags, Knife
Ruby
Why Chef for Ninefold?
11. Chef Solo – no search > problem
Hosted Chef – far away from Sydney (no S3
back then), customer data, another integration
point
Chef Server – would require customers to buy
VMs for holding config about their app, or
we’d have to give it away free
Enterprise Chef (previously Private Chef) –
multitenancy built-in
Chef setup decisions
12. Responsibility of:
data bags
roles
tags
environments
cookbook pinning
Chef design decisions
13. Attributes or Data bags
Attributes persist between chef-runs,
searchable from recipes, good for controlling
node behaviour
Data bag is a collection of global data,
available to all nodes, searchable from
recipes, good for app-wide settings
App config persistence
14. Chef organisation isolation per customer or app?
Use environments to manage apps?
Attributes – Node, role or environment based: OK
Roles – cross app, generic: OK
Cookbooks – cookbook pinning per environment: OK
Data bags: FAIL
We had a view to clusters of nodes
Multitenancy / Isolation
15. Where did we land?
Enterprise Chef
1 org per app
Databags store global, app configuration
Roles provide expanded runlists
Nodes hold information about themselves only
Cookbooks uploaded to every chef organisation :(
16. Ruby > Our forte
Build an API
Chef Pushy (wasn’t available back then)
Purely github workflow?
Build a deployment engine
Preprovision / Just-in-time
Orchestration
17. Requirement                 Provided by Chef
Create chef org via API         X
Delete chef org via API         X
Update chef org via API
Read chef org via API
Chef organi(s|z)ation ecosystem
It’s ok, let’s
roll our own!
20. Environments: Dev, SIT, Staging,
Preproduction, Production
All managed by Chef
Jenkins per environment triggers deployments
Separate chef organisation per environment
Management
22. Difficult: no search across chef organisations
Monitoring per app
Isolation means we couldn’t find errors quickly
Logging was useful for tracking down errors
(e.g. how many apps were experiencing Chef
500 errors?)
Monitoring
23. Initially: berkshelf package and custom ruby
orchestration
And much later: we duck-punched / patched
chef to work in chef-solo mode for cookbook
downloads, but use Enterprise Chef at the same
time!
Customer cookbook
rollouts
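A rough sketch of the "upload every cookbook to every organisation" fan-out, assuming one knife config file per Chef organisation. The org names, directory layout and the `rollout_commands` helper are hypothetical; only the `knife cookbook upload` flags (`--all`, `--cookbook-path`, `--config`) are standard knife options. The commands are built and printed, not executed.

```ruby
# Build (but do not run) one `knife cookbook upload` command per Chef org.
# Assumes a hypothetical layout with one knife config file per organisation.
def rollout_commands(orgs, cookbook_dir:, config_dir:)
  orgs.map do |org|
    ['knife', 'cookbook', 'upload', '--all',
     '--cookbook-path', cookbook_dir,
     '--config', File.join(config_dir, "#{org}.rb")]
  end
end

# Example with made-up organisation names.
commands = rollout_commands(%w[app-a app-b],
                            cookbook_dir: 'cookbooks',
                            config_dir: 'orgs')
commands.each { |cmd| puts cmd.join(' ') }
```

Keeping each command as an array means it could later be passed to `system(*cmd)` without shell quoting problems. The cost of this approach grows linearly with the number of organisations, which is the pain point the slide hints at.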
25. Nodes should discover other nodes
automagically and not require orchestration to
completely converge
Cron-based?
Service-based daemon?
Externally-triggered?
Scheduled or unscheduled?
Auto-deploy or persist deployed git revisions
Convergence strategy
26. Decisions about setting
splay interval
Node convergence ordering
Custom Ninefold wrapper
shell script – highlander
(there can only be one ...
chef-client!)
Convergence strategy
ramifications – cron based
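The "there can only be one chef-client" idea can be sketched with an exclusive, non-blocking file lock. This is not the Ninefold highlander script (which was a shell wrapper and is not shown in the deck); it is a minimal Ruby illustration of the same guard, and the lock file path and the block standing in for a chef-client run are assumptions.

```ruby
require 'tmpdir'

# Run the block only if nobody else holds the lock; otherwise return :skipped.
# With cron-based convergence this stops a slow run overlapping the next one.
def highlander(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return :skipped unless lock.flock(File::LOCK_EX | File::LOCK_NB)

    yield
  end
end

# Demo: while the outer "run" holds the lock, a second attempt is skipped.
lock_path = File.join(Dir.tmpdir, "highlander-demo-#{Process.pid}.lock")
outcome = highlander(lock_path) do
  inner = highlander(lock_path) { :ran }
  "outer ran, inner #{inner}"
end
puts outcome
```

The lock is released when the file is closed at the end of the block, so a crashed run does not leave a stale lock behind the way a plain PID file can.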
27. Team cookbook contributions in git
can be hard to manage
Ensure someone is in control of
release management
Use git flow, it works great
The develop branch
Master
Cutting releases
Metadata.rb and CHANGELOG.md
– commit at the last possible
moment
Tagging
Pushing out to all customers
Contribution strategy
28. Recommendation: do this as much as possible
Started with ninefold_portal cookbook
Moved to ninefold_app cookbook
Management / Customer diverged
Dogfooding strategy
29. Everything driven from
cookbook, including scaled
RabbitMQ, HAProxy,
Logstash and ElasticSearch
Logging
[C] ,-[AMQP]
|
[C]--[LB]---[AMQP]
/ | |
[C]/ | '-[AMQP]
|
[LS]---[ES]---
[PORTAL]
[ES]
[ES]
34. Break down cookbooks into small manageable
chunks that can be swapped in and out
Use a cookbook dependency management tool
Use berkshelf. It’s good, and comes out of the
box with Chef DK
Cookbook Dependencies
36. Plug-in Logic
Plug-in Version
.erb file used as template for
KB config
Use bundler to maintain
dependencies
It’s easy to write a knife
plug-in - ninefold-internal
37. After 3 years of heavy chef use, these are our
thoughts:
+ Positives
Setting up single-tenant, multi-node clusters
Community cookbooks via supermarket
Highly customisable – it’s just ruby
Powerful management console and supporting
tools
API
Lots of people love and embrace chef, vibrant
community, mailing lists, IRC and more.
Lessons learned
38. - Not as positive
Multi-node orchestration, although tools like Chef
Delivery, and using the machine resource, look promising
Large learning curve: our customers didn’t want to
know about it (“too hard, I’ll get around to it in the
future”)
Spinning up nodes takes too long – containers are
better at this; auto scaling is best achieved in seconds, not
minutes
Powerful features like search only available with Chef
Server
Idempotency is great, but it is also slow
Chef-client will be as slow as the executables and
systems behind it
Lessons learned cont…
39. @sdomsta / @ninefold
Deploy a server via
Portal or API in
Australia today
We want to talk chef,
containers, devops with
people. Drop me a line!
We’re hiring: Operations
Support Engineer
Deploy on Ninefold!
Editor's Notes
Initially, custom Ruby orchestration: berkshelf package, download cookbooks into a directory, then knife cookbook upload of all cookbooks into every Chef organisation.
In the case of a Master PostgreSQL database, configuring slaves requires two passes and requires a data bag item to act as the semaphore.
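One plausible reading of the two-pass note above, as a small simulation: the slave cannot configure replication until the master has converged and flagged itself as ready, so a slave that runs first has to wait for a second pass. `FakeDataBag`, the `postgresql_master` item name and the node hashes are hypothetical stand-ins, not the Ninefold cookbook code; a real setup would read and write a data bag item on the Chef server.

```ruby
# In-memory stand-in for a data bag on the Chef server.
class FakeDataBag
  def initialize
    @items = {}
  end

  def [](id)
    @items[id]
  end

  def []=(id, value)
    @items[id] = value
  end
end

# One chef run for one node. Returns a symbol describing what the run did.
def converge(node, bag)
  semaphore = bag['postgresql_master']
  case node[:role]
  when :master
    # The master advertises that it is ready to accept replicas.
    bag['postgresql_master'] = { 'ready' => true, 'host' => node[:name] }
    :master_configured
  when :slave
    # Without the semaphore the slave has nothing to replicate from yet.
    return :waiting_for_master unless semaphore && semaphore['ready']

    node[:replicates_from] = semaphore['host']
    :replication_configured
  end
end

bag = FakeDataBag.new
slave = { name: 'db2', role: :slave }
master = { name: 'db1', role: :master }

# The slave happens to converge before the master in each pass.
2.times do |pass|
  results = [slave, master].map { |node| converge(node, bag) }
  puts "pass #{pass + 1}: #{results.inspect}"
end
```

Because every run is idempotent, the second pass is harmless for the master and completes the slave, with no external orchestrator deciding the order.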
Use rapid mode until the total time since our first convergence exceeds the quick period, or sustain rapid mode if there are errors.
Splay interval needed to be identical across nodes in the cluster.
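The rapid-mode rule in the notes above can be written as a small decision function. The interval lengths and the quick period below are invented values for illustration, and `next_interval` is a hypothetical helper, not part of Chef or of the Ninefold wrapper.

```ruby
RAPID_INTERVAL  = 60    # seconds between runs in rapid mode (assumed value)
NORMAL_INTERVAL = 1800  # seconds between runs afterwards (assumed value)
QUICK_PERIOD    = 900   # how long rapid mode lasts after first convergence (assumed value)

# Pick the delay before the next chef-client run. Times are in seconds.
# Rapid mode applies while we are still inside the quick period, and is
# sustained beyond it for as long as runs keep failing.
def next_interval(first_converged_at:, now:, last_run_failed:)
  in_quick_period = (now - first_converged_at) <= QUICK_PERIOD
  (in_quick_period || last_run_failed) ? RAPID_INTERVAL : NORMAL_INTERVAL
end

puts next_interval(first_converged_at: 0, now: 100,  last_run_failed: false)
puts next_interval(first_converged_at: 0, now: 1000, last_run_failed: false)
puts next_interval(first_converged_at: 0, now: 1000, last_run_failed: true)
```

Running frequently at first lets a new cluster settle quickly while nodes are still discovering each other, and falling back to the normal interval afterwards keeps load on the Chef server down.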