A Bio based on talks: Obsessed by Puppet modules
I’ve been working on Puppet modules for more than 10 years.
Always looking for answers to problems like:
- Reusability
    Puppet Camp San Francisco 2010: Re-use Your Modules!
    Puppet Conf San Francisco 2013: Anatomy of a Reusable Module
- Naming standards
    Puppet Camp Europe Ghent 2010: Puppet Modules and Module Standards
- App deploy
    Puppet Camp Europe Amsterdam 2011: Automating Applications deployment with Puppi
- Configurability
    Puppet Conf San Francisco 2012: Puppet modules for fun and profit
    Puppet Camp Dublin 2012: Puppet Modules: an holistic approach
- Optimization
    CfgMgmtCamp Ghent 2016 - Essential application management with Tiny Puppet
    CfgMgmtCamp Ghent 2018 - Puppet Systems Infrastructure Construction Kit
Many things happened in module-land
Since Puppet modules were introduced, they have succeeded in many
areas: extensibility to support new OSes, devices and IT objects,
interoperability, improved reusability and composability.
Puppet has evolved as well, with the introduction of Data in Modules,
functions in Puppet DSL, external facts, Tasks, Plans, Data Types.
In this presentation we will have a high-level review of them.
New patterns have emerged over the years. Here we focus on the ones I
prefer; in some cases they are so unorthodox that to some people they
look like anti-patterns. I can explain. We will see them.
Well known Module paths and conventions
Common directories in modules we have always used:
manifests Puppet code in files with .pp extension whose names match class names
Class mysql is defined in mysql/manifests/init.pp
Class mysql::server in mysql/manifests/server.pp
files Static files used in the source argument of the file resource.
source => 'puppet:///modules/mysql/my.cnf' matches file mysql/files/my.cnf
templates Dynamic .erb or .epp templates used as argument of template() or epp()
functions. Typically used in the content arg of a file resource.
content => template('mysql/my.cnf.erb') matches file mysql/templates/my.cnf.erb
content => epp('mysql/my.cnf.epp') matches file mysql/templates/my.cnf.epp
lib Directory with custom functions, facts, types and providers in Ruby
language. This content is automatically synced to clients.
spec Directory for unit tests with rspec-puppet.
Module paths added in more recent years
Directories added to modules for different purposes:
data Hiera data in module.
locales Multi language translations.
types Custom data types
tasks Bolt tasks
plans Bolt plans
facts.d External facts. Automatically synced to clients
functions Functions written in Puppet language
Custom Data Types
Puppet 4 comes with a rich type system for any kind of data, like:
String, Boolean, Integer, Array, Hash, Pattern, Struct, Undef...
It's possible to create custom data types to filter and validate our data.
They are shipped in the types dir of a module.
They are named and located in the module namespace:
Stdlib::Absolutepath is defined in stdlib/types/absolutepath.pp
New data types can use any other native or custom data type:
type Stdlib::Absolutepath = Variant[Stdlib::Windowspath, Stdlib::Unixpath]
The puppetlabs-stdlib module provides many useful additional data types.
Any module can provide its own types. For example, puppetlabs' ntp/types/poll_interval.pp:
type Ntp::Poll_interval = Integer[4, 17]
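As a hedged illustration of how such types are consumed, the sketch below uses them to validate class parameters. The class profile::ntp and its parameter names are hypothetical, and the example assumes that the puppetlabs-stdlib and puppetlabs-ntp modules (which ship the two types) are available in the modulepath.

```puppet
# Hypothetical profile: class and parameter names are made up for this example.
# Assumes the stdlib and ntp modules, which provide the two custom types, are installed.
class profile::ntp (
  Stdlib::Absolutepath $config_file = '/etc/ntp.conf',
  Ntp::Poll_interval   $minpoll     = 6,
) {
  # Catalog compilation fails before reaching this point if $config_file
  # is not an absolute path or $minpoll is outside the 4..17 range.
  notice("ntp config in ${config_file}, minpoll ${minpoll}")
}
```

The validation happens when the class is evaluated, so wrong Hiera data is caught at compile time rather than on the node.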
Bolt Tasks
They are scripts, in any language, which can be remotely executed via Bolt.
They are placed in the tasks dir: for each task we have the script and its JSON
metadata descriptor.
They are namespaced with the module, for example:
tp::test is placed under tp/tasks/test.sh with metadata in tp/tasks/test.json
Scripts can access arguments via environment variables with the PT_ prefix or via JSON input.
tp/tasks/test.json
{ "description": "Run tp test on target nodes",
"parameters": {
"app": {
"description": "The application to test",
"type": "Optional[String[1]]"
} } }
tp/tasks/test.sh
#!/usr/bin/env bash
declare tp_options
PATH=$PATH:/usr/local/bin
[[ -n "${PT_app}" ]] && tp_options="${PT_app}"
tp test $tp_options
Plans
Orchestrated set of tasks that can be run on different hosts based on custom
logic according to the exit status of other runs.
They can be written in YAML or in the Puppet language and are shipped in the
plans dir.
Puppet plans have the .pp extension and a syntax similar to classes:
plan amazon_aws::create_kubernetes_cluster is defined in
amazon_aws/plans/create_kubernetes_cluster.pp
Plans can have parameters and use special functions like run_task or run_plan:
plan amazon_aws::create_kubernetes_cluster(
String[1] $cidr_block, [...]
) {
$responses=run_task("amazon_aws::iam_aws_list_roles", "localhost")
$role_list=$responses.first.value["roles"]
[...]
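For comparison, here is a small complete plan, offered as a sketch only. The plan name tp::test_all and its file location are hypothetical (it is not part of the tp module); it reuses the tp::test task shown earlier and relies on Bolt's run_task, out::message and fail_plan functions and on the ResultSet methods ok, count, error_set and names.

```puppet
# Hypothetical plan, saved as tp/plans/test_all.pp. Not shipped with the tp module.
plan tp::test_all (
  TargetSpec          $targets,
  Optional[String[1]] $app = undef,
) {
  # Pass the 'app' argument to the task only when it has been set.
  $task_args = $app ? {
    undef   => {},
    default => { 'app' => $app },
  }

  # Run the task everywhere; _catch_errors lets the plan continue on failures.
  $results = run_task('tp::test', $targets, $task_args + { '_catch_errors' => true })

  # Custom logic based on the outcome of the previous step.
  if $results.ok {
    $count = $results.count
    out::message("tp test succeeded on ${count} targets")
  } else {
    $failed = $results.error_set.names.join(', ')
    fail_plan("tp test failed on: ${failed}")
  }
}
```

With _catch_errors set, the decision about what a failure means stays in the plan instead of aborting at the first failing target.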
External facts in facts.d
Facts in the facts.d directory are automatically synced to clients.
Unlike native facts in the lib directory, external facts don't need to be written in Ruby.
Executable facts are executable scripts:
- In any language for Linux (must have the executable bit set)
- Files with extensions .bat, .exe, .com or .ps1 for Windows
The output of the script just has to be: fact_name=fact_value
Plain ASCII files with extension .yaml, .txt (inifile format) or .json can be used as well.
They just need to specify the fact name and the relevant value (a string, an array or
a hash).
Note that these facts are distributed to every node, with no way to ship different facts
to different clients. So, in most cases, only executable facts are used here.
Functions in Puppet language
They are placed in the functions directory, have the .pp extension and are written in the Puppet
language. Their syntax is similar to classes or defines. For example, the function
psick::template is defined in psick/functions/template.pp and can look as follows:
function psick::template(
Optional[String] $filename,
Hash $parameters = {}
) >> Optional[String] {
if $filename and $filename !='' {
$ext=$filename[-4,4]
case $ext {
'.epp': { epp($filename, { parameters => $parameters } ) }
'.erb': { template($filename) }
default: { file($filename) }
}
} else {
undef
}
}
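A possible way to call the function above from a profile is sketched here. The template path, the file and the 'owner' key are hypothetical; with the function as written, an epp template is expected to read its values from a single $parameters hash.

```puppet
# Hypothetical usage: template path and hash content are illustrative only.
$motd_content = psick::template('profile/motd/motd.epp', { 'owner' => 'ops team' })

file { '/etc/motd':
  ensure  => file,
  # When the function returns undef (empty or missing filename),
  # the content attribute is simply left unset.
  content => $motd_content,
}
```

The same call works with an .erb template or a static file: the function picks the rendering method from the file extension.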
Hiera data in modules
Hiera 5 supports data in modules: a revolution in how default params are set.
A hiera.yaml in the module, like the one below, defines a hierarchy in the module itself,
with paths relative to the datadir.
In component modules this is useful to set the default values of the module classes’
parameters according to the OS. This completely replaces the params pattern.
Note that data in modules can only be used to set values for the module’s own params.
---
version: 5
defaults:
  datadir: data
  data_hash: yaml_data
hierarchy:
  - name: "In module hierarchy"
    paths:
      - "%{facts.os.name}%{facts.os.release.major}.yaml"
      - "%{facts.os.name}.yaml"
      - "%{facts.os.family}%{facts.os.release.major}.yaml"
      - "%{facts.os.family}.yaml"
  - name: "Common"
    path: "common.yaml"
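On the code side, a class relying on this hierarchy can drop its in-code defaults altogether. The following is a minimal sketch with a hypothetical mysql class: the parameter names and the data files mentioned in the comments are assumptions made for the example.

```puppet
# Hypothetical component class in mysql/manifests/init.pp.
# The parameters have no defaults in code: with the hiera.yaml above, keys such as
# mysql::package_name are expected in mysql/data/RedHat.yaml, mysql/data/Debian.yaml
# or mysql/data/common.yaml (file and key names assumed for this example).
class mysql (
  String[1] $package_name,
  String[1] $service_name,
) {
  package { $package_name:
    ensure => present,
  }

  service { $service_name:
    ensure  => running,
    enable  => true,
    require => Package[$package_name],
  }
}
```

Users can still override any of these keys from their environment or global Hiera data, which take precedence over the module layer.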
My Unorthodox Patterns (who said anti-patterns?!)
After more than a decade of Puppet consulting, training and development,
I’ve found myself using patterns which are sometimes so unorthodox that
someone might consider them anti-patterns.
Let’s review a few of them ;-)
● Profiles without roles (classes)
● Hiera driven classification (with Hashes)
● Custom templates and options hash (with hardcoded defaults)
● Ensure everything for presence or absence
● Applications abstraction with Tiny Puppet
● Class based server-side noop management
● Params to manage everything
● Reinvent the wheel (when it’s fast and easy)
Profiles without roles (classes)
The roles and profiles pattern is a widely established pattern which... I never
use, at least in the conventional way. Let's see how the pattern can evolve:
● Profiles can be any class which implements things in the way we need:
○ Classes of a custom profile module which use resources from a component module
○ Classes of a component module used directly, if Hiera data is enough to make them do what we need
○ Maybe even one of the reusable profiles of the example42-psick module
● The concept of role is fundamental. It is better if each node has only one
role and we have a $role fact or global variable to use in hiera.yaml
● A role module and role classes which just include profiles are mostly
useless. It is much better and more flexible to replace them with Hiera based
classification, setting the profiles to include at a hierarchy level that uses
the $role variable
● In large or complex environments $role alone might not be enough to
define what each node does. Concepts like $product or $cluster and
their own roles might blend in.
Hiera driven classification (the psick way) 1/2
Classification is how we assign classes to nodes, according to their functions.
There are various classification methods: arrays of classes looked up via Hiera, an
ENC, role classes, node statements, LDAP...
The one I find the most flexible and easy to use is to set the classes to include
via Hiera, using a hash looked up in deep merge mode, where keys
are arbitrary placeholders we can override in hierarchies and values are the
class names to include.
A sample implementation is in example42-psick module, where:
- the hiera keys to use are named according to the kernel fact (to avoid the
need to have OS related hierarchies just for classification)
- Keys also identify different phases, whose classes are included in the
relevant order ('pre' before 'base' before 'profiles'). With an optional
'firstrun' phase, with classes included only at the first Puppet run.
Hiera driven classification (psick sample data) 2/2
# Classes applied at the first run on Linux and Windows
psick::enable_firstrun: true # By default firstrun phase is disabled
psick::firstrun::linux_classes:
  aws: psick::aws::sdk
psick::firstrun::windows_classes:
  aws: psick::aws::sdk
# Normal runs, prerequisite classes, applied before the others
psick::pre::linux_classes:
  puppet: ::puppet
  dns: psick::dns::resolver
psick::pre::windows_classes:
  hosts: psick::hosts::resource
# Common baseline classes for Linux
psick::base::linux_classes:
  sudo: psick::sudo
  time: psick::time
# Application / role specific profiles (Linux)
psick::profiles::linux_classes:
  www_blog: profile::www::blog
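How such data can be turned into included classes is sketched below. This is not the actual psick code: it only reuses the key names from the sample data, assumes each key resolves to a hash of placeholder to class name, does not enforce any ordering between phases, and treats an empty string as "skip this entry" (an assumption of this sketch).

```puppet
# Minimal sketch of Hiera driven classification, not the real psick implementation.
$kernel = downcase($facts['kernel'])   # for example 'linux' or 'windows'

['pre', 'base', 'profiles'].each |$phase| {
  # Deep merge, so every hierarchy level can add, override or blank single entries.
  $classes = lookup("psick::${phase}::${kernel}_classes", Hash[String, String], 'deep', {})

  $classes.each |$placeholder, $class_name| {
    # Assumption: an empty string set at a more specific level disables the entry.
    unless $class_name == '' {
      include $class_name
    }
  }
}
```

Because keys are placeholders, a node or role level in Hiera can replace time: psick::time with another class by reusing the same key, without touching the common data.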
Custom templates and options, with defaults 1/2
For each configuration file to manage, let the users decide (via parameters):
- What erb/epp template to use (a default is OK, but allow override)
- A custom Hash of options: key-values used in the template
- Default options can be hardcoded in the module and be overridden
- No need for a dedicated parameter for each configuration option!
- A Hash of options can be validated (with the Struct type), exactly like single params
- The class params you need (partial example) are just:
class profile::ssh (
  String $config_file_template = 'profile/ssh/sshd_config.epp',
  Profile::Ssh::Options $config_file_options = {},
  Profile::Ssh::Options $config_file_defaults = {}, # Set in module data
) {
  # User options override the module defaults
  $all_options = $config_file_defaults + $config_file_options
  file { '/etc/ssh/sshd_config':
    content => epp($config_file_template, { 'options' => $all_options }),
  }
}
Custom templates and options, with defaults 2/2
User Hiera data, in the control-repo, can look as follows:
profile::ssh::config_file_options:
  'PermitRootLogin': 'yes'
In the module Hiera data we can set the default options:
profile::ssh::config_file_defaults:
  'PermitRootLogin': 'no'
  'ListenAddress': "%{facts.networking.ip}"
The epp template used can be as follows:
PermitRootLogin <%= $options['PermitRootLogin'] %>
ListenAddress <%= $options['ListenAddress'] %>
The custom Profile::Ssh::Options type to validate the above can look like:
type Profile::Ssh::Options = Struct[{
  Optional['PermitRootLogin'] => Enum['yes','no'],
  Optional['ListenAddress']   => Stdlib::IP::Address, [...] }]
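The precedence between defaults and user options comes from how the + operator merges hashes. A tiny sketch with illustrative values (the address is made up for the example):

```puppet
# Values are illustrative only.
$config_file_defaults = { 'PermitRootLogin' => 'no', 'ListenAddress' => '0.0.0.0' }
$config_file_options  = { 'PermitRootLogin' => 'yes' }

# With the + operator, keys from the right-hand hash win.
$all_options = $config_file_defaults + $config_file_options

# $all_options is now { 'PermitRootLogin' => 'yes', 'ListenAddress' => '0.0.0.0' }
notice($all_options)
```

Keeping the defaults on the left side means users only need to list the options they want to change.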
Ensure for presence and absence 1/2
Puppet manages what we tell it to manage.
Any resource added in a class should be easily removable.
Any class should have an $ensure parameter which:
● Is applied AS IS to the managed package(s), which also allows Hiera
driven management of the version to install, or the use of latest
● Is adapted via custom functions for the other resources.
When you have to remove what a class has added, it is enough to set in Hiera:
class_name::ensure: 'absent'
A Custom data type can be created to validate the possible values for $ensure:
type Profile::Ensure = Variant[ Boolean,
  Enum['present', 'absent', 'latest'],
  Pattern[/\d+(\.\d+)*/]]
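The "adapted via custom functions" step can be as small as the sketch below. The function profile::ensure2file is hypothetical: it mimics the idea behind helpers like psick::ensure2file but is not their actual code.

```puppet
# Hypothetical function in profile/functions/ensure2file.pp.
# Converts a class-level $ensure value into one valid for file resources.
function profile::ensure2file(
  Variant[Boolean, String] $input,
) >> Enum['present', 'absent'] {
  $input ? {
    'absent' => 'absent',
    false    => 'absent',
    default  => 'present',   # 'present', 'latest', true or a version string
  }
}
```

A class can then use ensure => profile::ensure2file($ensure) on its files, while passing $ensure unchanged to its packages.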
Application abstraction with Tiny Puppet 1/2
Puppet is about abstracting resources from the underlying OS.
Tiny Puppet (example42-tp) is about abstracting applications.
It provides defines that can manage the installation and configuration of
potentially any application on any OS, also managing the relevant package
repositories or dependencies, and leaving users full freedom to manage
files the way they want.
It can potentially replace any module where just packages, services and files
are managed.
Tiny Puppet is ideal for the sysadmin who knows how to configure her/his
files and wants to have freedom in choosing how to manage them (erb/epp
templates, source, content...) without having to study a dedicated module and
make it do what [s]he wants.
Application abstraction with Tiny Puppet 2/2
The sample ssh profile seen so far can become (multi OS support included):
class profile::ssh (
  String $ensure = 'present',
  String $config_template = 'profile/ssh/sshd_config.epp',
  Profile::Ssh::Options $config_options = {},
  Profile::Ssh::Options $config_defaults = {},
) {
  # User options override the module defaults
  $all_options = $config_defaults + $config_options
  tp::install { 'openssh':
    ensure => $ensure,
  }
  tp::conf { 'openssh':
    ensure       => $ensure,
    epp          => $config_template,
    options_hash => $all_options,
  }
}
Class based server-side noop management 1/2
The Puppet purist will tell you that you should not have nodes running in noop mode. Let
me argue, again, on this.
In very sensitive and business critical environments, production nodes can, maybe
should, run in noop mode during standard operational times.
Conditions apply, of course:
● They should have regular runs (at least weekly) in enforcing mode during maintenance
windows to prevent the accumulation of changes.
● You should check the reports of the pending changes and, when needed, trigger
enforcing runs from a central place
● You should have a way to always enforce some classes (see next slide)
● You should have some canary nodes in production where changes are applied
Benefits of production in noop by default are quite clear:
● You don't risk destroying the business in 30 minutes because of a wrong change
● Your DevOps won't have the fear of breaking everything at every commit
● Puppet code development, testing and deployment can be faster
Class based server-side noop management 2/2
We can control noop behaviour for each class, leveraging the trlinkin-noop module
with params as follows in a class:
class profile::ssh (
Boolean $noop_manage = false,
Boolean $noop_value = false,
) {
if $noop_manage {
noop($noop_value)
}
[... class resources ...] }
When profile::ssh::noop_manage is true, the noop() function is invoked with
$noop_value, which sets the noop metaparameter on all the resources in the
same scope. Depending on $noop_value, this allows us to:
● false Enforce application of the class resources even when the client runs in noop
● true Test the class resources in noop mode when the client runs normally
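To make the effect concrete, the sketch below writes out by hand what a class looks like once every resource carries Puppet's standard noop metaparameter. The resources are illustrative, and this is only an approximation of the result of calling noop(true), not the module's implementation.

```puppet
# Illustrative only: resources with the noop metaparameter set explicitly.
package { 'openssh-server':
  ensure => present,
  noop   => true,   # changes are reported but never applied
}

file { '/etc/ssh/sshd_config':
  ensure => file,
  mode   => '0600',
  noop   => true,
}
```

Setting noop => false instead has the opposite effect: the resource is enforced even when the agent itself runs in noop mode, which is what makes "always enforced" classes possible.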
Params to manage everything 1/2
Ever had to modify a module because it was creating some resources in a way
different from your needs or it had duplicated resources?
What if it were enough to just provide some Hiera data to override the
module default behaviour? ANY behaviour:
- Whether to manage some of its resources
- Any extra parameters of a given resource
This particularly applies to resources which might conflict with other modules
(this is in itself a red flag, but still might be a needed prerequisite in some
cases) or might need tweaking of arguments.
Say hi to $<something>_manage and $<something>_params parameters
for full freedom in deciding if and what to do with classes' resources.
Params to manage everything 2/2
class profile::ssh (
  Profile::Ensure $ensure = 'present',
  String $config_file_path = '/etc/ssh/sshd_config',
  String $config_file_template = 'profile/ssh/sshd_config.erb',
  Boolean $config_file_manage = true,
  Hash $config_file_params = {},
) {
  if $config_file_manage {
    $config_file_defaults = {
      ensure  => psick::ensure2file($ensure),
      mode    => '0644',
      owner   => 'root',
      group   => 'root',
      content => template($config_file_template),
    }
    # User provided params override the defaults above
    file { $config_file_path:
      * => $config_file_defaults + $config_file_params,
    }
  }
}
Hiera data to use a static source instead of the template:
profile::ssh::config_file_params:
  content: ~
  source: puppet:///modules/profile/ssh/sshd_config
Reinvent the wheel (when fast and easy)
Do we really need a dedicated module to manage EPEL? Or Motd?
Or even the typical package-service-file pattern?
A new module added to Puppetfile means:
- Need to resolve and add its dependencies
- Need to study, understand and adapt the module to our needs
- Longer deployment times (a new repo to clone / sync)
When developing my profiles I evaluate:
- If what I have to do is easy and fast enough to avoid the need for a
component module
- If I can use pdk with a custom template to quickly generate full featured
profile classes
- If I can do what I have to do with tp defines or psick profiles
Summing up...
These are opinions based on years of module development and experience; they may not
apply to everybody, especially those who are new to Puppet.
I think that modules are good, useful and necessary: they are an essential
part of the Puppet ecosystem.
Still, even if there's a module for everything,
that doesn't mean you have to use an existing module for everything.
Public modules and common patterns, like roles and profiles, are
the best solution when starting and learning,
but once you grasp Puppet core concepts and are comfortable with writing
code, you can go beyond and explore your own ways.